pebbe / zmq4

A Go interface to ZeroMQ version 4
BSD 2-Clause "Simplified" License
1.17k stars 163 forks source link

Drop `defaultCtx` creation on `init` to prevent unrecoverable panics on lib inclusion #104

Closed HugoMFernandes closed 7 years ago

HugoMFernandes commented 7 years ago

Like most go libraries (including core libs), zmq4 uses package globals to provide a simpler interface to users who don't necessarily need (or care) about all the details to use the library (e.g. small/pet projects). In the case of zmq4, a default ZMQ context (defaultCtx) is created on init, enabling users to simply create sockets and send messages without having to explicitly create a context.

While I'm fine with this (fine as in I understand why it exists even though I would personally never want to use it in production), this sort of functionality shouldn't break the "real" interface (i.e. the interface for the users who do need/care about the details and failure scenarios). In the case of zmq4, the simpler interface is implemented in a way that can break the runtime even if the client code never uses the default context.

Basically, zmq4.go's init (https://github.com/pebbe/zmq4/blob/7157c0d6df4e6bfcae60a6f20cffaee88ac1ab30/zmq4.go#L93) can panic on lines 99, 103 and 106. Since go programs can't recover from panics raised in the init function of its (direct or indirect) dependencies, a go program that wants to include zmq4 as a soft dependency can never truly do so (as the init function can forcibly turn it into a hard dependency).

Would removing the init block in favor of an explicit func GetDefaultCtx() (*Context, error) (or similar) be up for consideration? Essentially, the required changes would be:

  1. Move all validation logic on init to NewContext, throwing an error (instead of panicking) if needed
  2. Create GetDefaultCtx which would invoke NewContext on the first execution to initialize the global defaultCtx and simply retrieve it on later executions
  3. Remove the package NewSocket, forcing it to always be invoked on an explicit Context

The suggested change would simply force clients to invoke GetDefaultCtx (and handle the possible errors) before invoking NewSocket (which would now be invoked on the retrieved Context). I'm aware that this is a breaking change, but it is a very small one (with significant benefit).

Would a PR for this be considered?

P.S.: This applies to zmq2 and zmq3 as well. I just raised this here as it's the latest version.

pebbe commented 7 years ago

Can you explain what you mean with a soft dependency?

I don't want to change the API. I think I can probably fix the first panic by not creating the default context until the moment it is used the first time.

But what about the other two? You can't compile the package against one library version, and use it with another version. There are incompatabilities between versions exposed through the header file of ZeroMQ.

pebbe commented 7 years ago

PS: Please wait with a PR. Let me see first what I can do.

pebbe commented 7 years ago

I fixed things. The API is unchanged.

HugoMFernandes commented 7 years ago

Hi Peter,

By soft dependency I meant including the package even though you might not need it for your program to function properly (just optimally). It's probably easier if I give you an example:

(Just received the comment while writing this, so didn't get to comment on the suggestions in time, sorry).

Thank you for looking into this so quickly, I'll have a look at the changes :)

HugoMFernandes commented 7 years ago

Just for completeness' sake: the reason why I suggested breaking the API to include a GetDefaultCtx and handle all the panics as an error there (including the version checks) was because there really isn't any way of fixing both use cases properly without breaking the API.

I understand your changes since you don't want to break the API, but I can't say I agree that the API remains totally unchanged with this change. Client code will definitely not break during compile time as all function signatures remain the same, but the general contract of the API did change:

  1. The contract of several functions changed since they now return new (untyped) errors (for conditions that couldn't even be considered before this change)

  2. Some inconsistencies were created in the API (e.g. NewSocket now behaves differently when invoked globally than when invoked via an explicitly created Context since the global version now has to include version validation as well)

  3. Programs that depend on the panics to stop execution immediately if ZMQ can't be used at all will potentially need to be updated to handle these error conditions explicitly (or risk executing code that was previously never executed)

Anyways, I'm just putting this out here in case you want to consider this in the future or add an entry to the README. I'm aware that these problems mostly affect users of the global API (who likely don't care and couldn't handle them either way), so the end result is still much better.

Thanks for looking into this.

pebbe commented 7 years ago

I think if people check for all errors, there shouldn't be any problem?

I reopen this issue. It might get comments from other users.

HugoMFernandes commented 7 years ago

Yes. If people check for errors, there shouldn't be any problem (the new errors would just bubble up and fall in the generic error handling in the client code).

The only case in which there could be a problem even if they check for errors would be in the third case, but that would only happen in very badly designed programs (that were relying on the panics in the init function to die immediately).

Think you can safely close the issue if you're happy with that.