zeromq / jzmq

Java binding for ZeroMQ
http://www.zeromq.org
GNU General Public License v3.0

Closing ZContext with already closed child socket causes UnknownError (0x80) #461

Open jcoeltjen opened 7 years ago

jcoeltjen commented 7 years ago

When trying to close a ZContext with an associated socket that has already been closed, a ZMQException is thrown. The given snippet works fine with JeroMQ but crashes when using JZMQ.

Expected Behavior

ZContext should be closed without exception.

Current Behavior

org.zeromq.ZMQException: Unknown error
(0x80)
    at org.zeromq.ZMQ$Socket.setLongSockopt(Native Method)
    at org.zeromq.ZMQ$Socket.setLinger(ZMQ.java:989)
    at org.zeromq.ZContext.destroySocket(ZContext.java:105)
    at org.zeromq.ZContext.destroy(ZContext.java:67)
    at org.zeromq.ZContext.close(ZContext.java:195)

Possible Solution

Only try to destroy a socket that has not been destroyed previously.

Steps to Reproduce

ZContext context = new ZContext();
ZMQ.Socket socket = context.createSocket(ZMQ.REQ);
socket.close();
context.close(); // this throws the ZMQException shown above

trevorbernard commented 7 years ago

The given snippet works fine with JeroMQ but crashes when using JZMQ.

Do you mean the opposite instead?

jcoeltjen commented 7 years ago

No, the four lines work with JeroMQ (master) but not with JZMQ (master).

The environment is Windows 7 x64 and the version I used is the current master. I think this is caused by the fact that ZContext closes all sockets when its close/destroy method is called, without checking whether they still need to be closed. I don't know if there is an easy way to check whether a socket has already been destroyed.

This is not a critical bug, because there is an easy workaround: do not close any sockets explicitly and only use the context to clean up. The error may not even be located in JZMQ but in libzmq.

sigiesec commented 7 years ago

I don't think there is an error in libzmq. libzmq does not allow calling zmq_close multiple times. This was just clarified in the docs: https://github.com/zeromq/libzmq/pull/2792/files

jcoeltjen commented 7 years ago

I think this is exactly the problem. JZMQ only passes the close() to the underlying libzmq library.

As mentioned in the documentation you referenced, the behavior of libzmq is undefined when zmq_close() is called more than once. Maybe this results in the error shown above.

I think we have three possibilities here:

  1. Leave everything as it is and document that closing sockets manually should be avoided when using ZContext.
  2. Prevent closing sockets twice by somehow determining the current state of a socket before closing, or even actively listen for changes made to sockets (e.g. via signals).
  3. Define the behavior in libzmq for closing sockets more than once. This could be as simple as throwing an exception, or a more sophisticated approach. But without knowing the libzmq internals, I think we would still have the problem that we have no way of determining whether a socket is closed or not. (Correct me if I am wrong here!)

Either way closing a socket more than once should not throw an UnknownError.
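
To make option 2 more concrete, here is a minimal sketch of a "close at most once" guard kept on the Java side. It does not use JZMQ at all: NativeHandle and CloseOnceSocket are hypothetical names made up for illustration, and NativeHandle.close() only stands in for a native call such as zmq_close that must not be repeated.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Stand-in for a native socket handle; hypothetical, NOT part of JZMQ. */
interface NativeHandle {
    void close(); // assumed to be unsafe to call twice, like zmq_close
}

/** Hypothetical wrapper that forwards close() to the handle at most once. */
final class CloseOnceSocket implements AutoCloseable {
    private final NativeHandle handle;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    CloseOnceSocket(NativeHandle handle) {
        this.handle = handle;
    }

    /** Lets an owner (e.g. a context) check the state before touching the socket. */
    boolean isClosed() {
        return closed.get();
    }

    @Override
    public void close() {
        // compareAndSet succeeds for exactly one caller, so the native
        // close runs once even if close() is called again later.
        if (closed.compareAndSet(false, true)) {
            handle.close();
        }
    }
}
```

With a guard like this, a second close() coming from the context would be a no-op instead of reaching native code. Whether JZMQ's socket class could carry such a flag is an assumption, not something verified against its source.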

sigiesec commented 7 years ago

Option 3 is not possible. zmq_close ultimately deletes the socket object (asynchronously). Accessing a deleted object always leads to undefined behaviour in "native land".

sigiesec commented 7 years ago

I don't think any of the options 1-3 is a good idea.

I suggest the following:

  1. Remove ZContext.destroySocket (and the list of sockets within a ZContext, which is redundant IMO)
  2. (Maybe this can be viewed as a variant of option 2) Change ZContext.createSocket to return an object that removes itself from the list of sockets within the ZContext when it is closed.
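
The second suggestion could look roughly like the sketch below. It is a plain-Java model, not JZMQ code: TrackingContext, TrackedSocket, forget() and nativeCloseCalls are hypothetical names, and the counter only stands in for the real native close call. It is also not thread-safe, which a real implementation would have to address.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical context that only tracks sockets that are still open. */
final class TrackingContext implements AutoCloseable {
    private final List<TrackedSocket> sockets = new ArrayList<>();

    TrackedSocket createSocket() {
        TrackedSocket s = new TrackedSocket(this);
        sockets.add(s);
        return s;
    }

    int openSockets() {
        return sockets.size();
    }

    /** Called by a socket when it is closed, so the context forgets it. */
    void forget(TrackedSocket s) {
        sockets.remove(s);
    }

    @Override
    public void close() {
        // Iterate over a copy: each close() removes the socket from the list.
        for (TrackedSocket s : new ArrayList<>(sockets)) {
            s.close();
        }
    }
}

/** Hypothetical socket that deregisters itself from its context on close. */
final class TrackedSocket implements AutoCloseable {
    private final TrackingContext owner;
    private boolean closed;
    int nativeCloseCalls; // stands in for calls to the native close

    TrackedSocket(TrackingContext owner) {
        this.owner = owner;
    }

    @Override
    public void close() {
        if (closed) {
            return; // already closed, nothing left to release
        }
        closed = true;
        nativeCloseCalls++;
        owner.forget(this);
    }
}
```

In this model, closing a socket by hand and then closing the context releases each socket exactly once, because the context never sees the socket that was already closed.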

hoditohod commented 5 years ago

Hi guys, I just got the same error, Unknown error (0x80), in a different context. Do you have any clue where 0x80 comes from? As far as I can see in the JZMQ code, ZMQException is initialized with the zmq_errno and zmq_strerror results, but I don't see any 0x80 errno anywhere in the libzmq or JZMQ code.

0x80 doesn't seem to be related to libzmq on either Linux (EKEYREVOKED) or Windows (STRUNCATE). Any idea?
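
For reference, a small sketch of how one might sanity-check such a code. libzmq defines its own error codes as offsets from ZMQ_HAUSNUMERO (156384712 in zmq.h), so a value as small as 0x80 falls in the range of ordinary platform errno values. ErrnoCheck and its methods are hypothetical helpers written for this comment, and the range test is only a rough heuristic.

```java
/** Hypothetical helper; not part of JZMQ or libzmq. */
final class ErrnoCheck {
    // Base value libzmq uses for its own error codes (ZMQ_HAUSNUMERO in zmq.h).
    static final int ZMQ_HAUSNUMERO = 156384712;

    /** Rough heuristic: libzmq-specific codes start at ZMQ_HAUSNUMERO. */
    static boolean isLibzmqSpecific(int errno) {
        return errno >= ZMQ_HAUSNUMERO;
    }

    /** Formats the code in hex and decimal together with its range. */
    static String describe(int errno) {
        String range = isLibzmqSpecific(errno)
                ? "libzmq-specific range"
                : "platform errno range";
        return String.format("0x%x = %d (%s)", errno, errno, range);
    }
}
```

Calling ErrnoCheck.describe(0x80) yields "0x80 = 128 (platform errno range)", which suggests the value was passed through from the platform's errno rather than defined by libzmq.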