zeromq / netmq

A 100% native C# implementation of ZeroMQ for .NET

NetMQ.Core.Pipe.ProcessPipeTermAck(): Object reference not set to an instance of an object. #865

Closed leguanjoe closed 4 years ago

leguanjoe commented 4 years ago

Environment

NetMQ Version:    4.0.0.207
Operating System: Windows 10 Pro
.NET Version:     4.7.1

Expected behaviour

Should not crash.

Actual behaviour

We sometimes see "Object reference not set to an instance of an object." exceptions after a long runtime.

Steps to reproduce the behaviour

In our test lab, we occasionally record exceptions related to NetMQ after several hours of runtime. We cannot reproduce them on demand, but they occur every now and then. For example:

Within the same second, two exceptions were recorded in the log files after more than 12 hours of runtime:

1) "Object reference not set to an instance of an object." (no stack trace available)

    SubscriberSocket _subSocket;
    while(!_closing)
    {
        // => there are no objects involved in our code which could be "not set to an instance" 
        // => exception comes from somewhere inside _subSocket.ReceiveFrameString
        try
        {
            _subSocket.ReceiveFrameString(out isMore); //topic
            localHeader = isMore
                ? DeserializeProto(_subSocket.ReceiveFrameBytes(out isMore))
                : null;
            data = isMore ? _subSocket.ReceiveFrameBytes() : null;
        }
        catch (ThreadAbortException)
        {
            return;
        }
        catch (Exception e)
        {
            // Write "e.Message" to log file... next version will also record stack trace :-)
            continue;
        }
    ....
    }

2) Unhandled exception (application crashed afterwards; we do not know in which code section it happened, our code was not referenced in the stack trace):

Exception Message: Object reference not set to an instance of an object.
Exception Stack Trace:

    at NetMQ.Core.Pipe.ProcessPipeTermAck()
    at NetMQ.Core.ZObject.ProcessCommand(Command cmd)
    at NetMQ.Core.IOThread.Ready()
    at NetMQ.Core.Utils.Proactor.Loop()
    at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
    at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
    at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
    at System.Threading.ThreadHelper.ThreadStart()

In other discussions we found that multi-threading could be an issue. We tried to simulate this with a test program that starts several threads, each opening several sockets in parallel. Unfortunately, we were not able to reproduce the issue this way.

Maybe the above exception message already points to an issue in NetMQ? Any further advice on what we could try in order to find the error is appreciated.

somdoron commented 4 years ago

This exception happens when a socket is used from multiple threads, which is not supported. Specifically, it happens when the peer gets disconnected, so to reproduce it you can try using the socket from multiple threads while making the peer socket come and go.

The good news is that I'm working on porting thread-safe sockets from zeromq.

leguanjoe commented 4 years ago

Thanks somdoron. We sent from one thread, but closed the socket from another thread when closing the module. This is now fixed on our side. Great to hear you are working on porting thread-safe sockets.
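
For anyone hitting the same problem, here is a minimal sketch of one way to keep all socket access on a single thread. It is not necessarily the fix applied above. The class name `SubscriberWorker`, the 100 ms timeout, and the frame handling are hypothetical and chosen only for illustration. The idea is that the closing thread only sets a flag and waits, while the receiving thread creates, uses, and disposes the socket itself.

```csharp
using System;
using System.Threading;
using NetMQ;
using NetMQ.Sockets;

// Hypothetical worker class: name, timeout and frame layout are illustrative assumptions.
public sealed class SubscriberWorker
{
    private readonly string _address;
    private readonly Thread _thread;
    private volatile bool _closing; // the only state shared between threads

    public SubscriberWorker(string address)
    {
        _address = address;
        _thread = new Thread(Run) { IsBackground = true };
    }

    public bool IsRunning => _thread.IsAlive;

    public void Start() => _thread.Start();

    // Called from another thread: sets the flag and waits, never touches the socket.
    public void Stop()
    {
        _closing = true;
        _thread.Join();
    }

    private void Run()
    {
        // The socket is created, used and disposed on this thread only.
        using (var subSocket = new SubscriberSocket())
        {
            subSocket.Connect(_address);
            subSocket.Subscribe(""); // empty prefix: receive every topic

            var timeout = TimeSpan.FromMilliseconds(100);
            while (!_closing)
            {
                try
                {
                    // Receive with a timeout so the loop can notice _closing.
                    if (!subSocket.TryReceiveFrameString(timeout, out string topic, out bool more))
                        continue;

                    byte[] header = more ? subSocket.ReceiveFrameBytes(out more) : null;
                    byte[] data = more ? subSocket.ReceiveFrameBytes(out more) : null;

                    // Drain any unexpected extra frames of this message.
                    while (more) subSocket.ReceiveFrameBytes(out more);

                    // ... handle topic, header and data here ...
                }
                catch (Exception e)
                {
                    // e.ToString() includes the stack trace, not just the message.
                    Console.Error.WriteLine(e);
                }
            }
        }
    }
}
```

Usage would be `var worker = new SubscriberWorker("tcp://127.0.0.1:5556"); worker.Start();` and later `worker.Stop();` from the thread that shuts the module down. The address is a placeholder. The timeout-based receive trades a little shutdown latency for never having to close the socket from a foreign thread.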