zeromq / netmq

A 100% native C# implementation of ZeroMQ for .NET

Process crash due to an unhandled exception in Mechanism.Encode #1094

Open sonicpro opened 1 month ago

sonicpro commented 1 month ago

Related: #1054

Environment

NetMQ Version:     4.0.1.13
Operating System:  Windows Server 2019 Standard 64bit
.NET Version:     .NET 6

We use a component that creates a DealerSocket, and we host that component in a Windows service. The socket listens on TCP address 0.0.0.0, port 5556. It receives by calling socket.TryReceiveMultipartMessage(ref clientMessage, 0), where clientMessage is a variable of type NetMQMessage that is initially null.

Expected behaviour

The NetMQ socket that receives the data from the queue should not crash the process.

Actual behaviour

Approximately once a day the Windows service process crashes due to an unhandled exception. The exception type, message, and call stack are as follows:

Application: <wiped out by NDA>
CoreCLR Version: 6.0.1623.17311
.NET Version: 6.0.16
Description: The process was terminated due to an unhandled exception.
Exception Info: NetMQ.FaultException: Cannot close an uninitialised Msg.
   at NetMQ.Msg.Close() in /_/src/NetMQ/Msg.cs:line 453
   at NetMQ.Core.Mechanisms.CurveMechanismBase.Encode(Msg& msg) in /_/src/NetMQ/Core/Mechanisms/CurveMechanismBase.cs:line 66
   at NetMQ.Core.Transports.StreamEngine.PullAndEncode(Msg& msg) in /_/src/NetMQ/Core/Transports/StreamEngine.cs:line 1226
   at NetMQ.Core.Transports.StreamEngine.BeginSending() in /_/src/NetMQ/Core/Transports/StreamEngine.cs:line 445
   at NetMQ.Core.Transports.StreamEngine.Handle(Action action, SocketError socketError, Int32 bytesTransferred) in /_/src/NetMQ/Core/Transports/StreamEngine.cs:line 416
   at NetMQ.Core.Transports.StreamEngine.FeedAction(Action action, SocketError socketError, Int32 bytesTransferred) in /_/src/NetMQ/Core/Transports/StreamEngine.cs:line 333
   at NetMQ.Core.Transports.StreamEngine.OutCompleted(SocketError socketError, Int32 bytesTransferred) in /_/src/NetMQ/Core/Transports/StreamEngine.cs:line 1023
   at NetMQ.Core.IOObject.OutCompleted(SocketError socketError, Int32 bytesTransferred) in /_/src/NetMQ/Core/IOObject.cs:line 119
   at NetMQ.Core.Utils.Proactor.Loop() in /_/src/NetMQ/Core/Utils/Proactor.cs:line 130
   at System.Threading.Thread.StartCallback()

The line numbers in this call stack match the source code exactly, because the trace was captured with a non-optimized build of NetMQ.dll.

Steps to reproduce the behaviour

The behavior is sporadic: the crash occurs only about once a day. It is worth mentioning that the same process hosts another DealerSocket that sends messages to TCP address 127.0.0.1, port 5556.

From reading the code, it seems that under some circumstances StreamEngine receives an uninitialised message from the Pipe. When its PullAndEncode() method passes that message to Mechanism.Encode, the attempt to close the uninitialised message throws the exception shown above. I suspect that under some circumstances the code in NetMQ.Core.Utils.YQueue can return an uninitialised message.

I am going to mitigate the issue by adding an extra check to NetMQ.Core.SessionBase.PullMsg(ref Msg msg). Could you please consider merging this into the repo?
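
A minimal sketch of the kind of check meant here, for illustration only. The actual change to SessionBase.PullMsg is not shown in this issue, so the code below uses hypothetical stand-in types (DemoMsg, DemoMsgType, GuardDemo) and a plain Queue in place of the real NetMQ Msg, Pipe and YQueue classes. It assumes that an uninitialised message coming out of the pipe can safely be treated as "nothing to send".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration only; these are not NetMQ types.
public enum DemoMsgType { Uninitialised, Data }

public struct DemoMsg
{
    public DemoMsgType Type;
    public byte[] Payload;

    public bool IsInitialised => Type != DemoMsgType.Uninitialised;

    public void InitData(byte[] payload)
    {
        Type = DemoMsgType.Data;
        Payload = payload;
    }

    // Mirrors the failure in the report: closing an uninitialised message throws.
    public void Close()
    {
        if (!IsInitialised)
            throw new InvalidOperationException("Cannot close an uninitialised Msg.");
        Type = DemoMsgType.Uninitialised;
        Payload = Array.Empty<byte>();
    }
}

public static class GuardDemo
{
    // Guarded pull: an uninitialised message from the pipe is reported as
    // "nothing to send" instead of being handed on to the encoder.
    public static bool GuardedPull(Queue<DemoMsg> pipe, ref DemoMsg msg)
    {
        if (pipe.Count == 0)
            return false;

        DemoMsg next = pipe.Dequeue();
        if (!next.IsInitialised)
            return false; // assumption: drop the bad entry rather than throw

        msg = next;
        return true;
    }

    // Stand-in for the encode step, which closes the message it was given.
    public static int EncodeAndClose(ref DemoMsg msg)
    {
        int length = msg.Payload.Length;
        msg.Close();
        return length;
    }
}
```

With a queue holding one default (uninitialised) DemoMsg followed by an initialised one, the first GuardedPull call returns false and the second returns true, so EncodeAndClose never sees the uninitialised entry. Note that a guard like this only hides the symptom: if YQueue really can hand out an uninitialised message, that root cause would still need to be found.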