zeromq / netmq

A 100% native C# implementation of ZeroMQ for .NET

Mitigating a FaultException in Mechanism.Encode. #1095

Open sonicpro opened 1 month ago

sonicpro commented 1 month ago

In some circumstances the Msg class throws a NetMQ.FaultException in the Mechanism.Encode method. The details are in https://github.com/zeromq/netmq/issues/1094

closes #1094

follesoe commented 1 month ago

Any chance you could have a look at this PR, @drewnoakes? I know I am not entitled to nag or demand from open-source maintainers, but it would be great to get some updates on the NetMQ core. Right now, I am maintaining my own fork and building from source rather than using the NuGet package to get some of these fixes included.

drewnoakes commented 1 month ago

While this does look like it'd suppress the exception, I'm not sure this is an actual fix. It might just allow the process to continue having skipped data or in some other invalid state, whereafter debugging the problem would probably be harder.

From the linked issue:

"I think on some circumstances the code in NetMQ.Core.Utils.YQueue can return a non-initialized message."

Indeed, looking at YQueue you can see it's not null-annotated, and there are a bunch of expectations around how the type is used. I wonder whether you'd be able to run a version of NetMQ with an implementation of YQueue that validates all nullness guarantees, to see if that's what's really going on. A process dump when the exception is thrown should provide insight into what state the application was in when the failure occurred.

https://stackoverflow.com/a/20238046/24874

A dump can be opened in Visual Studio to analyze the state of the process at the time of the crash. The instances of YQueue, YPipe and so on can then be inspected to check whether they are in a bad state.

I'm sympathetic to the problem here and want to find a solution that addresses the problem fully. It's just that the problem isn't well understood unfortunately.
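
To illustrate the idea above of a queue that validates its guarantees, here is a minimal sketch. It is not NetMQ's YQueue: `ValidatingQueue<T>` and `FakeMsg` are hypothetical names invented for this example, and the wrapper sits on top of the standard `Queue<T>` rather than reproducing YQueue's internal layout. The point is only that an uninitialized item fails loudly at the moment it enters or leaves the queue, instead of surfacing later as an exception somewhere downstream.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a message struct (not NetMQ's Msg).
// default(FakeMsg) has no data and therefore counts as uninitialised.
public struct FakeMsg
{
    public byte[] Data;
    public bool IsInitialised => Data != null;
}

// Hypothetical wrapper (not NetMQ's YQueue) that checks every item
// on the way in and on the way out.
public sealed class ValidatingQueue<T>
{
    private readonly Queue<T> _items = new Queue<T>();
    private readonly Func<T, bool> _isValid;

    public ValidatingQueue(Func<T, bool> isValid)
    {
        _isValid = isValid ?? throw new ArgumentNullException(nameof(isValid));
    }

    public int Count => _items.Count;

    public void Push(T item)
    {
        // Reject bad items at the producer side, where the stack trace is useful.
        if (!_isValid(item))
            throw new InvalidOperationException("Attempted to push an invalid item.");
        _items.Enqueue(item);
    }

    public T Pop()
    {
        if (_items.Count == 0)
            throw new InvalidOperationException("Attempted to pop from an empty queue.");

        T item = _items.Dequeue();

        // If this fires, the item was corrupted while it sat in the queue.
        if (!_isValid(item))
            throw new InvalidOperationException("Popped an invalid item.");
        return item;
    }
}
```

With a check such as `new ValidatingQueue<FakeMsg>(m => m.IsInitialised)`, pushing `default(FakeMsg)` throws immediately. A diagnostic build along these lines, applied to the real queue type, would show whether uninitialized messages are produced by the queue itself or arrive from elsewhere.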