planetarium / libplanet

Blockchain in C#/.NET for on-chain, decentralized gaming
https://docs.libplanet.io/
GNU Lesser General Public License v2.1

System.Net.Sockets.SocketException (22) on macOS #2740

Open longfin opened 1 year ago

longfin commented 1 year ago
Is there any known cause for `macos-netcore-test` failing with this error?
Passed!  - Failed:     0, Passed:   167, Skipped:     0, Total:   167, Duration: 19 s - /Users/distiller/project/Libplanet.Tests/bin/Release/net6.0/Libplanet.Tests.dll (net6.0)
The active test run was aborted. Reason: Test host process crashed : Unhandled exception. System.Net.Sockets.SocketException (22): Invalid argument
   at System.Net.Sockets.Socket.UpdateStatusAfterSocketErrorAndThrowException(SocketError error, String callerName)
   at NetMQ.Core.Transports.Tcp.TcpListener.InCompleted(SocketError socketError, Int32 bytesTransferred)
   at NetMQ.Core.Utils.Proactor.Loop()
   at System.Threading.Thread.StartCallback()

Results File: /tmp/junit/Libplanet.Net.Tests.xml

Passed!  - Failed:     0, Passed:    26, Skipped:     0, Total:    26, Duration: 26 s - /Users/distiller/project/Libplanet.Net.Tests/bin/Release/net6.0/Libplanet.Net.Tests.dll (net6.0)
Test Run Aborted with error System.Exception: One or more errors occurred.
 ---> System.Exception: Unable to read beyond the end of the stream.
   at System.IO.BinaryReader.Read7BitEncodedInt()
   at System.IO.BinaryReader.ReadString()
   at Microsoft.VisualStudio.TestPlatform.CommunicationUtilities.LengthPrefixCommunicationChannel.NotifyDataAvailable()
   at Microsoft.VisualStudio.TestPlatform.CommunicationUtilities.TcpClientExtensions.MessageLoopAsync(TcpClient client, ICommunicationChannel channel, Action`1 errorHandler, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---.

Exited with code exit status 1

The test run is terminated with these errors in `macos-netcore-test` on this PR.

Originally posted by @riemannulus in https://github.com/planetarium/libplanet/issues/2735#issuecomment-1397165002

longfin commented 1 year ago

`SocketException (22)` can be thrown by the .NET runtime on macOS for several reasons.

At first, I assumed this was a Linger-related issue, but in that case the error would occur on `.Accept()`...

https://github.com/planetarium/netmq/blob/b111a246ba189ee075d8480cc0e589154110dee4/src/NetMQ/Core/Transports/Tcp/TcpListener.cs#L240

Of course, it is possible that the function has been inlined, so the current guess may not be accurate.

longfin commented 1 year ago

Setting `noDelay` on the accepted socket may be the problem. (NetMQ made a similar fix 4 years ago.)

I'm not yet confident that this is a NetMQ-side bug, but it seems helpful for debugging the current situation.
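
To illustrate the hypothesis, here is a minimal sketch of guarding the `NoDelay` assignment on an accepted socket. It assumes the failure comes from the underlying `setsockopt` call returning `EINVAL` (22) when the peer has already closed the connection, which has not been confirmed for this issue. `AcceptedSocketOptions` and `TrySetNoDelay` are hypothetical names for illustration; they are not part of NetMQ or Libplanet, and this is not the actual NetMQ fix.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical helper for illustration only; not a NetMQ or Libplanet API.
public static class AcceptedSocketOptions
{
    // Tries to enable TCP_NODELAY on an accepted socket. Returns false and
    // reports the socket error instead of letting the exception escape and
    // crash the thread that runs the accept loop.
    public static bool TrySetNoDelay(Socket accepted, out SocketError error)
    {
        try
        {
            accepted.NoDelay = true;
            error = SocketError.Success;
            return true;
        }
        catch (SocketException e)
        {
            // Assumption: on macOS this is where "Invalid argument" (22)
            // would surface if the peer disconnected right after accept.
            error = e.SocketErrorCode;
            return false;
        }
        catch (ObjectDisposedException)
        {
            // The socket was closed locally before the option could be set.
            error = SocketError.OperationAborted;
            return false;
        }
    }
}
```

A caller could close the accepted socket and keep listening when `TrySetNoDelay` returns `false`, rather than treating the failure as fatal. Whether that is appropriate for `TcpListener.InCompleted` depends on the actual root cause.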

longfin commented 1 year ago

https://smartos.org/bugview/OS-6312 Maybe related? 🤔

echo "xnet_skip_checks/W1" | mdb -kw

If this workaround does the trick, this is probably a timing issue involving `_router` in `NetMQTransport`.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.