Closed somecoolword closed 4 years ago
Hrm. That's not a good sign. Is this the only stacktrace where the issue shows up, or have you seen it in other areas?
For now, it's the only one. Have caught this exception a couple of times with the same stacktrace.
looks like mine
System.Exception: HasValueSentinel - Should not be invoked.
at System.Runtime.CompilerServices.ResultHolder.<>c.<.cctor>b__4_2()
at System.Runtime.CompilerServices.ResultHolder`1.TryInvoke(Action callback)
at System.Runtime.CompilerServices.ResultHolder`1.TrySetExceptionOrResult(Exception exception, T result)
at System.Runtime.CompilerServices.ResultHolder`1.TrySetResult(T result)
at System.Runtime.CompilerServices.ResultHolder`1.SetResult(T result)
at ReusableTasks.ReusableTaskCompletionSource`1.SetResult(T result)
at MonoTorrent.Client.Connections.SocketConnection.HandleOperationCompleted(Object sender, SocketAsyncEventArgs e)
at System.Net.Sockets.SocketAsyncEventArgs.OnCompleted(SocketAsyncEventArgs e)
at System.Net.Sockets.SocketAsyncEventArgs.FinishOperationSuccess(SocketError socketError, Int32 bytesTransferred, SocketFlags flags)
at System.Net.Sockets.SocketAsyncEventArgs.CompletionPortCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* nativeOverlapped)
at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)
An unhandled exception of type 'System.Exception' occurred in ReusableTasks.dll
edit by alan: I wrapped your stacktrace in ``` quotes so it renders nicely :)
Ah excellent! I'll go review the logic and see what's going on there. That exception is hopefully going to lead to something!
I can't trigger this issue, nor have I spotted anything obviously incorrect yet.
Could you try with this branch and see if either of the two new exceptions are being thrown? I suspect part of the issue could be concurrent sends/receives, but that seems unlikely too. https://github.com/mono/monotorrent/tree/wip-181-182
hi alan , I used your latest code, the exception still throwed. I noticed, if I add the second torrent, the exception nearly throwed immediately and, I use vs2017 version 15.9.17 and set c# version to 7.2 and the console has no sign of InvalidProgramException
Can you share the code you're using to load/start torrents? Is any of that code using ReusableTask or ReusableTaskCompletionSource directly? I can't trigger any issues when loading multiple torrents, or during the time they download.
If your code is public I'd be happy to compile/run it (if I can do so on a desktop) if all else fails!
I am using your code with no single line of change, the exception throwed quite often, I am trying to figure out some pattern suddenly, the exception seems gone, with the same torrents
Oh, i thought you meant the console application did not have the same issue.
Hrm, this is quite odd then. I've downloaded dozens, or hundreds, of torrents and haven't triggered this kind of reuse issue. Could you send me a copy of the binaries compiled with VS 15.9.17? I can try using those locally and see if it's a compilation issue. I doubt that's the case, but it might be nice to rule that out!
bin.zip change the zip to 7z
the exception knock my door again and output window has no sign of InvalidProgramException, only InvalidTaskReuseException I recently update to the latest code, and nearly no single of torrent can be downloaded. the exception throwed soon after I change int maximumHalfOpenConnections = 30;
ReusableTasks.InvalidTaskReuseException: HasValueSentinel - Should not be invoked. at System.Runtime.CompilerServices.ResultHolder.<>c.<.cctor>b__4_2() at System.Runtime.CompilerServices.ResultHolder
1.TryInvoke(Action callback)
at System.Runtime.CompilerServices.ResultHolder1.TrySetExceptionOrResult(Exception exception, T result) at System.Runtime.CompilerServices.ResultHolder
1.TrySetResult(T result)
at System.Runtime.CompilerServices.ResultHolder1.SetResult(T result) at ReusableTasks.ReusableTaskCompletionSource
1.SetResult(T result)
at MonoTorrent.Client.Connections.SocketConnection.HandleOperationCompleted(Object sender, SocketAsyncEventArgs e)
at System.Net.Sockets.SocketAsyncEventArgs.OnCompleted(SocketAsyncEventArgs e)
at System.Net.Sockets.SocketAsyncEventArgs.FinishOperationSuccess(SocketError socketError, Int32 bytesTransferred, SocketFlags flags)
at System.Net.Sockets.SocketAsyncEventArgs.CompletionPortCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped nativeOverlapped)
at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped pOVERLAP)`
I think i have a lead on the issue!
Hopefully it's this: https://github.com/alanmcgovern/ReusableTasks/releases/tag/release-v1.0.5 :)
When you have a rough idea where the issue might be, a really terrible stress test can reveal the problem https://github.com/alanmcgovern/ReusableTasks/commit/46e2183db35fb36154a0d4c9be986e00187a9928#diff-3bc5e60f74af9bbaf085989fd17212eeR103-R115 :)
If the issue hasn't been resolved please reopen this and i'll look some more! I clearly need a faster computer to properly test these things :P
well, I do have a desktop with 8 cores, that explains a lot . and you remind me of the hell like experiences when I was programming the multi threads app using the old technology I will test the fix code, and update the result
Got the same System.InvalidOperationException: A result has already been set on this object
error but in a different place with the latest ReusableTask
Calling this
NetworkIO.SendAsync (MonoTorrent.Client.Connections.IConnection2 connection, System.Byte[] buffer, System.Int32 offset, System.Int32 count, MonoTorrent.Client.RateLimiters.IRateLimiter rateLimiter, MonoTorrent.SpeedMonitor peerMonitor, MonoTorrent.SpeedMonitor managerMonitor)
Got this
ResultHolder`1[T].SetResult (T result)
ReusableTaskMethodBuilder.SetResult ()
NetworkIO.SendAsync (MonoTorrent.Client.Connections.IConnection2 connection, System.Byte[] buffer, System.Int32 offset, System.Int32 count, MonoTorrent.Client.RateLimiters.IRateLimiter rateLimiter, MonoTorrent.SpeedMonitor peerMonitor, MonoTorrent.SpeedMonitor managerMonitor)
StateMachineCache`1[TStateMachine].OnCompleted ()
ResultHolder`1[T].TryInvoke (System.Action callback)
ResultHolder`1[T].TrySetExceptionOrResult (System.Exception exception, T result)
ResultHolder`1[T].TrySetResult (T result)
ResultHolder`1[T].SetResult (T result)
ReusableTaskCompletionSource`1[T].SetResult (T result)
SocketConnection.HandleOperationCompleted (System.Object sender, System.Net.Sockets.SocketAsyncEventArgs e)
SocketAsyncEventArgs.OnCompleted (System.Net.Sockets.SocketAsyncEventArgs e)
SocketAsyncEventArgs.Complete ()
<.cctor>b__318_12 (System.IAsyncResult ares)
SocketAsyncResult+<>c.<Complete>b__27_0 (System.Object state)
IThreadPoolWorkItem.ExecuteWorkItem ()
ThreadPoolWorkQueue.Dispatch ()
_ThreadPoolWaitCallback.PerformWaitCallback ()
Just to be 100% sure, you're using ReusableTasks 1.0.6?
I just checked myself. This error happened on November 30th, the latest version at that time was 1.0.5 So, the error seems to be related to the previous version. Right now I'm using 1.0.6 version. Such error wasn't discovered after the 2nd of December, the day the latest version of ReusableTasks was released. I will report if this happens again with the latest version.
Perfect! Hopefully 1.0.6 continues to be error free :)
I'm getting the same exception in a newer version. I'm on MonoTorrent 1.0.19 and double checked that ReusableTasks is 1.0.6.
System.InvalidOperationException: A result has already been set on this object
at System.Runtime.CompilerServices.ResultHolder`1.SetResult(T result)
at MonoTorrent.Client.Connections.SocketConnection.HandleOperationCompleted(Object sender, SocketAsyncEventArgs e)
at System.Net.Sockets.SocketAsyncEventArgs.OnCompleted(SocketAsyncEventArgs e)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Net.Sockets.SocketAsyncEventArgs.FinishOperationSuccess(SocketError socketError, Int32 bytesTransferred, SocketFlags flags)
at System.Net.Sockets.SocketAsyncEventArgs.CompletionPortCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* nativeOverlapped)
at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)
Unfortunately, it doesn't seem to repro consistently. What other info can I provide?
This is unfortunate and annoying!
I wonder if there's a bug or race condition in SocketConnection itself which is causing this. I've reviewed ReusableTask a few times and it seems correct and race condition free. This exception only seems to show up in SocketConnection.
Realistically there's little you can do to help resolve this. A repro is unlikely to be easy as it's a rare race condition.
There are two things I can do:
1) catch this exception so it doesn't have any I'll effects other than causing one socket/connection to die.
2) investigate SocketConnection itself to see if there an issue there. I find that unlikely as it's fairly simple code.
@borigas while there is no repro for this issue - how frequently do you see it and which .net runtime are you seeing it on?
I could build a special version of MonoTorrent with extra logging if you can reproduce this
It's rare. None of our test machines have reproduced it. We've only seen it a handful of times in prod.
We're on Framework 4.6.1
Interesting. I don't build releases for that framework version so I'm kind surprised to see it working! Are you self-building MonoTorrent and targeting 4.6.1 directly?
I can make a moderately inefficient change in a branch which should let us determine whether it's the callback being invoked twice, or if it's a bug where I really am incorrectly setting a result on the same ReusableTask object from two threads concurrently. It's also possible there's a bug in ReusableTask itself, though at this stage it's hard to figure out which is most likely.
We're using the published NuGet package. .NET 4.6.1 supports .NET Standard up to 2.0 by including some extra assemblies in the bin folder
In a past job we couldn't use those packages because of versioning concerns with the extension system. We could only use NS2 when the core upgraded to depend on .NET 4.7.2 so I'd forgotten they existed!
@borigas The ReusableTaskCompletionSource object used for connecting is not reused. As such the odds of the issue being related to MonoTorrent or ReusableTask itself are quite slim.
As odd as it seems, I could find another report of a similar issue ( https://github.com/Azure/amqpnetlite/issues/327 ). While I really doubt this is the issue we're seeing here, it's worth ruling it out.
I added some code which should throw a different exception if the completion source is seen twice. Can you either compile this branch locally and test with it - https://github.com/alanmcgovern/monotorrent/tree/diagnose-task-issue or use the nupkg files i've attached in this zip:
If you see this exception in your logs throw new Exception ("The completion delegate has been invoked twice");
then I can fix this trivially. If you continue to see the old exception then we've successfully ruled out the possibility that SocketAsyncEventArgs is invoking the Completed handler twice in some cases.
NOTE: This introduces a permanent memory leak in the region of about 80 bytes per connection attempt. That should be low enough to not be a problem no matter how long you run the client for.
Sure, I'll give that a try and let you know when I see either exception. Thanks for your help!
two days ago, I use the latest code and download 5 torrents, everything is fine, no exception throwed. the test last over ten hours. and on Nov 2019, I switched to netCore3.0, the previous platform is netFramework4.7.2
Sure, I'll give that a try and let you know when I see either exception. Thanks for your help!
@borigas it's probably too early to tell, but have you been able to reproduce any issues with the new custom build?
Thanks for the reminder. I hadn't checked because I assumed it was too early, but it looks like happened on 1 client shortly after updating. Here's the stacktrace:
System.InvalidOperationException: A result has already been set on this object
at System.Runtime.CompilerServices.ResultHolder`1.SetResult(T result)
at MonoTorrent.Client.Connections.SocketConnection.HandleOperationCompleted(Object sender, SocketAsyncEventArgs e)
at System.Net.Sockets.SocketAsyncEventArgs.OnCompleted(SocketAsyncEventArgs e)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Net.Sockets.SocketAsyncEventArgs.FinishOperationSuccess(SocketError socketError, Int32 bytesTransferred, SocketFlags flags)
at System.Net.Sockets.SocketAsyncEventArgs.CompletionPortCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* nativeOverlapped)
at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)
This is interesting. I think i need to add more changes.
For some reason i thought this was happening during ConnectAsync
. The changes i made would've caused a different exception to be thrown if that were the case ( https://github.com/alanmcgovern/monotorrent/commit/803f1d081cf72c77417dd72ab48f89fd864bdb13#diff-ab68f0bff1f2614bfe766bc80510470aR153 ). As i'm not seeing the new exception type, the most likely explanation is that this is occurring during a ReceiveAsync
or SendAsync
. After re-reading the thread, i don't know why i thought this was specific to ConnectAsync
in the first place.
I'll change the code around a little more and see if we can pinpoint what's causing this!
Ok - so i have some new diagnostics in there. Unfortunately you will have to make a code change to dump the information.
I added two new static methods on SocketConnection:
static string[] DoubleReceiveStacktraces { get; }
static string[] DoubleSendStacktraces { get; }
These are both threadsafe. Can you query those repeatedly (maybe on a timer and do it every few seconds?) and if they are ever non-empty, dump the content of the array to disk and send it to me? It'll contain the stacktrace (which should be more or less guaranteed to only contain monotorrent specific information) along with some boolean data so i know if the issue occurred before or after the socket was disposed.
The new logging should tell me if there's a bug in the SendAsync
or ReceiveAsync
codepaths. I left the other logging in place to let me know if there's a bug in the ConnectAsync
codepath.
I hope you'll either see stacktraces showing up in the new two properties, or you'll see the throw new Exception ("The completion delegate has been invoked twice");
logging showing up. If none of that occurs then I'm really going to have to think :p
I attached a pre-compiled nupkg to this comment, but you can also compile from source and/or review the diagnostics I added by checking out this commit: https://github.com/alanmcgovern/monotorrent/commit/f313442e516bc92dd5645d4f5eb613b6717d6c66
@borigas Let me know what happens with those changes :) Hopefully its enough to pinpoint the issue.
Thanks @alanmcgovern! I'll give that a try and let you know what I see
@borigas Just wanted to check in again and see if the new logging has triggered. Have those properties ever been non-empty for you?
@alanmcgovern not yet. The original exception also hasn't reproduced on this build. Only a few of our clients have updated though, so it doesn't shock me. I'll keep checking
@borigas Thanks for putting up with my followups! And yes, this is another follow up ;) I was curious if you have seen this issue yet. If not, i'll check back in in another week or two!
@alanmcgovern thanks for taking an interest in finding a fix. Not many maintainers care like this. I've really appreciated it.
It hasn't reproduced yet, but it'll still running on a pretty small subset of our clients. We should have a better sample size soon
The new logging hit this morning. DoubleSendStackTrace contained:
Inc - Disposed: False - at System.Environment.GetStackTrace(Exception e, Boolean needFileInfo)
at System.Environment.get_StackTrace()
at MonoTorrent.Client.Connections.SocketConnection.<SendAsync>d__57.MoveNext()
at MonoTorrent.Client.Connections.SocketConnection.SendAsync(Byte[] buffer, Int32 offset, Int32 count, Task delayCompletion)
at MonoTorrent.Client.Connections.SocketConnection.SendAsync(Byte[] buffer, Int32 offset, Int32 count)
at MonoTorrent.Client.NetworkIO.<SendAsync>d__12.MoveNext()
at System.Runtime.CompilerServices.StateMachineCache`1.OnCompleted()
at MonoTorrent.Client.MainLoop.Loop()
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()
and
Dec - Disposed: False - at System.Environment.GetStackTrace(Exception e, Boolean needFileInfo)
at System.Environment.get_StackTrace()
at MonoTorrent.Client.Connections.SocketConnection.<SendAsync>d__57.MoveNext()
at MonoTorrent.Client.Connections.SocketConnection.SendAsync(Byte[] buffer, Int32 offset, Int32 count, Task delayCompletion)
at MonoTorrent.Client.Connections.SocketConnection.SendAsync(Byte[] buffer, Int32 offset, Int32 count)
at MonoTorrent.Client.NetworkIO.<SendAsync>d__12.MoveNext()
at System.Runtime.CompilerServices.StateMachineCache`1.OnCompleted()
at MonoTorrent.Client.MainLoop.Loop()
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()
It's great that the issue has been slightly narrowed down. However it's frustrating in that the async stacktrace isn't quite enough to figure out the issue. There are few places which call either of those methods and they both look fairly safe. From review I just can't tell how we could call SendAsync twice on the same connection. My best guess at the moment is that either there's a fundamental race condition in ReusableTask (unlikely?) or i'm just blind and am missing an obvious racey double-send.
I will add similar logging to two more places in the code (PeerIO.SendAsync
and NetworkIO.SendAsync
) and that should give us insight into which caller is causing the problem.
@borigas Hopefully this long saga will come to an end soon!
Can you try this commit https://github.com/alanmcgovern/monotorrent/commit/54a6fe6f6fa126b952fff143aa7a0a663f6b6815
For your convenience I've attached precompiled binaries. The testing procedure is the same as previously, please repeatedly check the DoubleSendStackTrace
property and log it if it's ever non-null.
There are now two other ways for the stacktrace to be populated. Please keep running the client until there is a entry which looks different to the stacktrace you just supplied. I'm capturing the stack at two additional points, at least one of which should hopefully contain the true culprit!
monotorrent-1.0.39-alpha-special-debug.zip
EDIT: Thanks for bearing with me. I feel we're slowly narrowing in on the issue :)
@alanmcgovern no problem. Again, I appreciate your help debugging this.
Sorry for the delay, but I just got around to updating to the latest build. I'll get back when we notice a new log message. Thanks for your help
@borigas I just landed a fairly large refactor which changes some of the async code in the library. If you're still seeing this issue showing up at a low rate with the revision you're on, would you be interested in trying a pre-release build from master to see if the issue has been addressed? This build is a significant API break from the 1.x series, and the API is continuing to evolve as I land more work over the next few weeks.
I've added a new 'stress test' application which uses about 1GB of RAM and initialises 16 clients to do in-memory transfers for stress testing/profiling purposes. When running this stress test I just cannot trigger this issue, and I've 'transferred' terrabytes of data at this point.
If you clone/build master you can run the stress test by compiling/running 'SampleClient' with this patch applied:
diff --git a/src/SampleClient/main.cs b/src/SampleClient/main.cs
index 635eb8ea..46a9547c 100644
--- a/src/SampleClient/main.cs
+++ b/src/SampleClient/main.cs
@@ -30,8 +30,8 @@ namespace SampleClient
{
// Uncomment this to run the stress test
//
- //var tester = new StressTest ();
- //tester.RunAsync ().Wait ();
+ var tester = new StressTest ();
+ tester.RunAsync ().Wait ();
/* Generate the paths to the folder we will save .torrent files to and where we download files to */
basePath = Environment.CurrentDirectory; // This is the directory we are currently in
@alanmcgovern thanks for the update. If you think it may already be fixed, I don't think we need to pursue this in the existing version. Unfortunately, I'm pretty swamped for awhile so I can't commit to timing on testing. I propose closing this for now and circling back if it repros in the new version. Thanks again for all your help trying to track this down
Sounds good to me!
Thanks, and do let me know if it recurs with a master based build when you get around to trying it.
It'll be versioned 1.9.x or higher.
I still haven't had a chance to update and still agree with the above plan of updating to the refactored version, but I noticed this in my logs today while looking at something else. I wanted to dump the newest logging output here for the sake of completeness.
DoubleSendStackTrace contained
at System.Environment.GetStackTrace(Exception e, Boolean needFileInfo)
at System.Environment.get_StackTrace()
at MonoTorrent.Client.Connections.SocketConnection.set_NetworkIOSend(Boolean value)
at MonoTorrent.Client.NetworkIO.<SendAsync>d__12.MoveNext()
at MonoTorrent.Client.NetworkIO.SendAsync(IConnection2 connection, Byte[] buffer, Int32 offset, Int32 count, IRateLimiter rateLimiter, SpeedMonitor peerMonitor, SpeedMonitor managerMonitor)
at MonoTorrent.Client.Encryption.PeerAEncryption.<DoneReceiveY>d__5.MoveNext()
at MonoTorrent.Client.Encryption.PeerAEncryption.DoneReceiveY()
at MonoTorrent.Client.Encryption.EncryptedSocket.<ReceiveY>d__37.MoveNext()
at System.Runtime.CompilerServices.StateMachineCache`1.OnCompleted()
at System.Runtime.CompilerServices.ResultHolder`1.TrySetExceptionOrResult(Exception exception, T result)
at System.Runtime.CompilerServices.ResultHolder`1.SetResult(T result)
at System.Runtime.CompilerServices.ReusableTaskMethodBuilder.SetResult()
at MonoTorrent.Client.Encryption.EncryptedSocket.<ReceiveMessage>d__41.MoveNext()
at System.Runtime.CompilerServices.StateMachineCache`1.OnCompleted()
at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
at System.Threading.ThreadPoolWorkQueue.Dispatch()
and
at System.Environment.GetStackTrace(Exception e, Boolean needFileInfo)
at System.Environment.get_StackTrace()
at MonoTorrent.Client.Connections.SocketConnection.set_NetworkIOSend(Boolean value)
at MonoTorrent.Client.NetworkIO.<SendAsync>d__12.MoveNext()
at System.Runtime.CompilerServices.StateMachineCache`1.OnCompleted()
at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
at System.Threading.ThreadPoolWorkQueue.Dispatch()
Hah, brilliant! That was enough to discover one root cause! Patch incoming!
@borigas that particular one is fixed by https://github.com/alanmcgovern/monotorrent/commit/86dc9a45bef3681a6ee7e33144eb1a568b27a47b and can be backported relatively easily.
Thanks for following up with the stacktraces. They were a bit of a long shot considering 'async' code rarely has a good stacktrace, but it worked out in the end!
Fix backported to the -1.0 branch too. I'll issue a release in the not too distant future with this.
Some times this exception thrown for ReusableTaskMethodBuilder.SetResult() method