jaegertracing / jaeger-client-csharp

🛑 This library is DEPRECATED!
https://jaegertracing.io/
Apache License 2.0
302 stars 67 forks source link

Fix Lost Spans on Close #182

Closed phxnsharp closed 4 years ago

phxnsharp commented 4 years ago

Which problem is this PR solving?

Fixes #181

Short description of the changes

Changes InMemorySender to reproduce the lost spans problem. Then fixed the lost spans problem. A number of minor mistakes in the threading was causing both thread pool exhaustion through long lived calls on the thread pool, and thread leaks from previous tests.

phxnsharp commented 4 years ago

I would prefer keeping the ProcessQueueLoop async if possible and checking if the executed command can be made non-blocking and also really async. This should also improve the SDK as a whole.

Using SemaphoreSlim to make this completely asynchronous should solve the problem.

Thanks for your review! This is great stuff! Your points are totally valid. Let me clarify some of what I discovered while writing this:

One "middle of the road" approach would be to encapsulate the concept of WaitAsync into a new ManualResetEventSlim extension method. For now it can use my long running Task solution. If an appropriate third party or Microsoft provided solution becomes available it would be easy to swap in.

phxnsharp commented 4 years ago

I'm more than happy to get on a call at some point if it would help our understanding of each other! Just let me know.

phxnsharp commented 4 years ago

If you want to use it, here is an AsyncManualResetEvent. I have no idea if it is "good". https://github.com/StephenCleary/AsyncEx/wiki/AsyncManualResetEvent

phxnsharp commented 4 years ago

The same package has a potential candidate for an async BlockingCollection: https://github.com/StephenCleary/AsyncEx/wiki/AsyncCollection

That package is MIT license. You are Apache-2. I don't know what your rules are for incorporation.

Falco20019 commented 4 years ago

Thanks for your input. I just recently used ConcurrentQueue<T> as asynchronous blocking collection. According to https://apisof.net/catalog/System.Collections.Concurrent.ConcurrentQueue%3CT%3E it also is available for our targets. So this might also help. So this might solve some of our problems too! I need some more time the next days to look into it. If you have time, feel free to check if it would help us. As it's part of the framework, it would also not create any licensing topics to analyze.

phxnsharp commented 4 years ago

I just recently used ConcurrentQueue<T>

BlockingCollection is actually built on top of ConcurrentQueue: https://referencesource.microsoft.com/#system/sys/system/collections/concurrent/BlockingCollection.cs. It uses SemaphoreSlim to implement the blocking aspects, so it potentially could be turned into an Async version with relatively little pain. Except the SemaphoreSlim objects are private so the only way to implement it would be to copy-pasta the whole thing and modify :(

phxnsharp commented 4 years ago

@Falco20019 This is serious issue for us as it occurs very frequently and looks really bad to the user. My patch, while not the ultimate architecturally correct solution, does not make anything worse than it already was. In fact, the solution given prevents blocking calls on thread pool threads in a way that is easy to understand and won't be hard to change later when an asynchronous blocking collection is created or found. Our next release is planned for later this year. Is there any chance this could be merged in time for us to use it for then? Thanks and sorry to be a pain!

Falco20019 commented 4 years ago

@phxnsharp Sorry for the delays. I‘m currently busy with some other work that I need to get done by the end of this week. I try to give it a look as soon as possible. There is another ticket at #185 that requires a 0.4.1, so I will move this to that version too.

Falco20019 commented 4 years ago

Thanks for your patience. I decided to merge this work-around to resolve your issue for now. I want to give it another look before 1.0 though.

You can reference the new version released as 0.4.1

phxnsharp commented 4 years ago

Thank you!

phxnsharp commented 4 years ago

@Falco20019 I have not tried these, but ran across a couple of potential nice solutions to fix this the right way:

The class BufferBlock from https://www.nuget.org/packages/System.Threading.Tasks.Dataflow/

The new System.Threading.Channels package: https://devblogs.microsoft.com/dotnet/an-introduction-to-system-threading-channels/