Celtoys / Remotery

Single C file, Realtime CPU/GPU Profiler with Remote Web Viewer
Apache License 2.0
3.11k stars 264 forks source link

Thread sampling hanging with debug CRT? #177

Open uucidl opened 3 years ago

uucidl commented 3 years ago

Hi,

I know it's silly to profile a program linked with the debug CRT, but I do sometimes use Remotery simply to see what's going on in our app, and that means sometimes in Debug. I noticed when upgrading to the latest Remotery that with the thread sampler on, I end up with deadlocks.

When this happens I'm in CheckForStallingSamples and which allocates (via rmtMalloc) ; we usually end up with a NtWaitForAlertByThreadId (from ucrtbased.dll!__acrt_lock, called from ucrtbased.dll!malloc).

Once this happens to our main thread, we basically hang.

For now I'm going to disable the thread sampler by default when building our debug builds, however I feel this should at least be documented in Remotery's readme. Or an even better solution.

dwilliamson commented 3 years ago

So if the thread that's been interrupted has any form of lock that Remotery can try to acquire, this will fail. That's totally not ideal and really, Remotery shouldn't be doing that much work while a thread is suspended, anyway.

I consider this a genuine bug and not behaviour to be aware of.

The whole sample tree needs to be copied when a stalling sample is detected so I think the solution here is to have an object allocator specific to the sampling thread that is pre-allocated with enough samples for a worst-case sample tree.

uucidl commented 3 years ago

Glad to hear! Yeah it feels more healthy not to use the CRT allocator in that code path.

By the way, I also noticed something when reviewing that part of the code. Above the call to CheckForStallingSamples you mention that SuspendThread is async and that the guarantee it has completed is that we called GetThreadContext before. However, the call to rmtGetUserModeThreadContext only happens if sample_count == 0 , at least that's what I understood.

dwilliamson commented 3 years ago

Yes, there is a chance for this to fail. It's miniscule and hasn't shown in testing yet but the chance is there, nonetheless.

As far as I can tell, there is no problem with moving this into the branch but this is multi-threaded programming, i.e. I'm probably wrong. The ideal route appears to be to suspend the thread only when sample_count is zero but that looks like it will require some testing.