Celtoys / Remotery

Single C file, Realtime CPU/GPU Profiler with Remote Web Viewer
Apache License 2.0
3.03k stars 255 forks source link

Assert when destroying global instance while sample tree isn't empty #231

Closed britzl closed 1 year ago

britzl commented 1 year ago

We recently switched to Remotery for the Defold engine and for debug builds we call rmt_CreateGlobalInstance on engine startup and rmt_DestroyGlobalInstance on engine shutdown. This works well.

I'm now playing around with an API to let developers explicitly start and stop Remotery at any time and while doing so I'm experiencing a crash when calling rmt_DestroyGlobalInstance:

Assertion failed: (allocator->nb_inuse == 0), function ObjectAllocator_Destructor, file Remotery.c, line 2256.
INFO:CRASH: Successfully wrote Crashdump to file: /Users/bjornritzl/Library/Application Support/Defold/_crash
ERROR:CRASH: CALL STACK:

# 0 pc      0x69dfd libc++abi.dylib _sigtramp+29
# 1 pc      0x1e00f libc++abi.dylib __pthread_kill+11
# 2 pc      0x541ff libc++abi.dylib pthread_kill+263
# 3 pc     0x218d24 libxpc.dylib abort+123
# 4 pc     0x2180cb libxpc.dylib err+0
# 5 pc     0x3620ff dmengine ObjectAllocator_Destructor+159
# 6 pc     0x3662b0 dmengine SampleTree_Destructor+176
# 7 pc     0x3660e7 dmengine ThreadProfiler_Destructor+71
# 8 pc     0x361edb dmengine ThreadProfilers_Destructor+139
# 9 pc     0x35f34f dmengine Remotery_Destructor+255
#10 pc     0x35f526 dmengine _rmt_DestroyGlobalInstance+166

What seems to happen is that there are samples left that haven't been free'd when I call rmt_DestroyGlobalInstance. I see that in SampleTree_Destructor there's a call to ObjectAllocator_Free on the root sample, but as far as I understand this does not free any child samples. Shouldn't the equivalent to FreeSamples be called in the SampleTree_Destructor to ensure that the root and child samples get free'd?

I've tried this myself and it sure seems to do the trick, but perhaps there is some nuance to this that I haven't seen? It was my colleague @JCash who did the main bulk of the work to migrate to Remotery and I've only just now started poking around in the code.

britzl commented 1 year ago

Hmm, I spoke too soon. I'm still getting a crash, although less frequently. So I guess what I'm wondering now is how to correctly start Remotery and then later call rmt_DestroyGlobalInstance to gracefully shut down Remotery?

dwilliamson commented 1 year ago

The first question: is this a GPU sample tree or a CPU sample tree?

On the CPU SampleTree::root is expected to have no children by the time you shutdown Remotery, because: when you close your last Sample Remotery chops them all off from the root and pushes them into a message queue for the Remotery thread to send away. See ThreadProfiler_Pop for that behaviour.

When the Remotery thread exits it calls Remotery_FlushMessageQueue which goes through all message queues and releases the allocated samples. See Remotery_Destructor for the guarantee that the Remotery thread is joined (blocks until exit) before the ThreadProfilers are deleted.

On GPU traces this is a bit more complicated and we can investigate more if this is your problem.

However: this assert strategy is designed only to be a debugging aid in the one init/shutdown per game scenario and is not designed to be called for multiple init/shutdowns.

So the options are:

Edit: In ThreadProfiler_Destructor check the value of index to determine the rmtSampleType.

britzl commented 1 year ago

@dwilliamson thank you for taking the time to write such a detailed answer!

The first question: is this a GPU sample tree or a CPU sample tree?

CPU

On the CPU SampleTree::root is expected to have no children by the time you shutdown Remotery

If this is a CPU sample tree asserting then there is a new bug that needs to be found.

Well, I was kinda brutal and called rmt_DestroyGlobalInstance immediately on user input while inside the engine main loop.

With risk of derailing this issue: I added a way for users to manually start and stop the profiler from their game code, instead of automatically starting the profiler when the engine starts and shutting it down again during engine shutdown. The reason why I'm doing this is that we have reports from users on Windows that their games are freezing on startup while running in debug mode (ie with Remotery enabled). All evidence is pointing towards anti virus or other security software blocking the game, probably because of the socket being opened (still under investigation). If we do not automatically start Remotery on engine startup and instead let the user start it manually once the engine is running it seems to be working. There is no problem on macOS, Linux, Android or iOS. Only Windows.

dwilliamson commented 1 year ago

Yes, that's brutal and we only support calling rmt_DestroyGlobalInstance when all of your CPU/GPU samples have been closed.

There is potential for supporting arbitrary close by making ObjectAllocator keep track of allocated samples as well but that's nowhere near my priority at the moment. If you're willing to live with the memory leaks for now then you can remove the assert.

With your startup freeze tests, have you tried disabling the automatic thread sampler and seeing what happens? rmtSettings::enableThreadSampler = RMT_FALSE.

britzl commented 1 year ago

With your startup freeze tests, have you tried disabling the automatic thread sampler and seeing what happens? rmtSettings::enableThreadSampler = RMT_FALSE.

It turns out that this was indeed what was causing the issue on Windows! Thank you for suggesting it!

dwilliamson commented 1 year ago

I think I need to add a runtime toggle to turn that on, so that it can be disabled by default.

It interferes with the debugger and can prevent you adding breakpoints.