amzn / hawktracer

HawkTracer is a highly portable, low-overhead, configurable profiling tool built in Amazon Video for getting performance metrics from low-end devices.
MIT License
133 stars 31 forks source link

ht_timeline_flush is not thread safe #59

Closed ftanuma closed 4 years ago

ftanuma commented 5 years ago

I just realized that ht_timeline_flush() is not a thread safe API even if we specify true for HT_Boolean thread_safe parameter in ht_timeline_create().

If it is design or limitation, it is fine but it would be great if there is good documentation about what "thread_safe" for ht_timeline_create() covers.

As a side note, I discovered this when I tried to call ht_timeline_flush() for each timelines before I call ht_file_dump_listener_flush() so that I can make sure that all the data so far will be flushed to listener.

Thanks!

loganek commented 5 years ago

Thanks for the report. This behaviour is definitely not by design - ht_timeline_flush() should be thread safe when thread_safe flag is enabled (otherwise, the function is not thread safe). I've commited the fix to the master branch.

Btw. Did you observe the bug by using one of the listeners in hawktracer core? All of them are thread-safe by themselves, so even though ht_timeline_flush() had a bug, they should work as expected.

ftanuma commented 5 years ago

I just discovered this while I was running my app with TSAN.

At that time, I was doing like this. // sudo code for (auto timeline: allTimelines) { ht_timeline_flush(timeline); } listener->flush();

This was to make sure all the remaining data in each timeline will be flushed to listener before I flush the listener. Then I realized that if I change the buffer size of timeline to 0, the event data is always flushed to listener so I don't need to flush each timeline like this. (But then I want option to do this for global timelines :) )

Though, this is kind of trade off: If I make timeline buffer size to 0, it always put data to listener, which I set it to thread-safe. That means, each event record will need to lock once. If I understand correctly.

loganek commented 4 years ago

OK, thanks for clarifications. I'm resolving the issue, as ht_timeline_flush() is thread-safe now.