wolfpld / tracy

Frame profiler
https://tracy.nereid.pl/
Other
9.95k stars 657 forks source link

Would it make sense to allow reuse for the allocated source locations? #269

Closed nagisa closed 3 years ago

nagisa commented 3 years ago

I have a use-case where I would like to enter and exit the same zone many times over. Conceptually, something like this seems like it would be a super-optimal way to approach the task (in semi-pseudocode):

// `location` is initialized only once on first execution. Assume necessary synchronization etc. is in place.
static auto location = ___tracy_alloc_srcloc_name(__LINE__, __FILE__, strlen(__FILE__), __FUNC__, strlen(__FUNC__), "name", 4);
auto zone = ___tracy_emit_zone_begin_alloc(location, 1);
//...
___tracy_emit_zone_end(zone);

However today this code is invalid, because the data behind the location identifier is deallocated when the client sends out the zone. Therefore on the next execution of this code the ___tracy_emit_zone_begin_alloc will be referencing invalid (deallocated) data.

I wonder if it would make sense to send out the srcloc as a separate message to the profiler, instead? We could enable efficient reuse of the locations and significantly reduce the amount of data transferred between the application being profiled and the profiler (by the virtue of not sending the same locations over and over again).

WDYT?

wolfpld commented 3 years ago

How does this differ from using tracy::SourceLocationData?

nagisa commented 3 years ago

Huh, so I think you're right, that this already exists as a concept. In particular the APIs such as ZoneNamedS do this. From what I can see what happens is that this ultimately emits a QueueType::ZoneBegin{,Callstack} with this statically allocated SourceLocationData. I imagine that's why it works out well. ___tracy_emit_zone_begin_alloc on the other hand emits a QueueType::ZoneBegin{,Callstack}Alloc instead, hence the limitation.

I use ___tracy_emit_zone_begin_alloc and ___tracy_alloc_srcloc_name in the first place is because:

___tracy_alloc_srcloc_name is the only way to supply string length information directly, this is not allowed by ___tracy_source_location_data; and ___tracy_emit_zone_begin_alloc is the only way to use this allocation.

Would it be okay, then, to introduce an additional C API that works with ___tracy_alloc_srcloc_name but produces QueueType::ZoneBegin{,Callstack} instead?

What about ___tracy_emit_zone_begin_srcloc for the function name?

wolfpld commented 3 years ago

You may be overthinking things here. What you need is basically a static struct ___tracy_source_location_data pointer, which will enable you to use ___tracy_emit_zone_begin. You can create a small wrapper function which will allocate the needed memory in whatever way needed, possibly accepting a string length parameter which you need, and produce the structure to be assigned to a static data pointer. This memory doesn't need to be freed (in fact it can't be freed).

This should work like you proposed in the top comment.

nagisa commented 3 years ago

Hm, so the reason why I ended up using ___tracy_alloc_srcloc_name in the first place, was to avoid having to care about my strings not being null terminated. You are indeed right I could use ___tracy_source_location_data after allocating data whatever way necessary on my own.

I actually never had a chance to read through the worker code, doing so now it seems that the srcloc related code is quite significantly more complicated and I probably wasn't saving much of anything by avoiding the allocation necessary to add that trailing null byte…