wolfpld / tracy

Frame profiler
https://tracy.nereid.pl/
Other
9.84k stars 655 forks source link

Consider change `LuaZoneState.counter` to `int32_t` #551

Open kkpattern opened 1 year ago

kkpattern commented 1 year ago

Hi, thanks for this great tool.

We are writing a tracy python binding. Python has a PyEval_SetProfile API to set a profiler function in the interpreter. Every time a function is called or returned, this profiler function will be called. We can use PyEval_SetProfile to add tracy instruments automatically, which makes tracy a perfect match with python.

Since the profiler function is stateless, we can not simply use the tracy C API because there is no easy way to pass the ctx variable without hurting the performance. Therefore, we learned from the lua binding to use the LuaZoneState.counter to correctly match the zone begin and zone end.

PyEval_SetProfile can be called dynamically when the python program is running. This allows us to toggle the tracy profile online, which makes it even better. There is one little problem: when the PyEval_SetProfile is called and tracy profile is activated, we may already inside some python function. Then we will get more PyTrace_RETURN than PyTrace_CALL , meaning we may get a negative LuaZoneState.counter value. So we will hit the assertion. Even if we use the release build, the tracy server will still complain the mismatch zone begin and end.

But we can actually recover from a negative counter, as long as we don’t send any data when the counter is negative. So if we change LuaZoneState.counter from uint32_t to int32_t , we can solve the problem like this:

void tracy_enter_call(int lineno, const char *filename, const char *name) {
#ifdef TRACY_ON_DEMAND
    const auto zone_count = GetLuaZoneState().counter++;
    if (zone_count > 0 && !GetLuaZoneState().active) {
        return;
    }
    GetLuaZoneState().active = GetProfiler().IsConnected();
    if (!GetLuaZoneState().active) {
        return;
    }
    if (zone_count < 0)
        return;
#endif
    const auto srcloc = Profiler::AllocSourceLocation(lineno, filename, name);

    TracyQueuePrepare(QueueType::ZoneBeginAllocSrcLoc);
    MemWrite(&item->zoneBegin.time, Profiler::GetTime());
    MemWrite(&item->zoneBegin.srcloc, srcloc);
    TracyQueueCommit(zoneBeginThread);
}

void tracy_leave_call() {
#ifdef TRACY_ON_DEMAND
    const auto zone_count = GetLuaZoneState().counter--;
    if (!GetLuaZoneState().active) {
        return;
    }
    if (!GetProfiler().IsConnected()) {
        GetLuaZoneState().active = false;
        return;
    }
    if (zone_count <= 0)
        return;
#endif
    TracyQueuePrepare(QueueType::ZoneEnd);
    MemWrite(&item->zoneEnd.time, Profiler::GetTime());
    TracyQueueCommit(zoneEndThread);
}

The above code works perfectly for us. If changing LuaZoneState.counter to int32_t is ok for tracy, we’d like to create a PR for this.

Again, thanks so much for this wonderful tool!

wolfpld commented 1 year ago

The original code (written back in 2018) makes little sense to me now. There even are places where it just can't work (e.g. increase the uint counter, and then check if it's > 0). It should probably be rewritten from scratch.