wolfpld / tracy

Frame profiler
https://tracy.nereid.pl/
Other
10.25k stars 692 forks source link

Frame and zone times desync/rollover after 1d #846

Open kkartaltepe opened 4 months ago

kkartaltepe commented 4 months ago

Setting up a long running process with on-demand tracing and it appears that timestamps rollover or the profiler viewer otherwise misrenders timestamps in traces longer than 1d or so.

I'm using tracy from b5b985d1d09e2c757338a51f8a82115626bf81ab on linux with ON_DEMAND and with the clock source replaced with CLOCK_MONOTONIC.

When connecting to the target it defaults to the frame markers which appear to be correctly synced at +1d since process start, but when loading up the sample trace it will sync to the first zones at -1d which seems rolled over/desynced from the frame markers. I attached the trace (sorry its just named .zip to appease github).

traces made earlier in the program life are normal, zones and frame markers are synced.

problematic_trace.zip

wolfpld commented 4 months ago

Hah! I've been thinking about this literally yesterday and I was wondering if this would be a problem.

The documented time limit of a trace is 2^47 ns, or about 1.6 days. The timestamps sent over the network are measured in the TSC register increments. On my system the conversion from TSC to ns is performed by multiplying the TSC reading by 0.22. With CLOCK_MONOTONIC the measurement is done in ns, so the impact is less significant, as the timer advances more slowly.

Timestamps are stored as i64, with a "random" start offset and there is no overflow handling anywhere. This is why you are getting negative values.

This may be easy to fix if there is an uninterrupted flow of events, as the overflow is easy to detect and remediate. In the on-demand case it is more problematic, as there is nothing that would watch the timestamp when no connection is present, so any overflows in a long-running scenario would be lost.

kkartaltepe commented 4 months ago

The documented time limit of a trace is 2^47 ns, or about 1.6 days.

I figured. To give some more flavor of what I'm doing: I have been instrumenting my compositor to try and probe issues after it has been running for a while or just ad-hoc if I notice something odd. So the program is running all the time but any given trace recording would be pretty short often just a few seconds.

And my dream would be just a blackbox mode where the program could trigger a dump of the last X frames/X seconds/X MB of tracing information for later analysis when something interesting like a missed frame timing happens.

wolfpld commented 4 months ago

And my dream would be just a blackbox mode where the program could trigger a dump of the last X frames/X seconds/X MB of tracing information for later analysis when something interesting like a missed frame timing happens.

The capture utility supports both time and memory limits, so you could write a script that does, say, a minute-long trace in a loop, and then you could just leave the last 10 traces performed, so you would get a 10-minute history of what your program was doing.