Open aktau opened 1 week ago
(I don't know if this needs to be a proposal. It's not really a noteworthy feature, as it's an optional addition and only concerns very few users of a sub-tool, please adjust accordingly.)
In the new trace format, we do usually store absolute (monotonic) timestamps. The start of each batch has the full absolute timestamp. Each event within the batch contains the time delta from the previous event.
The exception is Windows, where time has low resolution, so in that case we use CPU ticks (RDTSC).
cc @felixge
That's nice. I don't know much about catapult, but could it be made to display things in absolute time?
In theory, I believe this field just needs to be in absolute time: https://cs.opensource.google/go/go/+/master:src/internal/trace/traceviewer/format/format.go;l=25;drc=70f4717e96cf9d8ca8a5f439d7d6487ec822ce49;bpv=1;bpt=1
The current values tend to just come from computations like https://cs.opensource.google/go/go/+/master:src/cmd/trace/goroutinegen.go;l=137;drc=70f4717e96cf9d8ca8a5f439d7d6487ec822ce49;bpv=1;bpt=1, which currently are event.Time - startTime
.
The format says the time should be in microseconds. I don't know how nicely Catapult will display the values, given they will probably be huge.
That's just reading AFAICT. I spent some time looking for writing and found traceClockNow
: https://github.com/golang/go/blob/f2d118fd5f7e872804a5825ce29797f81a28b0fa/src/runtime/tracetime.go#L57-L62
nanotime is (on Linux) clock_gettime(CLOCK_MONOTONIC)
. Which is (quoting man7) time since boot:
CLOCK_MONOTONIC
A nonsettable system-wide clock that represents monotonic time
since—as described by POSIX—"some unspecified point in the
past". On Linux, that point corresponds to the number of sec‐
onds that the system has been running since it was booted.
The CLOCK_MONOTONIC clock is not affected by discontinuous
jumps in the system time (e.g., if the system administrator
manually changes the clock), but is affected by the incremen‐
tal adjustments performed by [adjtime(3)](https://man7.org/linux/man-pages/man3/adjtime.3.html) and NTP. This clock
does not count time that the system is suspended. All
CLOCK_MONOTONIC variants guarantee that the time returned by
consecutive calls will not go backwards, but successive calls
may—depending on the architecture—return identical (not-
increased) time values.
The absolute value can thus be calculated if we knew the time since boot. But I haven't been able to find that anywhere in a trace. Did I miss it?
The absolute value can thus be calculated if we knew the time since boot. But I haven't been able to find that anywhere in a trace. Did I miss it?
I think I am confused by what you mean by "absolute time". I thought that meant you wanted the CLOCK_MONOTONIC value, which we have. Do you actually mean the CLOCK_REALTIME value (wall time)? We don't record that. I don't think that would be as useful for correlation with other trace events on the machine, since the real time clock can change arbitrarily.
Do you actually mean the CLOCK_REALTIME value (wall time)? We don't record that. I don't think that would be as useful for correlation with other trace events on the machine, since the real time clock can change arbitrarily.
For correlation with other trace events on the machine, CLOCK_MONOTONIC
is fine (ideal even), if one has these events available from the same epoch (boot time in Linux's case).
In my case, I don't always:
clock_gettime(CLOCK_MONOTONIC)
somewhere, I don't have a way to line up this timestamp with trace timestamps.Specifically, I get a crash which has exported a trace around crash time using the flight recorder. The monitoring service tells me the crash timestamp is 01:03:53 21-10-2024 UTC
. I'd like to know when the trace started, and until what time before the crash it actually ran. I'd like the "when" to be absolute here, not some machine-relative timestamp.
It seems to me like this could be solved by (also) adding a timestamp relative to the UNIX epoch in every generation event. Then, tooling (go tool trace
or something custom) could render all events in either a relative timeline (current default) or an absolute timeline.
That, or I'm missing something that currently exists, or a different way to correlate the data I want. Any ideas?
That makes sense. There is a caveat that it will probably be difficult to precisely line up events using wall time, but I agree it would be useful even just for getting a vague sense of where within logs, etc the trace occurs (within a few hundred ms).
I think including a real time timestamp in each generation would be reasonable. I think that would be sufficient. If the real time clock changes during tracing then computed timestamps may be wrong, but that should be rare.
(cc @mknyszek for thoughts on including a real time timestamp in each generation)
There is a caveat that it will probably be difficult to precisely line up events using wall time, but I agree it would be useful even just for getting a vague sense of where within logs, etc the trace occurs (within a few hundred ms).
Up to a few hundred ms is much better than what's currently available. It would already help me. I wonder how easy it is to get the equivalent of time.Now().UnixNano()
in the runtime...
runtime.walltime
We could just add this as a new event type that always lives in the same batch as EvFrequency
, which currently always lives in its own batch. Old toolchains won't understand it and fail, but new toolchains will still be able to understand old traces without having to interpret anything differently.
With respect to exposing this in the trace API, the easiest thing to do would be to make it a property of the Sync
event type; we could then also emit a Sync
event at the start of the trace so we have a place to put the information for the first generation. Sync
events always precede a generation, so you can always count on having the value you need to interpret all timestamps in a given generation.
I don't think this needs to be a proposal. The trace parsing API is still experimental.
We could just add this as a new event type that always lives in the same batch as EvFrequency... With respect to exposing this in the trace API, the easiest thing to do would be to make it a property of the Sync event type.
I don't know much about this, but this sounds contradictory, unless internal trace events and trace API events are fully decoupled. In which case, SGTM.
We could just add this as a new event type that always lives in the same batch as EvFrequency... With respect to exposing this in the trace API, the easiest thing to do would be to make it a property of the Sync event type.
I don't know much about this, but this sounds contradictory, unless internal trace events and trace API events are fully decoupled. In which case, SGTM.
Sorry, yes. They are completely decoupled. The Sync
event type is an event type in the parsing API, the new event type would be part of the wire format.
Proposal Details
Runtime traces don't contain (or display) an absolute timestamp for events. This would be useful for comparing traces with other (tracing) tools, logs et cetera.
An example would be combining a Go runtime trace with perf sched events. This would allow better investigation of kernel scheduling events and how they impact the latency of goroutines.
IIRC, runtime traces currently use a relative (monotonic) timestamps for performance reasons. It's fine to keep this as-is. But it would be helpful if (e.g.) the start of a trace batch could contain an absolute system time (perhaps something lie nanoseconds since the UNIX epoch) together with the relative timestamp it was derived from. Tools like
go tool trace
could then adjust all the relative timestamps post-hoc, or at least display the trace start and end time.cc @mknyszek