Closed jasontt closed 3 months ago
A couple of contextual observations:
mem::forget(span)
in Rust, so we should check that nothing explodes if that's done;Although mem::forget(span)
is screaming "something is weird here", it is still possible to do something with our current scoped Span
s that breaks tracy's instrumentation well-formedness invariants:
let span1 = span!("hello");
let span2 = span!("world");
drop(span1);
drop(span2);
We should check that nothing explodes in this case either.
Currently the TRACY_NO_VERIFY flag is being defaulted to false.
That means verification remains enabled by default, correct? How does one go about disabling verification (there's no feature flag to enable/disable?)
Let me get back to you on your first point(s) - it seems initially hard to guarantee safety here, it raises a couple of questions to me:
Your final question is easy to answer, yes verification remains enabled by default (more precisely because the flag is not set). To disable verification you set the TRACY_NO_VERIFY
flag which is the change I made in build.rs
.
I don't think we can do much about guaranteeing well-formed use of tracy, and furthermore tracing-tracy
currently relies on the currently imperfect API to bridge between the two ecosystems where tracing
is more flexible in what it allows than tracy
.
Your final question is easy to answer, yes verification remains enabled by default (more precisely because the flag is not set). To disable verification you set the TRACY_NO_VERIFY flag which is the change I made in build.rs.
My question was leading somewhat. Once this change lands we'd compile with -DTRACY_NO_VERIFY
unconditionally. There isn't a mechanism for users to adjust this define if they needed to. IMO we should add something like we have for all other defines. Remember -- tracy-client-sys
is not necessarily only for tracy-client
to use, the raw C APIs are also as public interface for this project as are the wrappers from the higher-level crate.
Okay so a few thoughts to add here and a minor change to the commit itself:
First off I've put -DTRACY_NO_VERIFY
behind a flag, I've left it defaulted to false (running verification) for now, although I think there's a pretty good argument to be made for it to be defaulted true.
My thinking here is that this is Tracy you use this profiler to be as low overhead as possible and having a bunch of verification overhead is surprising. That said, I don't like that this isn't strictly sound, but I don't think there's much we can do there without exposing our own C API or converting back to Tracy in some manner.
Digging into their source a bit this problem doesn't actually seem to be constrained to Rust, we just handle it better by using their safer C API. It's perfectly possible for a C++ destructor to not get run and this exact same issue will arise, I'm actually mildly surprised they don't do something smart in their capture process here as there has clearly been some thought put into the catching of signals so this mode of failure has likely been considered...
Asynchronously of this change I have been investigating other modes of recording Spans safely in native Rust that can be consumed by Tracy. I do intend on trying to get this merged but I will likely be devoting more of my time to this effort as while this change has brought my measured instrumentation overhead down from 60ns -> 30ns on my machine, experimentally ~14.5ns seems to be achievable (two rtdsc calls + a mov/vectorised mov), the remaining overhead may be the lack of inlining of the C API or just Tracy itself but that's another matter.
I've inverted no-verify to verify, defaulted it to off and adjusted the docs. In hindsight many of Tracy's NO features are already inverted so this does make more sense.
Does setting TRACY_NO_VERIFY
make tracy-client's API unsound?
That's a good question. I was assuming that it was only "undefined behaviour" with regards to how the profiles will be interpreted by the profiler, not undefined behaviour proper seeing as its about failure to clean up the resources proper. If its the latter, then we indeed need to be quite a bit more wary of exposing it or enabling it by default.
"can cause undefined behavior" very clearly communicates "unsound" to me. A resource leak and/or garbage trace data is a much lesser issue than UB.
Tracy's manual confirms my understanding. And I indeed believe we don't need to have validation turned on at all times. The workflow I imagine here is going to be enabling verification once you do get a faulty profile to investigate where the fault is exactly.
Tracy's manual confirms my understanding
Then the README changes merged in this PR are incorrect. The README now says "can cause undefined behavior", but your claim is that they can only cause inconsistent profiling data.
I took care of adjusting the documentation before the release already.
Currently the
TRACY_NO_VERIFY
flag is being defaulted tofalse
.As per the Tracy Profiler User Manual
We are handling the scope lifetimes in rust through the
Span
struct thus I don't think we need this extra protection/overhead and importantly if you check the source code inTracyProfiler.cpp
some of this work is done after getting the time from the profiler. e.g.With a one line change to the build.rs this halves the time taken per iteration in the
span_callstack/0
benchmark on my machine and subtracts ~20ns from the timed portion of real spans I'm using in production code. "span_callstack/0" before:"span_callstack/0" after: