Closed ELLIOTTCABLE closed 1 year ago
This actually has non-obvious and subtle performance consequences — we've been talking about this internally at @Ahrefs for a bit. To quote @mfp,
tldr
- Lwt TLS works by associating a heterogeneous map to promises
- If you naïvely spawn async threads with
ignore (make_some_promise ())
or equivalent (which we do in multiple places, and is wrong for multiple reasons btw.), you're saving the whole map. Lwt has no way to know which keys will be used in subsequent lookups.- This can lead to leaks if you have long-running async threads that are launched in the middle of a promise with a heavy
Lwt.with_value
map.
Basically, it makes "using Lwt wrong in a mostly-harmless way, suddenly harmful." And 'it' in this case is a supposedly-transparent debugging and tracing library. Ew. I think we'll continue with this approach internally, but we have a more total control than an open-source library can have …
… unfortunately, not making this change is also problematic. (Actual)threads are complicated in a Lwt context … I'm not sure how to proceed.
If we merge this, it should be with some significant documentation, educating users up-front about how to be more principled and careful with their promises … idk. Thoughts?
One thing that might minimize the damage is if we can get away with only the span_id
, and ditching the assumption of access to any other metadata about the surrounding
context, possibly? At least in that case, any pathological cases are only leaking, idk, 8 bytes plus a little?
Or we can dislocate the rest of the metadata Scope.t
into a process-or-thread global map, I suppose?
type t = {
trace_id: Trace_id.t;
span_id: Span_id.t;
mutable events: Event.t list;
mutable attrs: key_value list;
}
Ah, so it's really a global map, I see. I thought it might have been stored in promises, like cancellation state, but this means this is a bit too tricky. Perhaps you could use an ahrefs overlay to experiment with this for a while?
This stores the implicit 'global' surrounding-scope off the callstack in Lwt's "sequence-associated storage" instead of our own Thread_local, as this makes more sense in the context of green threads.
I tried to match the code-style, let me know if you have any tweaks.