davidbarsky / tracing-tree

Apache License 2.0
123 stars 31 forks source link

feat: detect recursive calls that can deadlock `HierarchicalLayer` #82

Closed mladedav closed 3 months ago

mladedav commented 3 months ago

This should address https://github.com/tokio-rs/tracing/issues/2635

If a tracing::event! is created while inside on_event, on_new_span, or on_close, the code can deadlock because of the bufs mutex.

This patch ensures that there are no recursive calls when locking is used by using a thread local atomic boolean which signals whether the call is recursive or not.

This is a very specific feature accommodating usage in rustc so it might be hidden behind a feature flag or some configuration?

oli-obk commented 3 months ago

lgtm. Tho it could use a test. lmk if you don't have time to add it, I'll give it a look then.

mladedav commented 3 months ago

I don't know how to test this. I would either have to make something that's already used log something. I didn't see anything like that when skimming it again. Otherwise, I'd need rustc make the internal write! macros to emit something like they do in the issues, but I assume that cannot be done with the distributed library and compiling stdlib as part of tests here seems a bit too expensive.

Alternatively, I could add a random #[cfg(test)] tracing::error!("Ignore me"); somewhere inside the on_event method, but that didn't seem appropriate either.

So I'd be happy to add tests, but I'm not sure how to go about that here.

oli-obk commented 3 months ago

Hmm... would it work if we used a custom tracing destination? (i.e. not the stderr default, but a custom writer, that will then just emit a tracing event once)

mladedav commented 3 months ago

You're right, that should work, I did not think of that. I'll try it out later today.

davidbarsky commented 3 months ago

we really should get tracing-mock out, which would be a great for this.

mladedav commented 3 months ago

I've added the test. I know usually tracing tests use with_default instead of things like set_global_default, but in this case that actually circumvents the bug. tracing internally calls get_default which returns NoSubscriber for nested calls if it is inside a scope with custom subscriber. So this had to go into its own integration test to not interfere with other stuff.

oli-obk commented 3 months ago

Very nice! Thanks!

mladedav commented 3 months ago

Could you also make a release when you get some time? It should help the people in the linked tracing issue.

oli-obk commented 3 months ago

done