hyperium / h2

HTTP 2.0 client & server implementation for Rust.
MIT License
1.34k stars 266 forks source link

Use span without a parent for connection #688

Open mladedav opened 1 year ago

mladedav commented 1 year ago

We have found out in our project that sometimes when calling a gRPC service, a connection is created during that call and that connection is pooled for later use. However, when the connection was made a span was created which used the current span as a parent.

This leads to our traces that should be a few milliseconds long to include the whole lifetime of the connection which times out after 90 seconds or 3 minutes.

This change makes it so that the span is a root span.

I have asked in the tracing-users Discord about this and got this reply so I hope this is ok.

I've also thought about adding an optional span to be used as a parent into the Config struct but it does not seem useful enough to warrant a breaking change there, but if anyone thinks that it should be there, I will do that. I have just checked that it's used from within Handshake which creates its own trace_span so the propagation could touch a few more things.

seanmonstar commented 1 year ago

@hawkw can you take a look and provide guidance?

mladedav commented 1 year ago

@hawkw Any thoughts on this?

hawkw commented 1 year ago

Sorry, I was traveling for the last couple weeks. I'll take a look!

mladedav commented 5 months ago

Hi, any thoughts on this?

seanmonstar commented 5 months ago

Looking at this again, I'm thinking that just as someone might want there to be no parent, other people could very well want it to have one. Probably, most people would expect it to have the default in tracing.

So, then I suppose that a way to make it without a parent would be a level up: whatever thing creates or spawns the task could down with the parent set to none. That would then be up to the user. What do you think?

mladedav commented 5 months ago

Yes, I think you're right. This should be where h2 streams are created or even higher up. I'll try to look into this once more and find a better place and close this.

Thank you for the time.

tadovas commented 1 month ago

Hey. We are also observing same problem. Especially when using reqwest client (which uses hyper and h2 eventually under the hood). The first request span which creates h2 connection is later being used through all connection life time and really messes span times (visible very nicely in jaeger and similar tools). Suggestion to fix it one level up will end up in hyper handshake - by creating an span without parent and instrumenting hanshake future - however since hyper tracing is still in unstable, I am not even sure it will be accepted.

IMHO creating spans in connection with whatever the current parent is - would be a better approach and will be more consistent: in single task envs it will still be the same parent which created a connection. In hyper case - hyper can decide that connection is being moved to separate task and create a span without parent accordingly.