open-telemetry / opentelemetry.io

The OpenTelemetry website and documentation
https://opentelemetry.io
Creative Commons Attribution 4.0 International
563 stars 1.21k forks source link

Getting Started for erlang/elixir doesn't work as described: no span output for request without setting up span explicitly #5289

Open tomwang57 opened 1 month ago

tomwang57 commented 1 month ago

I followed the guide in https://opentelemetry.io/docs/languages/erlang/getting-started/ to setup a phoenix project:

https://github.com/tomwang57/opentelemetry_test_elixir

When i didn't explicitly setup a span with Tracer.with_span, there was no span output. It was against what the guild said that there should be span output for every request by default.

svrnm commented 1 month ago

@tomwang57 thanks for reporting

@open-telemetry/erlang-approvers can you take a look?

ferd commented 1 month ago

Yeah I had a similar issue just a couple days ago when revising how one of my apps sets up its tracing, where with_span isn't an acceptable macro because of how my workflow is set up.

Whenever I used start_span and end_span together, I ended up not getting output. Looking into the ETS tables of my node however, I could see that the spans were started, just never ended.

As it turns out, I think the default macro'd end_span implementation requires context to be set up more thoroughly otherwise it doesn't know what to end.

I ended up having to set up my own wrappers that specifically replicate what with_span does:

  1. start the span (and track the span context it returns)
  2. call the tracer to set the span from 1. as the active span
  3. attach the context to get a token

Step 2 is the one that is critical to get the span to close on its own when calling end_span. However with only this one extra step, all future spans will be considered children of the span that was just closed rather than belonging to the parent previous to it that might still be open. So the third step and the following 4..6 steps are required for the span termination:

  1. reusing the span context from 1. and passing it to end_span's non-macro version
  2. setting the current context again
  3. detaching from the token

This makes it behave more in the way you'd expect and be in line with with_span.

I know I've discussed these issues privately with @tsloughter recently, I don't know if he has better ideas of what to do about it rather than just keep that complexity in place or always advocate for with_span

bryannaegele commented 1 month ago

Python created a convenience function of start_as_current_span which we could look at adding. I'd rather we use any existing conventions from other langs where there's not something currently in the spec. I haven't done a survey of all languages, just happen to currently be working in some python code.

tsloughter commented 1 month ago

We can't have a start_as_current_span like Python. It is a context manager.

Exiting the context manager will call the span’s end method, as well as return the current span to its previous value by returning to the previous context.

So we can only offer with_span for anything that deals with contexts.

tsloughter commented 1 month ago

And actually, I think @tomwang57 is only referring to the Phoenix instrumentation not working correctly, right?

bryannaegele commented 1 month ago

I think he means "if I replace with_span with start and end".

For the start_as_current_span I mean to do that as a macro like Fred's wrappers but try to match naming in other langs if there's some consensus out there.

tomwang57 commented 1 month ago

And actually, I think @tomwang57 is only referring to the Phoenix instrumentation not working correctly, right?

Yes, I've no idea about what others said. My concern is there should be default span info(root span) output for each request. And with_span will add other sub-spans.

bryannaegele commented 1 month ago

The example linked repo is using bandit, not cowboy, so that is expected behavior right now. Please set your project to use cowboy.

If that's a fork of the current example, that is an error in there

tomwang57 commented 1 month ago

The example linked repo is using bandit, not cowboy, so that is expected behavior right now. Please set your project to use cowboy.

If that's a fork of the current example, that is an error in there

So is there any counterpart option for bandit like opentelemetry_bandit?

svrnm commented 1 month ago

@tomwang57 any reason why you closed this issue? This does not seem to be resolved?

tomwang57 commented 1 month ago

@tomwang57 any reason why you closed this issue? This does not seem to be resolved?

I used bandit, not cowboy. So as @bryannaegele said(https://github.com/open-telemetry/opentelemetry.io/issues/5289#issuecomment-2390357400), it's expected behavior.

svrnm commented 1 month ago

Ah thanks for clarification! Is it worth to add a note to the page such that others will not face the same issue?