open-telemetry / opentelemetry-erlang-contrib

OpenTelemetry instrumentation for Erlang & Elixir
https://opentelemetry.io
Apache License 2.0

Difficulties in Exception reporting migrating from Spandex to OTel #306

Open apreifsteck opened 8 months ago

apreifsteck commented 8 months ago

Describe the bug

I recently experimented with adding OpenTelemetry to one of the apps my team owns. It's been over a week since I made the swap, and we've had a few exceptions. However, they all show up in Datadog with no stacktrace info and this message: exit:{{#{'__exception__' => true,...},[...]},{'Elixir.MyAppWeb.Endpoint',...}}.

It looks like the exceptions are coming in as events, which, as far as I know, is compliant with the OTel spec.

[image]

It seems like Datadog doesn't handle errors this way and instead expects them to be included as span attributes (sorry about the redaction; the point is that there is a message and a full stacktrace there).

[image]

It looks like I might need to add a span processor to add that span attribute. Perhaps this is more of a compatibility issue than anything else. Regardless, if this issue results in a bridge library or a migration guide to follow for Spandex, that would be much appreciated. Of course, I'd be happy to collaborate in any way I can!
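To make the idea concrete, here is a rough sketch of the mapping such a processor would have to perform. It assumes span events are available as plain maps with `:name` and `:attributes` keys (in the real SDK they are Erlang records), and that Datadog picks up `error.type`, `error.message`, and `error.stack` when they are present as span attributes. The module name is hypothetical, and how to hook this into a span processor or exporter is left open.

```elixir
defmodule ExceptionEventToDatadogTags do
  @moduledoc """
  Hypothetical helper, not part of any OpenTelemetry or Datadog library.
  Translates the attributes of an OTel "exception" span event into the
  error.* span attributes that Datadog is assumed to read.
  """

  # OTel semantic-convention key => Datadog span tag
  @mapping %{
    "exception.type" => "error.type",
    "exception.message" => "error.message",
    "exception.stacktrace" => "error.stack"
  }

  @doc "Returns the error.* attributes derived from a list of span events."
  def error_attributes(events) when is_list(events) do
    # Assumption: each event is a map with :name and :attributes keys.
    case Enum.find(events, &(&1.name == "exception")) do
      nil -> %{}
      %{attributes: attrs} -> translate(attrs)
    end
  end

  # Copies only the keys that are actually present on the event.
  defp translate(attrs) do
    for {otel_key, dd_key} <- @mapping,
        Map.has_key?(attrs, otel_key),
        into: %{} do
      {dd_key, Map.fetch!(attrs, otel_key)}
    end
  end
end

# Example: an event without exception.message yields no error.message key.
events = [
  %{
    name: "exception",
    attributes: %{
      "exception.type" => "Elixir.RuntimeError",
      "exception.stacktrace" => "(stacktrace text)"
    }
  }
]

ExceptionEventToDatadogTags.error_attributes(events)
```

Because only keys present on the event are copied, an event that lacks `exception.message` would still produce a span without `error.message`, so the missing message would need to be fixed at the source.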

Expected behavior

I had expected something like what Spandex presents.

[image]

Also, it seems like the exception.message attribute is missing on the event.


apreifsteck commented 8 months ago

After some further experimentation, it looks like error reporting works pretty well if the exception occurs inside a manually instrumented trace, but not so well if it bubbles up to Cowboy.

On a whim, I was going to try switching to Bandit to see what that did. It looks like the Phoenix Telemetry package on Hex (1.2) doesn't support that yet, although there was a PR not too long ago that added this: https://github.com/open-telemetry/opentelemetry-erlang-contrib/pull/249. Is a new release for Phoenix Telemetry coming soon, by any chance?

grzuy commented 2 weeks ago

I suspect the following recent changes should have fixed this issue:

https://github.com/open-telemetry/opentelemetry-erlang-contrib/pull/359/files#diff-4c2fd05f88775967cc821a019b047da19e9ab6db7a8a88a82fa6b3db5350b7bcR391-R422

I think it's released in opentelemetry_cowboy v1.0.0-rc.1.

@apreifsteck can you confirm?