elastic / apm-server

https://www.elastic.co/guide/en/apm/guide/current/index.html

processor/otel: translate OpenTelemetry exceptions from .net #5795

Open marcusjsford opened 3 years ago

marcusjsford commented 3 years ago

APM Server version: apm-server:7.13.4
Elastic version: elasticsearch:7.13.4
Kibana version: kibana:7.13.4
Otel-Collector version: otelcontribcol v0.30.

When I send a span with an exception span event to apm-server via OTLP, apm-server correctly translates the span event into an error, but the stack trace does not show in the Kibana UI. The spans arrive in Elasticsearch and are visible in Kibana, and I can see the error document created in the apm-7.13.4-error-000001 index, but the .NET stack trace is missing from the UI.

Steps to reproduce:

  1. Run APM Server.
  2. Instrument a .NET application with OpenTelemetry and configure it with an OtlpExporter whose endpoint is the Otel-Collector.
  3. Configure the Otel-Collector to receive otlp and export to otlp/elastic.
  4. Simulate an application exception.
  5. Check that the exception is recorded as an error in Elasticsearch, and observe that the stack trace is missing from the span or error UI.

Otel-Collector Config

 ... 
 pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/elastic]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [logging, otlp/elastic]

Output from Otel-Collector

Span   #1
    Trace ID       : ce85f09822e57f438662c0f264fd5628
    Parent ID      : c7c38a5214f06e49
    ID             : 31244485eda65d44
    Name           : Sub Work
    Kind           : SPAN_KIND_INTERNAL
    Start time     : 2021-07-23 09:17:15.8080915 +0000 UTC
    End time       : 2021-07-23 09:17:15.8941822 +0000 UTC
    Status code    : STATUS_CODE_ERROR
    Status message :
Attributes:
     -> exception.type: STRING(ActivityTracker)
     -> exception.message: STRING(Message:SomethingBroke)
     -> exception.stacktrace: STRING(System.ArgumentNullException: Message:SomethingBroke
 ---> System.Exception: new-exception
   --- End of inner exception stack trace ---)
     -> telemetry.sdk.language: STRING(dotnet)
Events:
SpanEvent   #0
     -> Name: exception
     -> Timestamp: 2021-07-23 09:17:15.8930924 +0000 UTC
     -> DroppedAttributesCount: 0
     -> Attributes:
         -> exception.type: STRING(ActivityTracker)
         -> exception.message: STRING(Message:SomethingBroke)
         -> exception.stacktrace: STRING(System.ArgumentNullException: Message:SomethingBroke
 ---> System.Exception: new-exception
   --- End of inner exception stack trace ---)

Screenshots showing the expected error but missing the stacktrace


marcusjsford commented 3 years ago

I think this may need to be extended with a .NET stack trace parser. https://github.com/elastic/apm-server/blob/1ae881bb4c2358065fe0e33d43a77e9108fcea12/processor/otel/exceptions.go#L85-L91
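To make the suggestion concrete, here is a minimal sketch in Go of what parsing .NET frames could look like. It is not the apm-server implementation: the `Frame` type, the `parseDotnetStacktrace` function, and the sample trace are all hypothetical, and it assumes frames follow the usual `at Namespace.Type.Method(args) in File.cs:line N` layout. Note that the stack trace in the collector output above contains no `at` frames, so this parser would find nothing in it.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// Frame is a hypothetical, simplified stack frame for illustration only;
// it is not the model type used by apm-server.
type Frame struct {
	Type     string // declaring type, e.g. "MyApp.Worker"
	Method   string // method name, e.g. "DoWork"
	Filename string // empty when the trace carries no file information
	Line     int    // 0 when the line number is unknown
}

// dotnetFrameRE matches lines such as
//   "   at MyApp.Worker.DoWork(String name) in /src/Worker.cs:line 42"
// where the " in <file>:line <n>" suffix is optional.
var dotnetFrameRE = regexp.MustCompile(`^\s*at (.+?)\((.*?)\)(?: in (.+):line (\d+))?\s*$`)

// parseDotnetStacktrace extracts frames from a .NET stack trace string.
// Lines that are not frames (the exception header, "--->" inner exception
// lines, "--- End of inner exception stack trace ---") are skipped.
func parseDotnetStacktrace(s string) []Frame {
	var frames []Frame
	for _, line := range strings.Split(s, "\n") {
		m := dotnetFrameRE.FindStringSubmatch(strings.TrimRight(line, "\r"))
		if m == nil {
			continue
		}
		f := Frame{Method: m[1], Filename: m[3]}
		// Split "Namespace.Type.Method" at the last dot.
		if i := strings.LastIndex(m[1], "."); i > 0 {
			f.Type, f.Method = m[1][:i], m[1][i+1:]
		}
		if m[4] != "" {
			f.Line, _ = strconv.Atoi(m[4])
		}
		frames = append(frames, f)
	}
	return frames
}

func main() {
	// Hypothetical trace; the frames are invented for the example.
	trace := "System.ArgumentNullException: Message:SomethingBroke\n" +
		" ---> System.Exception: new-exception\n" +
		"   --- End of inner exception stack trace ---\n" +
		"   at MyApp.Worker.DoWork(String name) in /src/Worker.cs:line 42\n" +
		"   at MyApp.Program.Main()\n"
	for _, f := range parseDotnetStacktrace(trace) {
		fmt.Printf("%+v\n", f)
	}
}
```

A real parser would also need to handle cases this sketch ignores, such as constructors (`..ctor`), async state machine frames, and localized `at`/`in` keywords.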

cyrille-leclerc commented 3 years ago

Thanks Marcus for your investigation, we will look at it with the team

cdroulers commented 3 years ago

I might be way off since I'm just starting to use APM Server and OTLP in Elastic. I could live without the UI showing the stack trace in that window, but it seems the stack trace is not ingested at all: it does not appear in the metadata tab or in the document (found through the Discover tab). Am I missing something?

axw commented 3 years ago

@cdroulers it's by no means ideal, but the stacktrace should be visible when looking at the document in JSON format under Discover. Here's an example of an error with a stacktrace captured using opentelemetry-go:

[screenshot]

The server records the stacktrace as an exception attribute like this when it doesn't know how to parse it.
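The fallback described here can be sketched roughly as follows. This is an illustration under stated assumptions, not apm-server's code: `ExceptionDoc`, `buildException`, the `parsers` map, and the `"stacktrace"` attribute key are hypothetical names chosen for the example.

```go
package main

import "fmt"

// Frame and ExceptionDoc are hypothetical, simplified types for illustration;
// they do not mirror apm-server's internal model.
type Frame struct{ Function string }

type ExceptionDoc struct {
	Type       string
	Message    string
	Stacktrace []Frame           // populated only when the trace could be parsed
	Attributes map[string]string // carries the raw trace when it could not
}

// parser turns a raw stack trace string into frames, or returns nil on failure.
type parser func(raw string) []Frame

// buildException tries the parser registered for the SDK language. If there is
// none, or it yields no frames, the raw trace is kept as an attribute so that
// it is still stored with the document.
func buildException(lang, typ, msg, raw string, parsers map[string]parser) ExceptionDoc {
	doc := ExceptionDoc{Type: typ, Message: msg}
	if parse, ok := parsers[lang]; ok {
		if frames := parse(raw); len(frames) > 0 {
			doc.Stacktrace = frames
			return doc
		}
	}
	if raw != "" {
		// Assumed attribute key, for illustration only.
		doc.Attributes = map[string]string{"stacktrace": raw}
	}
	return doc
}

func main() {
	// Only a placeholder "java" parser is registered, so "dotnet" falls back.
	parsers := map[string]parser{
		"java": func(raw string) []Frame { return []Frame{{Function: "placeholder"}} },
	}
	doc := buildException("dotnet", "ActivityTracker", "Message:SomethingBroke",
		"System.ArgumentNullException: Message:SomethingBroke", parsers)
	fmt.Printf("frames=%d attributes=%v\n", len(doc.Stacktrace), doc.Attributes)
}
```

With this shape, a UI that renders only parsed frames would show nothing for such a document even though the raw text is stored, which matches the behaviour reported in this issue.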

cdroulers commented 3 years ago

@axw I ended up finding it a few hours later, thanks for the help. I'm running into a lot of small annoyances like this, plus the complexity of allowing fields longer than 1024 characters (including SQL statements!). Still reading the docs.

Thanks for the reply!

a-vivona commented 2 years ago

@axw I can confirm that the .NET stack trace is indeed visible on the Discover page. [screenshot]

While waiting for your team to find a solution, it might be worth showing a message other than "No stack trace available" in this case, since a stack trace has in fact been transmitted.

Perhaps indicate that only stack traces in Java format can currently be displayed.

This would save other developers from searching needlessly for the cause of the problem.

axw commented 2 years ago

While waiting for your team to find a solution, it might be worth showing a message other than "No stack trace available" in this case, since a stack trace has in fact been transmitted.

@a-vivona thanks, I agree that would be a good idea. I'll discuss this with the APM UI team to see what we can do. We may be able to display the original, unparsed, stack trace in this case.