open-telemetry / opentelemetry-collector


jaeger exporter missing critical data #1160

Closed nicks closed 4 years ago

nicks commented 4 years ago

Describe the bug

I'm playing around with setting up OpenTelemetry with Jaeger for the first time, and Jaeger is spewing errors. They look like this:

{"level":"error","ts":1592618224.399752,"caller":"app/span_processor.go:138","msg":"process is empty for the span","stacktrace":"github.com/jaegertracing/jaeger/cmd/collector/app.(spanProcessor).saveSpan\n\tgithub.com/jaegertracing/jaeger/cmd/collector/app/span_processor.go:138\ngithub.com/jaegertracing/jaeger/cmd/collector/app.ChainedProcessSpan.func1\n\tgithub.com/jaegertracing/jaeger/cmd/collector/app/model_consumer.go:35\ngithub.com/jaegertracing/jaeger/cmd/collector/app.(spanProcessor).processItemFromQueue\n\tgithub.com/jaegertracing/jaeger/cmd/collector/app/span_processor.go:175\ngithub.com/jaegertracing/jaeger/cmd/collector/app.NewSpanProcessor.func1\n\tgithub.com/jaegertracing/jaeger/cmd/collector/app/span_processor.go:75\ngithub.com/jaegertracing/jaeger/pkg/queue.(*BoundedQueue).StartConsumers.func1\n\tgithub.com/jaegertracing/jaeger/pkg/queue/bounded_queue.go:77"}

Steps to reproduce

I'm using:

I have a pretty good isolated repro that I can create a repo for, if that would be helpful. It basically just uses the off-the-shelf Docker containers for opentelemetry-collector and Jaeger, plus the sample code for opentelemetry-go.

What did you expect to see? No error.

What did you see instead? The error above.

What version did you use?

docker run --rm --env JAEGER_REPORTER_LOG_SPANS=true --network=host -v "${PWD}/otel-config.yaml":/otel-config.yaml --name otelcol otel/opentelemetry-collector --config otel-config.yaml

What config did you use?

extensions:
  health_check:
  pprof:
    endpoint: 0.0.0.0:1777
  zpages:
    endpoint: 0.0.0.0:55679

receivers:
  otlp:
    endpoint: "0.0.0.0:55680"

processors:
  batch:
  queued_retry:

exporters:
  logging:
    loglevel: debug
  jaeger:
    endpoint: "localhost:14250"
    insecure: true

service:
  extensions: [health_check, pprof, zpages]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch, queued_retry]
      exporters: [jaeger, logging]

Environment: Docker

Additional context

I also filed this bug upstream against Jaeger (https://github.com/jaegertracing/jaeger/issues/2300). They believe that OpenTelemetry is missing critical data here.

tigrannajaryan commented 4 years ago

There is no span_processor.go in this repo; it appears to be in the Jaeger Collector's code. Are you using the Jaeger Collector? If so, the investigation is best started there, since the errors are emitted by code that is not in this repo.

If the investigation leads to the code in this repo we can look into it.

nicks commented 4 years ago

@tigrannajaryan good question! We're using the jaeger exporter. The error message (span_processor.go) is coming from Jaeger's code.

When I talked to the Jaeger people about it, they said that the problem is that opentelemetry-collector is not sending the right data; see the discussion here: https://github.com/jaegertracing/jaeger/issues/2300

nicks commented 4 years ago

(To be clear, I do not have enough expertise to know whether this is a Jaeger bug or an OpenTelemetry bug; I'm just trying to make sure that both sides have enough information to figure out the problem!)

bogdandrutu commented 4 years ago

@pavolloffay please help us identify if this is on our side or the Jaeger side :)

pavolloffay commented 4 years ago

I will have a look. At first glance it looks like a data translation issue.

pavolloffay commented 4 years ago

I wasn't able to reproduce this with https://github.com/open-telemetry/opentelemetry-go/blob/master/example/basic/main.go#L68 and

    import (
        "go.opentelemetry.io/otel/exporters/otlp"
        "google.golang.org/grpc"
    )

    exp, err := otlp.NewExporter(
        otlp.WithInsecure(),
        otlp.WithAddress("localhost:55680"),
        otlp.WithGRPCDialOption(grpc.WithBlock()), // useful for testing
    )

There wasn't any error log in Jaeger. However, the service name was set to an empty string, which does not work properly with the Jaeger UI.
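
For reference, a minimal sketch of setting the service name on the SDK side so the Jaeger UI can group spans under a service. This assumes a recent opentelemetry-go SDK (the constructors differed at the time of this issue), and the service name here is made up:

    package main

    import (
        "go.opentelemetry.io/otel/attribute"
        "go.opentelemetry.io/otel/sdk/resource"
        sdktrace "go.opentelemetry.io/otel/sdk/trace"
    )

    // newTracerProvider builds a tracer provider whose resource carries a
    // service.name attribute; the collector's jaeger exporter maps this to
    // the Jaeger process service name.
    func newTracerProvider() *sdktrace.TracerProvider {
        res := resource.NewWithAttributes(
            "", // schema URL, left empty in this sketch
            attribute.String("service.name", "my-demo-service"), // hypothetical service name
        )
        return sdktrace.NewTracerProvider(sdktrace.WithResource(res))
    }

The returned provider can then be registered with otel.SetTracerProvider and used together with the OTLP exporter shown above.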

pavolloffay commented 4 years ago

I got the error; it happens when the process object in a span is nil.

{"level":"error","ts":1594720026.0424979,"caller":"app/span_processor.go:137","msg":"process is empty for the span","stacktrace":"github.com/jaegertracing/jaeger/cmd/collector/app.(*spanProcessor).saveSpan\n\t/home/ploffay/projects/jaegertracing/jaeger/cmd/collector/app/span_processor.go:137\ngithub.com/jaegertracing/jaeger/cmd/collector/app.ChainedProcessSpan.func1\n\t/home/ploffay/projects/jaegertracing/jaeger/cmd/collector/app/model_consumer.go:35\ngithub.com/jaegertracing/jaeger/cmd/collector/app.(*spanProcessor).processItemFromQueue\n\t/home/ploffay/projects/jaegertracing/jaeger/cmd/collector/app/span_processor.go:174\ngithub.com/jaegertracing/jaeger/cmd/collector/app.NewSpanProcessor.func1\n\t/home/ploffay/projects/jaegertracing/jaeger/cmd/collector/app/span_processor.go:74\ngithub.com/jaegertracing/jaeger/pkg/queue.(*BoundedQueue).StartConsumers.func1\n\t/home/ploffay/projects/jaegertracing/jaeger/pkg/queue/bounded_queue.go:77"}
{"level":"error","ts":1594720026.043216,"caller":"app/span_processor.go:137","msg":"process is empty for the span","stacktrace":"github.com/jaegertracing/jaeger/cmd/collector/app.(*spanProcessor).saveSpan\n\t/home/ploffay/projects/jaegertracing/jaeger/cmd/collector/app/span_processor.go:137\ngithub.com/jaegertracing/jaeger/cmd/collector/app.ChainedProcessSpan.func1\n\t/home/ploffay/projects/jaegertracing/jaeger/cmd/collector/app/model_consumer.go:35\ngithub.com/jaegertracing/jaeger/cmd/collector/app.(*spanProcessor).processItemFromQueue\n\t/home/ploffay/projects/jaegertracing/jaeger/cmd/collector/app/span_processor.go:174\ngithub.com/jaegertracing/jaeger/cmd/collector/app.NewSpanProcessor.func1\n\t/home/ploffay/projects/jaegertracing/jaeger/cmd/collector/app/span_processor.go:74\ngithub.com/jaegertracing/jaeger/pkg/queue.(*BoundedQueue).StartConsumers.func1\n\t/home/ploffay/projects/jaegertracing/jaeger/pkg/queue/bounded_queue.go:77"}

This has been fixed in https://github.com/open-telemetry/opentelemetry-collector/pull/1222.

This issue can be closed.
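
For context, the Jaeger data model requires every span to carry a non-nil Process. The sketch below only illustrates the kind of guard the translation needs, assuming the github.com/jaegertracing/jaeger/model package; it is not the actual change from #1222:

    package translator

    import "github.com/jaegertracing/jaeger/model"

    // ensureProcess guarantees that a translated span always has a Process,
    // falling back to a placeholder service name when the OTLP resource is
    // empty. Illustrative sketch only, not the collector's actual fix.
    func ensureProcess(span *model.Span, serviceName string) {
        if span.Process != nil {
            return
        }
        if serviceName == "" {
            serviceName = "unknown_service" // hypothetical fallback
        }
        span.Process = &model.Process{ServiceName: serviceName}
    }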

pavolloffay commented 4 years ago

@nicks #1222 hasn't been released yet, but you can use it with the latest tag from otel/opentelemetry-collector-dev:

docker run -it --rm --env JAEGER_REPORTER_LOG_SPANS=true --network=host -v "${PWD}/config.yaml":/config.yaml --name otelcol otel/opentelemetry-collector-dev --config config.yaml

bogdandrutu commented 4 years ago

Closing this per comment in https://github.com/open-telemetry/opentelemetry-collector/issues/1160#issuecomment-658088279

nicks commented 4 years ago

yay thanks!