open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Unable to export to Jaeger from OTEL collector #15291

Closed shrutich91 closed 1 year ago

shrutich91 commented 1 year ago

Describe your environment

I am using the following Docker setup.

Jaeger Docker:

docker run -d --name jaeger-all-in-one -p 16686:16686 -p 14250:14250 -p 14268:14268 jaegertracing/all-in-one

OTEL Collector Docker:

docker run --rm -it -p 4317:4317 -p 4318:4318 -v $(pwd)/examples/otlp:/cfg otel/opentelemetry-collector:0.59.0 --config=/cfg/opentelemetry-collector-config/config.dev.yaml

OTEL Collector Config

exporters:
  jaeger:
    endpoint: jaeger:14250
    tls:
      insecure: true
  logging:
    loglevel: DEBUG
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
service:
  pipelines:
    traces:
      receivers:
        - otlp
      exporters: [logging, jaeger]
    logs:
      receivers:
        - otlp
      exporters:
        - logging
    metrics:
      receivers:
        - otlp
      exporters:
        - logging

What is the expected behavior? I want to export traces from the collector to Jaeger, but I am getting an error.

What is the actual behavior?

kind": "exporter", "data_type": "traces", "name": "jaeger", "state": "CONNECTING"}
2022-10-19T05:55:59.597Z warn zapgrpc/zapgrpc.go:191 [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {
"Addr": "jaeger:14250",
"ServerName": "jaeger:14250",
"Attributes": null,
"BalancerAttributes": null,
"Type": 0,
"Metadata": null
}. Err: connection error: desc = "transport: Error while dialing dial tcp: lookup jaeger on 192.168.65.5:53: read udp 172.17.0.2:43000->192.168.65.5:53: i/o timeout" {"grpc_log": true}
2022-10-19T05:56:00.597Z info jaegerexporter@v0.59.0/exporter.go:186 State of the connection with the Jaeger Collector backend {"kind": "exporter", "data_type": "traces", "name": "jaeger", "state": "TRANSIENT_FAILURE"}
jpkrohling commented 1 year ago

I believe this is a basic Docker networking issue, but I'm adding the Jaeger label to this one.

cc @frzifus

frzifus commented 1 year ago

Hi @shrutich91, I agree with @jpkrohling. You should be able to fix it by adding --network=host to both commands and replacing the endpoint jaeger:14250 with localhost:14250.

shrutich91 commented 1 year ago

Even with --network=host in both docker commands, and setting the endpoint to localhost:14250, I get the same error.

2022-10-20T05:12:06.328Z        warn    zapgrpc/zapgrpc.go:191  [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {
  "Addr": "localhost:14250",
  "ServerName": "localhost:14250",
  "Attributes": null,
  "BalancerAttributes": null,
  "Type": 0,
  "Metadata": null
}. Err: connection error: desc = "transport: Error while dialing dial tcp [::1]:14250: connect: connection refused"    {"grpc_log": true}
2022-10-20T05:12:07.326Z        info    jaegerexporter@v0.59.0/exporter.go:186  State of the connection with the Jaeger Collector backend                                  {"kind": "exporter", "data_type": "traces", "name": "jaeger", "state": "TRANSIENT_FAILURE"}
jpkrohling commented 1 year ago

Could you please try to get it working outside of Docker first? If it works, then you know the problem is related to the network.

navin-rai commented 1 year ago

I am facing the same issue too. @shrutich91, did you find a solution?

frzifus commented 1 year ago

@navin-rai did you try to run the collector outside of Docker, as @jpkrohling proposed?

navin-rai commented 1 year ago

> @navin-rai did you try to run the collector outside of Docker, as @jpkrohling proposed?

I have deployed OTEL-COLLECTOR-Contrib on Kubernetes (locally, via docker-desktop) and spun up Jaeger all-in-one in Docker, and I am still facing the same issue.

frzifus commented 1 year ago

In Kubernetes or docker-desktop? Could you verify that both containers (the collector and the Jaeger instance) run in the same Docker network and that the Jaeger instance is reachable under the hostname jaeger? I guess you can use docker inspect for that.
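A minimal sketch of that check, assuming the containers are named jaeger and otel-collector (hypothetical names; substitute your own). The helper container_networks is illustrative only: it wraps docker inspect with a Go template to list the networks a container is attached to.

```shell
# Hypothetical helper: print the names of the Docker networks that the
# container given as $1 is attached to, separated by spaces.
container_networks() {
  docker inspect -f '{{range $k, $v := .NetworkSettings.Networks}}{{$k}} {{end}}' "$1"
}
```

Running container_networks jaeger and container_networks otel-collector should show at least one network name in common. If they share none, the collector has no route to the Jaeger container by name.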

I assume it's a network issue, since connecting from a local build works fine.

Start Jaeger

docker run --rm -it --network=host jaegertracing/all-in-one:1.36.0

Run local collector

./bin/otelcontribcol_linux_amd64 --config config.yaml

Config

---
receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  logging:
    logLevel: debug
  jaeger:
    endpoint: localhost:14250
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [logging, jaeger]
github-actions[bot] commented 1 year ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

anilknayak commented 1 year ago

Running the collector in Docker is the problem, I think. If you run the following commands, it will work.

Jaeger Docker

sudo docker run -d --name jaeger \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 14250:14250 \
  jaegertracing/all-in-one:1.17 --log-level=debug

OpenTelemetry Collector

Download the appropriate collector from the release page:

$ wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.69.0/otelcol_0.69.0_linux_amd64.deb
$ sudo dpkg -i otelcol_0.69.0_linux_amd64.deb

Run:

/usr/bin/otelcol --config otel-collector-config.yaml

Specify the OTel Collector endpoint in your code and it will work. The Docker container does not work for me; I have tried lots of network settings and it still does not work.

frzifus commented 1 year ago

@anilknayak I agree. It's the same thing as https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/15291#issuecomment-1303151050.

It is important that both containers (otel + jaeger) can reach each other. For this, it must be ensured that both are located in the same virtual network. As an alternative, you can use the host network.

Example

$ docker run -d --rm -it --network=host jaegertracing/all-in-one:1.36.0
$ docker run -d --rm -it --network=host -v $(pwd)/cfg.yaml:/config.yaml:z otel/opentelemetry-collector-contrib:0.59.0 --config=/config.yaml
$ tracegen -otlp-insecure -duration 2s 


Ronzan commented 1 year ago

Is using the host network really the only solution? I have the same problem, using docker-compose and all the containers are in the same network.

I'm running Docker on Windows. The strange thing is, when I ran the same stack a couple of weeks ago with Docker switched to "Linux containers", it was working just fine. Now I'm running the containers with Docker switched to "Windows containers" and the "platform=linux" setting on the Linux containers.

Everything except pushing traces to Jaeger works. Could it be some sort of Docker Windows networking issue?

Both the Jaeger and OTEL-collector images were pulled today using the latest tag.

Error in OTEL-collector pushing to Jaeger: image

Network of the stack: image

docker-compose.yml:

 prometheus:
    image: homemade/prometheus-win:2.41.0nano2019
    container_name: prometheus
    platform: windows
    ports:
      - "30090:9090"
    environment:
      - TZ=GMT+0
    volumes:
      - ./prom:C:/config

 grafana:
    image: grafana/grafana
    container_name: grafana
    platform: linux
    environment:
      - TZ=GMT+0    
    ports:
      - "30091:3000"
    volumes:
      - ./grafana-data/data:/var/lib/grafana
    depends_on:
      - prometheus

 jaeger-all-in-one:
    image: jaegertracing/all-in-one:latest
    container_name: jaeger-all-in-one
    platform: linux
    command: ["--prometheus.server-url=http://prometheus:9090"]
    environment:
     - METRICS_STORAGE_TYPE=prometheus
     - TZ=GMT+0
    ports:
      - "30092:16686"
      - "14268"
      - "14250"

 otel-collector:
    image: otel/opentelemetry-collector
    container_name: otel-collector
    platform: linux
    environment:
      - TZ=GMT+0    
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel:/etc
    ports:
      - "31888:1888"   # pprof extension
      - "8888:8888"   # Prometheus metrics exposed by the collector
      - "8889:8889"   # Prometheus exporter metrics
      - "13133:13133" # health_check extension
      - "4317:4317"   # OTLP gRPC receiver
      - "4318:4318"   # OTLP http receiver
      - "55679:55679" # zpages extension
    depends_on:
      - jaeger-all-in-one
      - prometheus
EricBuist commented 1 year ago

It seems that Jaeger all-in-one is not listening on port 14250 and supports only Thrift, not gRPC, so there is no chance the OpenTelemetry Collector can export to it. We need one of three things: a docker-compose configuration file with a container for each individual Jaeger service, Jaeger all-in-one enhanced to listen for gRPC and not just Thrift, or the OpenTelemetry Collector exporting to Thrift in addition to gRPC.

EricBuist commented 1 year ago

The all-in-one Docker image needs to be recent enough. I mistakenly used 1.6, misled by the tags on Docker Hub. You need to use something like 1.42, or latest. An alternative to using the host network is to create a virtual network and attach the containers to it.
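For illustration, a sketch of the virtual-network approach with plain docker commands. The network name otel-net, the container names, and the cfg.yaml path are assumptions, and the function start_on_shared_network is a hypothetical wrapper that only keeps the commands together. On a user-defined network Docker resolves container names through its embedded DNS, which the default bridge network does not do, so the collector config can keep endpoint: jaeger:14250.

```shell
# Sketch: run Jaeger and the collector on one user-defined bridge network.
# "otel-net", the container names, and cfg.yaml are illustrative choices.
start_on_shared_network() {
  # Create the shared network (this fails, and stops the chain, if it exists).
  docker network create otel-net &&
  # Jaeger is reachable from other containers on otel-net as "jaeger".
  docker run -d --rm --name jaeger --network otel-net \
    -p 16686:16686 \
    jaegertracing/all-in-one:1.42 &&
  # The collector config should point its jaeger exporter at jaeger:14250.
  docker run -d --rm --name otel-collector --network otel-net \
    -p 4317:4317 -p 4318:4318 \
    -v "$(pwd)/cfg.yaml:/config.yaml" \
    otel/opentelemetry-collector-contrib:0.59.0 --config=/config.yaml
}
```

Port 14250 does not need to be published with -p here, because the collector talks to Jaeger over the shared network rather than through the host.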

jayanth151002 commented 1 year ago

I have spent considerable time on this but still can't figure out what the issue is. Just to confirm that what I am doing is right: does the collector listen on HTTP port 4318 for telemetry data, and Jaeger on HTTP port 14250?

This is my config.yaml file:

receivers:
  otlp:
    protocols:
      http:

exporters:
  jaeger:
    endpoint: localhost:14250

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [jaeger]
shubhamjadhav1896 commented 10 months ago

I am trying to use Jaeger as an exporter in Kubernetes, but I am getting the error message below:

Error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

* error decoding 'exporters': unknown type: "jaeger" for id: "jaeger" (valid values: [opencensus prometheusremotewrite logging otlp otlphttp file kafka debug prometheus zipkin])
2023/10/21 04:55:21 collector server run finished with error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

* error decoding 'exporters': unknown type: "jaeger" for id: "jaeger" (valid values: [opencensus prometheusremotewrite logging otlp otlphttp file kafka debug prometheus zipkin])

Is there something I am missing here? Or has the Jaeger exporter been deprecated in the latest version of OpenTelemetry?

exporters:
      logging:
      jaeger:
        endpoint: "jaeger-collector.observability.svc.cluster.local:14250"
        tls:
          insecure: true
jesse-c commented 10 months ago

@shubhamjadhav1896: Are you using opentelemetry-collector-contrib? I'm facing the same problem on v0.87.0. I had a look, and yup, support for the Jaeger Exporter was dropped in v0.86.0 [1].

[1] https://github.com/open-telemetry/opentelemetry-collector-contrib/releases/tag/v0.86.0

shubhamjadhav1896 commented 10 months ago

> @shubhamjadhav1896: Are you using opentelemetry-collector-contrib? I'm facing the same problem on v0.87.0. I had a look, and yup, support for the Jaeger Exporter was dropped in v0.86.0 [1].
>
> [1] https://github.com/open-telemetry/opentelemetry-collector-contrib/releases/tag/v0.86.0

Yes, I am using the same, and the issue is now sorted for me. In place of the jaeger exporter, you should try the OTLP protocol: the Jaeger community has named it as the standard protocol, and they removed the client libraries for the Jaeger protocol as of May 2022.
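A sketch of what that change might look like, assuming the Jaeger collector accepts OTLP over gRPC on port 4317. The host name is taken from the config above; the port, the otlp/jaeger label, and the output file name are assumptions. The heredoc only writes the config to a file.

```shell
# Write a collector config that exports traces to Jaeger over OTLP/gRPC.
# File name, exporter label, and port 4317 are illustrative assumptions.
cat > otel-collector-otlp-sketch.yaml <<'EOF'
receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  logging:
  # "otlp" is the exporter type; "/jaeger" is only a label for this instance.
  otlp/jaeger:
    endpoint: jaeger-collector.observability.svc.cluster.local:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [logging, otlp/jaeger]
EOF
```

Compared with the old jaeger exporter block, both the exporter type and the port change; keeping port 14250 with the otlp exporter would point it at Jaeger's native gRPC endpoint rather than an OTLP one.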

jesse-c commented 10 months ago

@shubhamjadhav1896: Thank you!

shubhamjadhav1896 commented 10 months ago

> @shubhamjadhav1896: Thank you!

Welcome 😊

exrhizo commented 9 months ago

This helped me; I also hadn't changed the port.

https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/26675/files

Guilospanck commented 8 months ago

This helped me: https://opentelemetry.io/blog/2023/jaeger-exporter-collector-migration/