open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Failed to export spans to jaeger api/traces #20700

Closed jcarranzan closed 1 year ago

jcarranzan commented 1 year ago

Component(s)

exporter/jaeger

Describe the issue you're reporting

Hi, I am not able to see the spans created in my test application, either in the Jaeger UI or at the Jaeger endpoint: "http://jaeger-query-opentelemetry-project.apps.my.openshift.cluster/api/traces?service=" + JAEGER_SERVICE_NAME. I am not sure whether something is misconfigured in my opentelemetry.yml or whether the problem is in the application code that exports the spans.

This is my opentelemetry.yml:

apiVersion: v1
kind: List
items:
  - apiVersion: apps.openshift.io/v1
    kind: DeploymentConfig
    metadata:
      name: 'jaeger'
    spec:
      selector:
        app: 'jaeger'
      replicas: 1
      template:
        metadata:
          labels:
            app: 'jaeger'
        spec:
          containers:
            - image: 'quay.io/jaegertracing/all-in-one:latest'
              name: 'jaeger'
              env:
                - name: "COLLECTOR_OTLP_ENABLED"
                  value: true
                - name: "JAEGER_SERVICE_NAME"
                  value: "test-traced-service"
              ports:
                - containerPort: 5775
                  protocol: UDP
                - containerPort: 6831
                  protocol: UDP
                - containerPort: 6832
                  protocol: UDP
                - containerPort: 5778
                  protocol: TCP
                - containerPort: 16686
                  protocol: TCP
                - containerPort: 16685
                  protocol: TCP
                - containerPort: 14268
                  protocol: TCP
                - containerPort: 9411
                  protocol: TCP
                - containerPort: 4317
                  protocol: TCP
                - containerPort: 4318
                  protocol: TCP
                - containerPort: 14250
                  protocol: TCP
                - containerPort: 14269
                  protocol: TCP
      triggers:
        - type: ConfigChange
      servicename: 'jaeger'
  - apiVersion: v1
    kind: Service
    metadata:
      name: 'jaeger-query'
      labels:
        app: 'jaeger'
    spec:
      ports:
        - name: query-http
          port: 16686
          protocol: TCP
          targetPort: 16686
      selector:
        app: 'jaeger'
  - apiVersion: v1
    kind: Service
    metadata:
      name: 'jaeger-collector'
      labels:
        app: 'jaeger'
    spec:
      ports:
        - name: 'jaeger-collector'
          port: 4317
          protocol: TCP
          targetPort: 4317
      selector:
        app: 'jaeger'

  - apiVersion: route.openshift.io/v1
    kind: Route
    metadata:
      labels:
        app: 'jaeger'
      name: jaeger-query
    spec:
      port:
        targetPort: 16686
      to:
        kind: Service
        name: jaeger-query
      wildcardPolicy: None
  - apiVersion: opentelemetry.io/v1alpha1
    kind: OpenTelemetryCollector
    metadata:
      name: simplest
    spec:
      config: |
        receivers:
          otlp:
            protocols:
              grpc:
              http:
        processors:

        exporters:
          logging:
          jaeger:
            endpoint: jaeger-all-in-one:14250
            tls:
              insecure: true
        service:
          pipelines:
            traces:
              receivers: [otlp]
              processors: []
              exporters: [logging,jaeger]

Some error logs from the Jaeger pod:

[2023-04-05 10:31:34,255] INFO - i.v.i.o.u.watcher.OpenShiftObserver - *** Log from pod jaeger-1-dnb8x: {"level":"info","ts":1680683466.349247,"caller":"grpc@v1.54.0/clientconn.go:1119","msg":"[core][Channel #10 SubChannel #11] Subchannel Connectivity change to TRANSIENT_FAILURE, last error: connection error: desc = \"transport: Error while dialing: dial tcp :16685: connect: connection refused\"","system":"grpc","grpc_log":true}
[2023-04-05 10:31:36,504] INFO - i.v.i.o.u.watcher.OpenShiftObserver - *** Log from pod jaeger-1-dnb8x: {"level":"info","ts":1680683467.3497033,"caller":"grpc@v1.54.0/clientconn.go:1119","msg":"[core][Channel #10 SubChannel #11] Subchannel Connectivity change to IDLE, last error: connection error: desc = \"transport: Error while dialing: dial tcp :16685: connect: connection refused\"","system":"grpc","grpc_log":true}
[2023-04-05 10:31:36,805] INFO - i.v.i.o.u.watcher.OpenShiftObserver - *** Log from pod jaeger-1-dnb8x: {"level":"info","ts":1680683467.3497689,"caller":"grpc@v1.54.0/clientconn.go:428","msg":"[core][Channel #10] Channel Connectivity change to IDLE","system":"grpc","grpc_log":true}

This is the part of my code where I try to create the spans and export them:

jaegerEndpoint = "http://jaeger-query-" + openshift.getRoute(APP_NAME).getSpec().getHost() + ":4317";

// Export traces to Jaeger over OTLP
OtlpGrpcSpanExporter jaegerOtlpExporter =
  OtlpGrpcSpanExporter.builder()
    .setEndpoint(jaegerEndpoint)
    .setTimeout(30, TimeUnit.SECONDS)
    .build();

Resource serviceNameResource =
  Resource.create(Attributes.of(ResourceAttributes.SERVICE_NAME, JAEGER_SERVICE_NAME));

// Batch the spans and hand them to the OTLP exporter
SdkTracerProvider tracerProvider =
  SdkTracerProvider.builder()
    .addSpanProcessor(BatchSpanProcessor.builder(jaegerOtlpExporter).build())
    .setResource(Resource.getDefault().merge(serviceNameResource))
    .build();

OpenTelemetrySdk openTelemetrySdk =
  OpenTelemetrySdk.builder().setTracerProvider(tracerProvider).build();

tracer = openTelemetrySdk.getTracer("io.vertx");

@BeforeEach
public void setSpan() throws InterruptedException {
  if (tracer != null) {
    Span span = tracer.spanBuilder("another span use case").startSpan();
    span.setAttribute("attribute3", "I am the attribute3");
    span.addEvent("Event 3");
    Thread.sleep(800);
    span.addEvent("Event 4");
    span.end();
  } else {
    System.out.println("tracer is null");
  }
}

And one of my tests:

@Test
public void testJaegerTracesService() {
  await().atMost(Duration.ONE_MINUTE).untilAsserted(() -> {
    String tracesAPIUrl = "http://jaeger-query-opentelemetry-project.apps.my.openshift.cluster/api/traces?service=" + JAEGER_SERVICE_NAME;

    Response apiTraceServiceResponse = given()
      .get(tracesAPIUrl);
    apiTraceServiceResponse.then().statusCode(200)
      .body("data", hasSize(greaterThan(2)))
      .body(containsString("I am the attribute3"))
      .extract().response();
    System.out.println("apiTraceServiceResponse----> " + apiTraceServiceResponse.body().prettyPrint());
...

I checked in the logs that some traces are created at that Jaeger endpoint, but not my custom spans, and I got the following error: SEVERE: Failed to export spans. The request could not be executed. Full error message: connect timed out
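
Since the exporter endpoint above is assembled by string concatenation, it may help to print the scheme, host, and port that the resulting string actually contains. The following is a minimal sketch using only java.net.URI from the JDK. The class name EndpointParts is hypothetical, and the default address http://jaeger-collector:4317 is an assumption based on the Service defined in the YAML above, not a confirmed fix.

```java
import java.net.URI;

// Sketch: split an OTLP endpoint string into the parts a client would dial.
// EndpointParts is a hypothetical helper, not part of any OpenTelemetry API.
final class EndpointParts {
    final String scheme;
    final String host;
    final int port; // -1 when the endpoint carries no explicit port

    private EndpointParts(String scheme, String host, int port) {
        this.scheme = scheme;
        this.host = host;
        this.port = port;
    }

    static EndpointParts parse(String endpoint) {
        // URI.create throws IllegalArgumentException on malformed input.
        URI uri = URI.create(endpoint);
        if (uri.getScheme() == null || uri.getHost() == null) {
            throw new IllegalArgumentException(
                "endpoint needs a scheme and a host: " + endpoint);
        }
        return new EndpointParts(uri.getScheme(), uri.getHost(), uri.getPort());
    }

    @Override
    public String toString() {
        String portText = (port == -1) ? " (no explicit port)" : ":" + port;
        return scheme + "://" + host + portText;
    }

    public static void main(String[] args) {
        // Assumed default: the in-cluster Service name and port from the YAML above.
        String endpoint = args.length > 0 ? args[0] : "http://jaeger-collector:4317";
        System.out.println(parse(endpoint));
    }
}
```

In this sketch a string without a scheme, such as jaeger-collector:4317, is rejected, and an endpoint without a port is reported with port -1, which makes it easier to compare the address the application dials with the ports the Service and Route expose.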

jpkrohling commented 1 year ago

I don't think this is about OpenTelemetry at all, or even the Collector. From what I could gather, the backend is purely Jaeger, so, it would be a question for them.

cc @frzifus, @pavolloffay, as apparently OpenShift is also involved, so, you might want to take a look

frzifus commented 1 year ago

@jcarranzan you may want to split this issue into two parts.

  1. Try to export your telemetry data to a local OpenTelemetry Collector. You can check whether your data arrives there by using the logging exporter.
  2. Verify that your custom Jaeger deployment works as expected. You can use telemetrygen and oc or kubectl port-forward to send traces to your Jaeger instance.
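
Before sending traces through a port-forward, it can be useful to confirm that the forwarded port accepts TCP connections at all. The following is a minimal sketch using only the JDK. The class name PortProbe, the default localhost:4317, and the 2 second timeout are assumptions; it presumes something like kubectl port-forward svc/jaeger-collector 4317:4317 is already running in another terminal. It only shows that the port is open and does not send any telemetry.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Sketch: check whether a TCP port accepts connections, for example a
// locally port-forwarded OTLP port. PortProbe is a hypothetical helper.
final class PortProbe {

    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers "connection refused", connect timeouts, and unresolved hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: a port-forward listening on localhost:4317.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 4317;
        boolean reachable = canConnect(host, port, 2000);
        System.out.println(host + ":" + port + " reachable: " + reachable);
    }
}
```

If the probe fails against the forwarded port, the problem is likely in the deployment or the port-forward rather than in the application's exporter code.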

jcarranzan commented 1 year ago

Thanks @frzifus, I tried to use telemetrygen, but I'm not sure how to send the traces to my Jaeger instance on OpenShift. Meanwhile, to make my config simpler and more workable, I have now set up two minimal YAML files. The deployment seems to work, but something is still missing.

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: my-jaeger

2023-04-10T10:23:41.499Z    info    service/telemetry.go:110    Setting up own telemetry...
2023-04-10T10:23:41.499Z    info    service/telemetry.go:140    Serving Prometheus metrics  {"address": ":8888", "level": "basic"}
2023-04-10T10:23:41.500Z    info    service/service.go:89   Starting otelcol... {"Version": "0.63.1", "NumCPU": 4}
2023-04-10T10:23:41.500Z    info    extensions/extensions.go:42 Starting extensions...
2023-04-10T10:23:41.500Z    info    pipelines/pipelines.go:74   Starting exporters...
2023-04-10T10:23:41.500Z    info    pipelines/pipelines.go:78   Exporter is starting... {"kind": "exporter", "data_type": "traces", "name": "jaeger"}
2023-04-10T10:23:41.500Z    info    jaegerexporter@v0.63.0/exporter.go:185  State of the connection with the Jaeger Collector backend   {"kind": "exporter", "data_type": "traces", "name": "jaeger", "state": "IDLE"}
2023-04-10T10:23:41.500Z    info    pipelines/pipelines.go:82   Exporter started.   {"kind": "exporter", "data_type": "traces", "name": "jaeger"}
2023-04-10T10:23:41.500Z    info    pipelines/pipelines.go:86   Starting processors...
2023-04-10T10:23:41.500Z    info    pipelines/pipelines.go:98   Starting receivers...
2023-04-10T10:23:41.500Z    info    pipelines/pipelines.go:102  Receiver is starting... {"kind": "receiver", "name": "otlp", "pipeline": "traces"}
2023-04-10T10:23:41.500Z    info    otlpreceiver/otlp.go:71 Starting GRPC server    {"kind": "receiver", "name": "otlp", "pipeline": "traces", "endpoint": "0.0.0.0:4317"}
2023-04-10T10:23:41.500Z    info    otlpreceiver/otlp.go:89 Starting HTTP server    {"kind": "receiver", "name": "otlp", "pipeline": "traces", "endpoint": "0.0.0.0:4318"}
2023-04-10T10:23:41.500Z    info    pipelines/pipelines.go:106  Receiver started.   {"kind": "receiver", "name": "otlp", "pipeline": "traces"}
2023-04-10T10:23:41.500Z    info    service/service.go:106  Everything is ready. Begin running and processing data.
2023-04-10T10:23:42.500Z    info    jaegerexporter@v0.63.0/exporter.go:185  State of the connection with the Jaeger Collector backend   {"kind": "exporter", "data_type": "traces", "name": "jaeger", "state": "READY"}

And the error that I got: WARNING: Failed to export spans. Server responded with HTTP status code 502. Error message:

I saw that @pavolloffay answered a similar error in the past, but I'm not sure if it's related to... Thanks!