open-telemetry / opentelemetry-js

OpenTelemetry JavaScript Client
https://opentelemetry.io
Apache License 2.0

Using @opentelemetry/exporter-otlp-http with last collector version (0.37.1) #2567

Closed arquadrado closed 2 years ago

arquadrado commented 2 years ago

Hi, this is mostly a support request since I've been navigating through the documentation for the past week and I feel totally lost.

Here's the context: I have a Node.js microservice built with Fastify, and I am trying to add observability to it using OpenTelemetry.

I have a Kubernetes cluster where I set up the microservice, a collector service, and Zipkin and Prometheus to display traces and metrics respectively. My service setup is pretty simple, mostly registering plugins. I created two plugins, one for metrics and the other for tracing, which I then register in Fastify. Here's a sample of the tracing plugin:

module.exports = fp(async function (fastify, opts) {
    fastify.log.debug('Starting tracing');

    const provider = new NodeTracerProvider({
        resource: new Resource({
            [SemanticResourceAttributes.SERVICE_NAME]: 'my-service'
        })
    });
    provider.addSpanProcessor(
        new BatchSpanProcessor(
            new OTLPTraceExporter({
                url: 'http://otel-collector-srv:55681/v1/traces'
            })
        )
    );
    provider.register();

    return fastify.register(openTelemetryPlugin, { wrapRoutes: true });
});

For the url I provide the cluster address of the collector. Now comes the weird part: if I use the collector image otel/opentelemetry-collector:0.25.0, this works and I am able to see my traces in Zipkin and my metrics in Prometheus, but if I upgrade the image to 0.37.1 (the most recent version at the time of writing), it stops working. The collector is supposedly up and running, but it doesn't do anything with the data it is sent. No metrics or traces are displayed in Prometheus or Zipkin, and the collector itself logs nothing about the metrics. I don't know whether the API has changed, whether the exporter-otlp library is outdated and doesn't send the data in the expected format, or whether something else is missing, and I couldn't find any help in the docs, which seem a bit inconsistent at times.

Could someone please point me in the right direction or explain to me what's happening?
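Since the traces and metrics exporters both need a URL on the same collector, a small helper can keep the two in sync. This is only a sketch: `otlpUrl` is a hypothetical function, not part of any OpenTelemetry package, and it assumes the collector serves the `/v1/traces` and `/v1/metrics` paths mentioned in this issue.

```javascript
// Sketch: derive per-signal OTLP/HTTP URLs from one collector base address.
// `otlpUrl` is a hypothetical helper, not part of any OpenTelemetry package.
function otlpUrl(base, signal) {
  if (!['traces', 'metrics'].includes(signal)) {
    throw new Error(`unsupported signal: ${signal}`);
  }
  // The URL constructor resolves the absolute path against the base's origin.
  return new URL(`/v1/${signal}`, base).toString();
}

console.log(otlpUrl('http://otel-collector-srv:55681', 'traces'));
console.log(otlpUrl('http://otel-collector-srv:55681', 'metrics'));
```

The result could then be passed as the `url` option of each exporter, so that a change of host or port only has to be made in one place.
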

I attach my collector config in case that's of any help:

receivers:
  otlp:
    protocols:
      http:
        cors_allowed_origins:
          - http://*
          - https://*

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 2000
  batch:

exporters:
  logging:
    loglevel: debug
  # otlp/elastic:
  #   endpoint: 'apm-srv:8200'
  zipkin:
    endpoint: "http://zipkin-srv:9411/api/v2/spans"
  prometheus:
    endpoint: "0.0.0.0:9464"

extensions:
  health_check:

service:
  extensions: [health_check]
  pipelines:
    metrics:
      receivers:
        - otlp
      processors:
        - batch
      exporters:
        - logging
        - prometheus
        # - otlp/elastic
    traces:
      receivers:
        - otlp
      processors:
        - batch
      exporters:
        - logging
        - zipkin
        # - otlp/elastic

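A quick way to sanity-check a config like this one is to cross-reference the components each pipeline uses against the ones that are defined. The sketch below assumes the YAML has already been parsed into a plain object (the object literal mirrors the config above by hand, with settings omitted). `checkPipelines` is a hypothetical helper, not a collector or OpenTelemetry API.

```javascript
// Sketch: cross-check a collector config that was already parsed into an object.
// `checkPipelines` is a hypothetical helper, not a collector or OpenTelemetry API.
function checkPipelines(config) {
  const kinds = ['receivers', 'processors', 'exporters'];
  const missing = []; // referenced by a pipeline but never defined
  const used = { receivers: new Set(), processors: new Set(), exporters: new Set() };

  for (const [name, pipeline] of Object.entries(config.service.pipelines)) {
    for (const kind of kinds) {
      for (const id of pipeline[kind] || []) {
        used[kind].add(id);
        if (!(id in (config[kind] || {}))) missing.push(`${name}: ${kind}/${id}`);
      }
    }
  }

  const unused = []; // defined but not referenced by any pipeline
  for (const kind of kinds) {
    for (const id of Object.keys(config[kind] || {})) {
      if (!used[kind].has(id)) unused.push(`${kind}/${id}`);
    }
  }
  return { missing, unused };
}

// Hand-written mirror of the config above; component settings are omitted.
const config = {
  receivers: { otlp: {} },
  processors: { memory_limiter: {}, batch: {} },
  exporters: { logging: {}, zipkin: {}, prometheus: {} },
  service: {
    pipelines: {
      metrics: { receivers: ['otlp'], processors: ['batch'], exporters: ['logging', 'prometheus'] },
      traces: { receivers: ['otlp'], processors: ['batch'], exporters: ['logging', 'zipkin'] },
    },
  },
};

const result = checkPipelines(config);
console.log(result);
```

For this config the check reports nothing missing and lists `processors/memory_limiter` as unused: the processor is defined, but neither pipeline references it.
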
arquadrado commented 2 years ago

To add a bit more information to this post: I added the zpages extension to my collector configuration, and after checking it I noticed that only the metrics are actually not working. In the zpages I see that there are requests to the /v1/metrics endpoint, but all of these requests are met with a 400, which leads me to think that the exporter library isn't sending the metrics data in the correct format. I have tried with @opentelemetry/exporter-collector:0.25.0 and @opentelemetry/exporter-otlp-http:0.26.0, which are the most recent versions I could find.

Does this make sense? Is there another library that I can use to send metric data to the collector?

Here's what I can see in the zpage:

image
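To see more than the bare status code, one option is to replay a request against /v1/metrics by hand and print the response body, on the assumption (not verified here) that the collector includes an error description with its 400. The sketch below uses only Node's built-in `http` module. `probe` is a hypothetical helper, and the payload is whatever JSON you capture from the exporter; none is invented here.

```javascript
const http = require('http');

// Sketch: POST a JSON payload and resolve with the status code and response body.
// `probe` is a hypothetical helper, not part of any OpenTelemetry package.
function probe(url, payload) {
  const body = JSON.stringify(payload);
  return new Promise((resolve, reject) => {
    const req = http.request(
      url,
      {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'Content-Length': Buffer.byteLength(body),
        },
      },
      (res) => {
        const chunks = [];
        res.on('data', (chunk) => chunks.push(chunk));
        res.on('end', () =>
          resolve({ status: res.statusCode, body: Buffer.concat(chunks).toString('utf8') })
        );
      }
    );
    req.on('error', reject);
    req.end(body);
  });
}

// Usage (run from a pod that can reach the collector; `capturedPayload` is
// a placeholder for a request body captured from the exporter):
// probe('http://otel-collector-srv:55681/v1/metrics', capturedPayload)
//   .then(console.log, console.error);
```

Comparing the response for a payload produced by exporter version 0.25.0 or 0.26.0 against both collector images would show whether the collector rejects the payload format itself.
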

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

github-actions[bot] commented 2 years ago

This issue was closed because it has been stale for 14 days with no activity.