open-telemetry / opentelemetry-js

OpenTelemetry JavaScript Client
https://opentelemetry.io
Apache License 2.0

Zipkin Exporter - Cannot export spans because timestamp in annotations is not an integer #4165

Closed FelipeEmerim closed 1 year ago

FelipeEmerim commented 1 year ago

What happened?

Exporting error spans to Zipkin fails with a 400 error because Zipkin cannot parse the annotation timestamp, which is sent as a non-integer value.

Steps to Reproduce

We can reliably reproduce this using the otel-collector Zipkin receiver and forcing an HTTP span to throw an error.

Expected Result

We expected the span to be exported correctly.

Actual Result

The spans are not exported because of a 400 error.

Additional Details

This error was partially fixed in 1.16, but we believe the rounding should also be applied in the _toZipkinAnnotations function of the Zipkin exporter. We can open a Pull Request containing the fix but before doing so we would like to confirm if this is actually a bug or if we are missing something.
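
To make the proposal concrete, here is a minimal, self-contained sketch of the rounding we have in mind. It is not the exporter's actual source: `HrTime`, `TimedEventLike`, `ZipkinAnnotation`, `hrTimeToMicroseconds` and `toZipkinAnnotations` are local stand-ins written for illustration, and we are assuming the conversion to microseconds is where the fractional part comes from.

```typescript
// Local stand-in for the [seconds, nanoseconds] tuple used for span event times.
type HrTime = [number, number];

// Hypothetical minimal shapes; the real exporter types carry more fields.
interface TimedEventLike {
  time: HrTime;
  name: string;
}

interface ZipkinAnnotation {
  timestamp: number;
  value: string;
}

// Local stand-in for a seconds/nanoseconds to microseconds conversion.
// Any nanosecond value that is not a multiple of 1000 yields a fraction.
function hrTimeToMicroseconds(time: HrTime): number {
  return time[0] * 1e6 + time[1] / 1e3;
}

// Sketch of the proposed change: round so the annotation timestamp is an integer.
function toZipkinAnnotations(events: TimedEventLike[]): ZipkinAnnotation[] {
  return events.map((event) => ({
    timestamp: Math.round(hrTimeToMicroseconds(event.time)),
    value: event.name,
  }));
}

// An event time equivalent to the fractional value seen in our payload.
const annotations = toZipkinAnnotations([
  { time: [1695822117, 623492500], name: 'exception' },
]);

console.log(annotations);
```

Without the `Math.round` call the same input produces 1695822117623492.5, which is the value rejected in the log output further down.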

OpenTelemetry Setup Code

// opentelemetry.ts
import { B3InjectEncoding } from '@opentelemetry/propagator-b3';
import { OpentelemetryBuilder } from '@randondigital/opentelemetry';
import { IncomingMessage } from 'http';
import { diag, DiagConsoleLogger, DiagLogLevel } from '@opentelemetry/api';
import env from './modules/main/app.env';

// Set the default global logger to use the console
// and optionally change the logging level (defaults to INFO).
diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.DEBUG);
const builder = new OpentelemetryBuilder();

builder
  .b3Propagator()
  .b3Propagator({ injectEncoding: B3InjectEncoding.MULTI_HEADER })
  .w3CPropagator()
  .setServiceName(env.OPENTELEMETRY_SERVICE)
  .setTracerName('project')
  .httpInstrumentation({
    ignoreIncomingRequestHook: (request: IncomingMessage) => {
      /**
       * The property is nullable. We are not excluding null
       * values from telemetry because we don't know in which
       * cases this value is null. Even when no path is supplied
       * this property has the value '/'.
       */
      if (!request.url) {
        return false;
      }

      return request.url.startsWith('/healthz/');
    },
  })
  .knexInstrumentation()
  .winstonInstrumentation();

if (env.OPENTELEMETRY_EXPORT) {
  builder.zipkinExporter({
    url: env.OPENTELEMETRY_EXPORT_URL,
  });
}

export const tracer = builder.build();

// builder.ts
/* eslint-disable global-require */
import opentelemetry, { TextMapPropagator, Tracer } from '@opentelemetry/api';
import {
  CompositePropagator,
  W3CBaggagePropagator,
  W3CTraceContextPropagator,
} from '@opentelemetry/core';
import {
  InstrumentationOption,
  registerInstrumentations,
} from '@opentelemetry/instrumentation';
import { Resource } from '@opentelemetry/resources';
import {
  BatchSpanProcessor,
  SpanExporter,
} from '@opentelemetry/sdk-trace-base';
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';

export class OpentelemetryBuilder {
  private instrumentations: InstrumentationOption[] = [];

  private serviceName: string;

  private tracerName: string;

  private propagators: TextMapPropagator[] = [];

  private exporters: SpanExporter[] = [];

  constructor() {
    this.reset();
  }

  reset(): void {
    this.instrumentations = [];
    this.propagators = [];
    this.exporters = [];
    this.serviceName = '';
    this.tracerName = '';
  }

  /**
   * Set opentelemetry service name
   * @param serviceName The service name
   * @returns
   */
  setServiceName(serviceName: string): this {
    this.serviceName = serviceName;
    return this;
  }

  /**
   * Set the tracer name
   *
   * The tracer is used to create new spans. It should be
   * the project name in package.json
   *
   * @param tracerName The tracer name
   * @returns
   */
  setTracerName(tracerName: string): this {
    this.tracerName = tracerName;
    return this;
  }

  /**
   * Enable httpInstrumentation for the application.
   *
   * @param config Configuration accepted by \@opentelemetry/instrumentation-http
   * @link https://www.npmjs.com/package/@opentelemetry/instrumentation-http?activeTab=readme
   *
   * @returns
   */
  httpInstrumentation(config?: object): this {
    const {
      HttpInstrumentation,
    } = require('@opentelemetry/instrumentation-http');
    this.instrumentations.push(new HttpInstrumentation(config));
    return this;
  }

  /**
   * Enable knex instrumentation for the application.
   * @param config Configuration accepted by \@opentelemetry/instrumentation-knex
   * @link https://www.npmjs.com/package/@opentelemetry/instrumentation-knex?activeTab=readme
   * @returns
   */
  knexInstrumentation(config?: object): this {
    const {
      KnexInstrumentation,
    } = require('@opentelemetry/instrumentation-knex');
    this.instrumentations.push(new KnexInstrumentation(config));
    return this;
  }

  /**
   * Enable azure instrumentation for the application.
   * @param config Configuration accepted by AzureSdkInstrumentation
   * @link https://www.npmjs.com/package/@azure/opentelemetry-instrumentation-azure-sdk
   * @returns
   */
  azureInstrumentation(config?: object): this {
    const {
      createAzureSdkInstrumentation,
    } = require('@azure/opentelemetry-instrumentation-azure-sdk');

    this.instrumentations.push(createAzureSdkInstrumentation(config));
    return this;
  }

  /**
   * Enable Sequelize instrumentation for the application.
   * @param config Configuration accepted by opentelemetry-instrumentation-sequelize
   * @link https://www.npmjs.com/package/opentelemetry-instrumentation-sequelize?activeTab=readme
   * @returns
   */
  sequelizeInstrumentation(config?: object): this {
    const {
      SequelizeInstrumentation,
    } = require('opentelemetry-instrumentation-sequelize');
    this.instrumentations.push(new SequelizeInstrumentation(config));
    return this;
  }

  /**
   * Enable kafkaJS instrumentation for the application.
   *
   * @param config Configuration accepted by opentelemetry-instrumentation-kafkajs
   * @link https://www.npmjs.com/package/opentelemetry-instrumentation-kafkajs?activeTab=readme
   * @returns
   */
  kafkaJSInstrumentation(config?: object): this {
    const {
      KafkaJsInstrumentation,
    } = require('opentelemetry-instrumentation-kafkajs');
    this.instrumentations.push(new KafkaJsInstrumentation(config));
    return this;
  }

  /**
   * Enable winston instrumentation for the application.
   *
   * @param config Configuration accepted by \@opentelemetry/instrumentation-winston
   * @link https://www.npmjs.com/package/@opentelemetry/instrumentation-winston?activeTab=readme
   * @returns
   */
  winstonInstrumentation(config?: object): this {
    const {
      WinstonInstrumentation,
    } = require('@opentelemetry/instrumentation-winston');
    this.instrumentations.push(new WinstonInstrumentation(config));
    return this;
  }

  /**
   * Enable socket.io instrumentation for the application.
   *
   * @param config Configuration accepted by \@opentelemetry/instrumentation-socket.io
   * @link https://www.npmjs.com/package/@opentelemetry/instrumentation-socket.io?activeTab=readme
   * @returns
   */
  socketIoInstrumentation(config?: object): this {
    const {
      SocketIoInstrumentation,
    } = require('@opentelemetry/instrumentation-socket.io');
    this.instrumentations.push(new SocketIoInstrumentation(config));
    return this;
  }

  /**
   * Enable TypeORM instrumentation for the application.
   *
   * @param config Configuration accepted by opentelemetry-instrumentation-typeorm
   * @link https://www.npmjs.com/package/opentelemetry-instrumentation-typeorm
   * @returns
   */
  typeOrmInstrumentation(config?: object): this {
    const {
      TypeormInstrumentation,
    } = require('opentelemetry-instrumentation-typeorm');
    this.instrumentations.push(new TypeormInstrumentation(config));
    return this;
  }

  /**
   * Enable the B3 propagator for the application.
   *
   * @param config Configuration accepted by \@opentelemetry/propagator-b3
   * @link https://www.npmjs.com/package/@opentelemetry/propagator-b3?activeTab=readme
   * @returns
   */
  b3Propagator(config?: object): this {
    const { B3Propagator } = require('@opentelemetry/propagator-b3');
    this.propagators.push(new B3Propagator(config));
    return this;
  }

  /**
   * Enable the W3C propagator for the application.
   *
   * @returns
   */
  w3CPropagator(): this {
    this.propagators.push(new W3CTraceContextPropagator());
    this.propagators.push(new W3CBaggagePropagator());
    return this;
  }

  /**
   * Enable the Zipkin Exporter for the application.
   *
   * @param config Configuration accepted by \@opentelemetry/exporter-zipkin
   * @link https://www.npmjs.com/package/@opentelemetry/exporter-zipkin?activeTab=readme
   * @returns
   */
  zipkinExporter(config?: object): this {
    const { ZipkinExporter } = require('@opentelemetry/exporter-zipkin');

    this.exporters.push(new ZipkinExporter(config));
    return this;
  }

  /**
   * Build and return tracer object.
   *
   * @returns Tracer object to create spans. You should export it.
   */
  build(): Tracer {
    opentelemetry.propagation.setGlobalPropagator(
      new CompositePropagator({
        propagators: this.propagators,
      }),
    );

    const provider = new NodeTracerProvider({
      resource: new Resource({
        [SemanticResourceAttributes.SERVICE_NAME]: this.serviceName,
      }),
    });

    provider.register();

    registerInstrumentations({
      instrumentations: this.instrumentations,
      tracerProvider: provider,
    });

    this.exporters.forEach((exporter) => {
      provider.addSpanProcessor(new BatchSpanProcessor(exporter));
    });

    return opentelemetry.trace.getTracer(this.tracerName);
  }
}

package.json

// We used npm ls in the project to make sure we are using 1.16 in the root and all dependencies.
{
  "name": "project",
  "version": "1.32.5",
  "description": "project",
  "author": "Author",
  "private": true,
  "license": "UNLICENSED",
  "engines": {
    "node": ">=18",
    "npm": ">=8"
  },
  "dependencies": {
    "@nestjs/common": "^10.2.4",
    "@nestjs/core": "^10.2.4",
    "@nestjs/platform-express": "^10.2.4",
    "@opentelemetry/api": "^1.4.1",
    "@opentelemetry/core": "^1.16.0",
    "@opentelemetry/exporter-zipkin": "^1.16.0",
    "@opentelemetry/instrumentation": "^0.42.0",
    "@opentelemetry/instrumentation-http": "^0.42.0",
    "@opentelemetry/instrumentation-knex": "^0.32.1",
    "@opentelemetry/instrumentation-winston": "^0.32.1",
    "@opentelemetry/propagator-b3": "^1.16.0",
    "@opentelemetry/resources": "^1.16.0",
    "@opentelemetry/sdk-trace-base": "^1.16.0",
    "@opentelemetry/sdk-trace-node": "^1.16.0",
    "@opentelemetry/semantic-conventions": "^1.16.0",
    "@randondigital/commons": "^3.3.2",
    "@randondigital/environment": "^2.2.3",
    "@randondigital/nestjs": "10.0.1",
    "@randondigital/opentelemetry": "^4.3.0",
    "@randondigital/validation": "6.0.1",
    "class-transformer": "^0.5.1",
    "class-validator": "^0.14.0",
    "dotenv": "^16.3.1",
    "knex": "^2.5.1",
    "nest-winston": "^1.9.4",
    "pg": "^8.11.3",
    "rimraf": "^5.0.1",
    "rxjs": "^7.8.1",
    "winston": "^3.10.0"
  },
  "overrides": {
    "glob-parent": "6.0.2",
    "@nestjs/platform-express": {
      "multer": {
        ".": "1.4.5-lts.1",
        "busboy": "^1.6.0"
      }
    }
  },
  "devDependencies": {
    "@commitlint/cli": "^17.7.1",
    "@commitlint/config-conventional": "^17.7.0",
    "@nestjs/cli": "^10.1.17",
    "@nestjs/schematics": "^10.0.2",
    "@nestjs/testing": "^10.2.4",
    "@randondigital/commitlint-config": "^3.1.5",
    "@types/chai": "^4.3.5",
    "@types/common-tags": "^1.8.1",
    "@types/dompurify": "3.0.2",
    "@types/express": "^4.17.17",
    "@types/jest": "^29.5.4",
    "@types/node": "^20.5.7",
    "@types/sinon": "^10.0.16",
    "@types/supertest": "^2.0.12",
    "@types/validator": "^13.11.1",
    "@typescript-eslint/eslint-plugin": "^6.5.0",
    "@typescript-eslint/parser": "^6.5.0",
    "chai": "^4.3.8",
    "eslint": "^8.48.0",
    "eslint-config-airbnb-base": "^15.0.0",
    "eslint-config-airbnb-typescript": "^17.1.0",
    "eslint-config-prettier": "^9.0.0",
    "eslint-plugin-import": "^2.28.1",
    "eslint-plugin-prettier": "^5.0.0",
    "husky": "^8.0.3",
    "jest": "^29.6.4",
    "jest-junit": "^16.0.0",
    "prettier": "^3.0.3",
    "reflect-metadata": "^0.1.13",
    "sinon": "^15.2.0",
    "supertest": "^6.3.3",
    "ts-jest": "^29.1.1",
    "ts-loader": "^9.4.4",
    "ts-node": "^10.9.1",
    "tsconfig-paths": "^4.2.0",
    "typescript": "^5.2.2"
  }
}

Relevant log output

Fields unrelated to OpenTelemetry were redacted.

Zipkin request payload: [{"traceId":"a8076b1fa82e93a05db85c2048baf28b","parentId":"cd23faf2561ecdac","name":"select table","id":"0a6327fd0845b118","timestamp":1695822117622000,"duration":1506,"localEndpoint":{"serviceName":"project"},"tags":{"knex.version":"2.5.1","db.system":"system","db.sql.table":"table","db.operation":"select","db.user":"user","db.name":"db","net.peer.name":"db-host","net.peer.port":"5432","db.statement":"fake query","otel.status_code":"ERROR","error":"error message here","service.name":"project","telemetry.sdk.language":"nodejs","telemetry.sdk.name":"opentelemetry","telemetry.sdk.version":"1.16.0"},"annotations":[{"timestamp":1695822117623492.5,"value":"exception"}]},{"traceId":"a8076b1fa82e93a05db85c2048baf28b","parentId":"5db85c2048baf28b","name":"GET","id":"cd23faf2561ecdac","kind":"SERVER","timestamp":1695822117620000,"duration":4109,"localEndpoint":{"serviceName":"project"},"tags":{"http.url":"http://some-url/some-route","http.host":"host","net.host.name":"host","http.method":"GET","http.scheme":"http","http.client_ip":"some-ip","http.target":"target","http.user_agent":"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0","http.flavor":"1.1","net.transport":"ip_tcp","net.host.ip":"some-ip","net.host.port":"3000","net.peer.ip":"127.0.0.6","net.peer.port":"57029","http.status_code":"500","http.status_text":"INTERNAL SERVER ERROR","otel.status_code":"ERROR","service.name":"project","telemetry.sdk.language":"nodejs","telemetry.sdk.name":"opentelemetry","telemetry.sdk.version":"1.16.0"}}]
@opentelemetry/instrumentation-http outgoingRequest on response()
@opentelemetry/instrumentation-http outgoingRequest on end()
Zipkin response status code: 400, body: json: cannot unmarshal number 1695822117623492.5 into Go struct field .annotations of type uint64

{"stack":"Error: Got unexpected status code from zipkin: 400\n    at IncomingMessage.<anonymous> (/home/nonroot/node_modules/@opentelemetry/exporter-zipkin/build/src/platform/node/util.js:61:32)\n    at /home/nonroot/node_modules/@opentelemetry/context-async-hooks/build/src/AbstractAsyncHooksContextManager.js:50:55\n    at AsyncLocalStorage.run (node:async_hooks:327:14)\n    at AsyncLocalStorageContextManager.with (/home/nonroot/node_modules/@opentelemetry/context-async-hooks/build/src/AsyncLocalStorageContextManager.js:33:40)\n    at IncomingMessage.contextWrapper (/home/nonroot/node_modules/@opentelemetry/context-async-hooks/build/src/AbstractAsyncHooksContextManager.js:50:32)\n    at IncomingMessage.emit (node:events:526:35)\n    at endReadableNT (node:internal/streams/readable:1359:12)\n    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)","message":"Got unexpected status code from zipkin: 400","name":"Error"}
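
For anyone trying to confirm the same problem from a logged payload, a small check like the one below can point at the offending field. This is only a sketch: `findNonIntegerTimes` and the trimmed-down `ZipkinSpanLike` shape are hypothetical helpers written for this issue, not part of any OpenTelemetry or Zipkin package, and the sample payload is a reduced copy of the request logged above.

```typescript
// Trimmed-down shapes covering only the time fields of the logged payload.
interface ZipkinAnnotationLike {
  timestamp: number;
  value: string;
}

interface ZipkinSpanLike {
  id: string;
  timestamp?: number;
  duration?: number;
  annotations?: ZipkinAnnotationLike[];
}

// True when the number has no fractional part.
function isWholeNumber(n: number): boolean {
  return isFinite(n) && Math.floor(n) === n;
}

// Hypothetical helper: describe every time field that is not an integer.
function findNonIntegerTimes(spans: ZipkinSpanLike[]): string[] {
  const problems: string[] = [];
  spans.forEach((span) => {
    if (span.timestamp !== undefined && !isWholeNumber(span.timestamp)) {
      problems.push(`span ${span.id}: timestamp=${span.timestamp}`);
    }
    if (span.duration !== undefined && !isWholeNumber(span.duration)) {
      problems.push(`span ${span.id}: duration=${span.duration}`);
    }
    (span.annotations || []).forEach((annotation, index) => {
      if (!isWholeNumber(annotation.timestamp)) {
        problems.push(
          `span ${span.id}: annotations[${index}].timestamp=${annotation.timestamp}`,
        );
      }
    });
  });
  return problems;
}

// Reduced copy of the two spans from the logged request (tags omitted).
const payload: ZipkinSpanLike[] = JSON.parse(
  '[{"id":"0a6327fd0845b118","timestamp":1695822117622000,"duration":1506,' +
    '"annotations":[{"timestamp":1695822117623492.5,"value":"exception"}]},' +
    '{"id":"cd23faf2561ecdac","timestamp":1695822117620000,"duration":4109}]',
);

const problems = findNonIntegerTimes(payload);
console.log(problems);
```

In our payload only the annotation timestamp is flagged; the span-level timestamp and duration are already integers, which matches the partial fix mentioned under Additional Details.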

dyladan commented 1 year ago

We can open a Pull Request containing the fix but before doing so we would like to confirm if this is actually a bug or if we are missing something.

I don't think you're missing anything. A PR would be appreciated.