pragmaticivan / nestjs-otel

OpenTelemetry (Tracing + Metrics) module for Nest framework (node.js) 🔭
Apache License 2.0

apiMetrics cannot be used with OpenTelemetry Prometheus exporter when getNodeAutoInstrumentations is used #464

Open mjarosie opened 9 months ago

mjarosie commented 9 months ago

When the module is configured to expose API metrics:

OpenTelemetryModule.forRoot({
  metrics: {
    apiMetrics: {
      enable: true,
    },
  },
}),

and when getNodeAutoInstrumentations is used:

export const otelSDK: NodeSDK = new NodeSDK({
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: `${OTEL_OTLP_HTTP_ENDPOINT}/v1/metrics`,
    }),
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});

the OpenTelemetry Collector's Prometheus exporter reports the following error (broken into multiple lines for readability):

error gathering metrics: collected metric http_server_duration_milliseconds 
label:{name:"http_flavor"  value:"1.1"}  
label:{name:"http_method"  value:"GET"}  
label:{name:"http_route"  value:"/v1/products"}  
label:{name:"http_scheme"  value:"http"}  
label:{name:"http_status_code"  value:"200"}  
label:{name:"job"  value:"my-service"}  
label:{name:"net_host_name"  value:"localhost"}  
label:{name:"net_host_port"  value:"5003"} 
histogram:{sample_count:49  sample_sum:898.520958  
  bucket:{cumulative_count:0  upper_bound:0}  
  bucket:{cumulative_count:0  upper_bound:5}  
  bucket:{cumulative_count:0  upper_bound:10}  
  bucket:{cumulative_count:46  upper_bound:25}  
  bucket:{cumulative_count:47  upper_bound:50}  
  bucket:{cumulative_count:48  upper_bound:75}  
  bucket:{cumulative_count:49  upper_bound:100}  
  bucket:{cumulative_count:49  upper_bound:250}  
  bucket:{cumulative_count:49  upper_bound:500}  
  bucket:{cumulative_count:49  upper_bound:750}  
  bucket:{cumulative_count:49  upper_bound:1000}  
  bucket:{cumulative_count:49  upper_bound:2500}  
  bucket:{cumulative_count:49  upper_bound:5000}  
  bucket:{cumulative_count:49  upper_bound:7500}  
  bucket:{cumulative_count:49  upper_bound:10000}
} has help "Measures the duration of inbound HTTP requests." but should have "The duration of the inbound HTTP request"

This is caused by the difference between descriptions exposed by opentelemetry-instrumentation-http ('Measures the duration of inbound HTTP requests.') and nestjs-otel ('The duration of the inbound HTTP request').
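To illustrate the mismatch, here is a minimal sketch of a Prometheus-style consistency check. It is a simplified model and not the actual Collector or Prometheus client code: the `CollectedMetric` and `HelpConflict` types, the `source` field, and the `findHelpConflicts` function are hypothetical names introduced for this example, and the assumption that the first help text seen for a name becomes the expected one is inferred from the error message above.

```typescript
// Simplified model of a Prometheus-style help-text consistency check.
// All names here are hypothetical; this is not the Collector's real code.
interface CollectedMetric {
  name: string;   // metric name as it appears in the Prometheus output
  help: string;   // description supplied by whoever created the instrument
  source: string; // illustrative label for where the metric came from
}

interface HelpConflict {
  name: string;
  expected: string;
  actual: string;
  source: string;
}

// Assumption: the first help text seen for a name becomes the expected one,
// and any later metric with the same name but different help is a conflict.
function findHelpConflicts(metrics: CollectedMetric[]): HelpConflict[] {
  const firstHelp = new Map<string, string>();
  const conflicts: HelpConflict[] = [];
  for (const m of metrics) {
    const expected = firstHelp.get(m.name);
    if (expected === undefined) {
      firstHelp.set(m.name, m.help);
    } else if (expected !== m.help) {
      conflicts.push({ name: m.name, expected, actual: m.help, source: m.source });
    }
  }
  return conflicts;
}

// The two descriptions quoted in this issue, registered under the same name.
const collected: CollectedMetric[] = [
  {
    name: "http_server_duration_milliseconds",
    help: "The duration of the inbound HTTP request",
    source: "nestjs-otel",
  },
  {
    name: "http_server_duration_milliseconds",
    help: "Measures the duration of inbound HTTP requests.",
    source: "opentelemetry-instrumentation-http",
  },
];

for (const c of findHelpConflicts(collected)) {
  console.log(`${c.name} has help "${c.actual}" but should have "${c.expected}"`);
}
```

In this model, two instruments that share a name only coexist if their descriptions are identical, which is why both libraries recording the same HTTP server duration metric triggers the error.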

Should the description defined in nestjs-otel match some sort of standard convention?

mjarosie commented 9 months ago

After further digging, it turns out there's already a bug reported in the OpenTelemetry repository.

I've also found the document describing the Semantic Conventions for HTTP metrics, which I believe is what this ticket is about. Here's the migration plan for already existing traces and metrics.

pragmaticivan commented 6 months ago

Yeah, I've been waiting for the latest version of that semantic convention to be implemented for a while (roughly since I started that lib). It finally seems to be progressing.