temporalio / sdk-python

Temporal Python SDK
MIT License

[Feature Request] OpenTelemetry Metrics OTLP/HTTP support #647

Open. slingshotvfx opened this issue 2 months ago

slingshotvfx commented 2 months ago

Is your feature request related to a problem? Please describe.

Right now the Python SDK (and possibly other Core-based SDKs?) only supports OTLP over gRPC, but some services (like Pydantic Logfire) require OTLP over HTTP.

When trying to export metrics to these services today, I receive the following error:

```
OpenTelemetry metrics error occurred. Metrics exporter otlp failed with the grpc server returns error (Internal error): , detailed error message: protocol error: received message with invalid compression flag: 60 (valid flags are 0 and 1) while receiving response with status: 403 Forbidden
```

Describe the solution you'd like

Support for exporting OTel metrics to services via OTLP/HTTP.

Additional context

This would bring the Python SDK up to parity with the Go SDK, and possibly others, which do support OTLP over HTTP.

cretz commented 2 months ago

It is unfortunate that such a platform doesn't support such a common OTLP approach. This is implemented in our Rust Core layer, so I have opened an issue over there to support HTTP: https://github.com/temporalio/sdk-core/issues/820. Once it is updated there we will apply it here (it may just be an `http=True`-type option on the OTel config).

samuelcolvin commented 2 months ago

Sorry temporal, we'll try to get grpc added as soon as possible.

gregbrowndev commented 1 month ago

@slingshotvfx as a workaround, I would recommend deploying an OTel Collector, which can receive telemetry over gRPC or HTTP and export it to Logfire via OTLP/HTTP. This is arguably a better solution in the long run, as you ideally don't want to send telemetry directly from your app to the backend via OTLP. A collector lets you add batching, retries/fault tolerance, and transformations to customise the metric labels.

Here's a quick example in Docker Compose (not fully tested) using Grafana Alloy (I prefer it over the standard OTel Collector for its UI and live debugging):


_docker-compose.yaml_:

```yaml
services:
  alloy:
    image: grafana/alloy:v1.3.1
    environment:
      - OTLP_EXPORTER_ENDPOINT=https://log-fire.com/whatever
    ports:
      - "4317:4317"   # OTLP gRPC receiver
      - "4318:4318"   # OTLP HTTP receiver
      - "12345:12345" # Debug UI
    volumes:
      - alloy_gateway:/var/lib/alloy/data
      - type: bind
        source: ./config.alloy
        target: /etc/alloy/config.alloy
    command:
      - "run"
      - "--server.http.listen-addr=0.0.0.0:12345"
      - "--storage.path=/var/lib/alloy/data"
      - "--stability.level=experimental" # enables live debug mode
      - "/etc/alloy/config.alloy"

volumes:
  alloy_gateway:
```

_config.alloy_:

```alloy
livedebugging {
  // Enable live debugging for the collector
  // See http://localhost:12345
  enabled = true
}

otelcol.receiver.otlp "default" {
  // Set up the OTLP receiver to accept metrics, traces, and logs
  // from instrumented services
  // https://grafana.com/docs/alloy/latest/reference/components/otelcol/otelcol.receiver.otlp/
  grpc {
    endpoint = "0.0.0.0:4317"
  }
  http {
    endpoint = "0.0.0.0:4318"
  }
  output {
    metrics = [otelcol.processor.batch.metrics.input]
    // traces = []
    // logs = []
  }
}

otelcol.processor.batch "metrics" {
  // Adds batching to metrics
  // https://grafana.com/docs/alloy/latest/reference/components/otelcol/otelcol.processor.batch/
  timeout = "200ms"
  output {
    metrics = [otelcol.exporter.otlphttp.logfire.input]
  }
}

// Can add additional components as needed

otelcol.exporter.otlphttp "logfire" {
  // Exports telemetry to another OTLP receiver via HTTP
  // https://grafana.com/docs/alloy/latest/reference/components/otelcol/otelcol.exporter.otlphttp/
  client {
    endpoint = env("OTLP_EXPORTER_ENDPOINT")
    // You'll likely need to configure auth / TLS to push to your backend
  }
}
```

Then you just need to configure your Python app to send telemetry to `http://alloy:4317` over gRPC.