I've been rolling out an OpenTelemetry-based observability solution for my Temporal app. I chose OTel partly because the Temporal Python SDK already uses it for its own traces and metrics, so I want to use the same OTel SDKs for custom metrics and logging.
Everything works great in async activities. I've been able to use OTel tooling for logging and tracing (using the TracingInterceptor as seen in this example), and I can see traces with their correlated logs for each activity in my backend (Grafana, Loki, Tempo). The Temporal SDK also provides activity.metric_meter(), which I've used to add custom metrics to async activities.
However, I'm having several issues with sync activities running on process-pool-based workers (I'm happy to split them into separate issues):
The trace_id and span_id injected into the logs are incorrect for all except the first activity that runs on that worker. It seems that the first activity's IDs are injected into all activities that follow it.
The Temporal SDK's activity.metric_meter() doesn't work in process pools, which is a known issue and probably a related one. I planned to set up a separate OTel MeterProvider to support custom metrics in my sync activities. However, this looks to have the same limitations (I haven't tried everything exhaustively yet).
Note: I suspect initialising the MeterProvider for each process/activity will be much simpler than the logging setup, because it isn't attached to a global root logger.
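As a rough sketch of the per-process initialisation I have in mind (not verified against the real SDK): the pool's initializer builds one provider per child process, and each sync activity looks it up locally. StubMeterProvider, init_process_metrics, record_in_activity and make_activity_executor are hypothetical names for illustration only; the stub stands in for an OTel MeterProvider so the snippet runs without dependencies.

```python
import os
from concurrent.futures import ProcessPoolExecutor
from typing import Dict, Optional


class StubMeterProvider:
    """Hypothetical stand-in for an OTel MeterProvider (assumption, not the real class)."""

    def __init__(self, owner_pid: int) -> None:
        self.owner_pid = owner_pid
        self.counters: Dict[str, int] = {}

    def add(self, name: str, amount: int = 1) -> None:
        self.counters[name] = self.counters.get(name, 0) + amount


# One provider per process; it is never shared across the process boundary.
_provider: Optional[StubMeterProvider] = None


def init_process_metrics() -> StubMeterProvider:
    """Create this process's provider, or rebuild it if it belongs to another process."""
    global _provider
    pid = os.getpid()
    if _provider is None or _provider.owner_pid != pid:
        # Assumption: with the real SDK, this is where a MeterProvider with its
        # own reader/exporter would be constructed and registered for the process.
        _provider = StubMeterProvider(owner_pid=pid)
    return _provider


def record_in_activity(name: str) -> int:
    """What a sync activity body would call to bump a custom counter."""
    provider = init_process_metrics()
    provider.add(name)
    return provider.counters[name]


def make_activity_executor(max_workers: int = 4) -> ProcessPoolExecutor:
    """Build a pool whose child processes each run init_process_metrics at startup."""
    return ProcessPoolExecutor(max_workers=max_workers, initializer=init_process_metrics)
```

The executor returned by make_activity_executor would then be handed to the worker as its activity executor. The PID check is there so that a provider inherited from a forked parent is replaced rather than reused.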
While these issues most likely originate in the OTel SDKs themselves, the same limitations are known to apply to the TracerProvider (ref), yet Temporal managed to get tracing to work.
Please provide guidance on setting up OTel logging and custom metrics in process-pool-based workers, or support them natively as you already do for OTel tracing.
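To make the logging half of this concrete, here is a small sketch of the behaviour I'd expect: the IDs are looked up when each record is emitted, so a second activity in a reused process gets its own IDs rather than the first activity's. This is illustration only. current_ids is a hypothetical ContextVar standing in for OTel's current span context, the other names are made up too, and I haven't confirmed that a stale lookup is actually what goes wrong in the SDK.

```python
import contextvars
import logging
from typing import List

# Hypothetical stand-in for OTel's current span context: (trace_id, span_id).
current_ids = contextvars.ContextVar("current_ids", default=(0, 0))


class TraceContextFilter(logging.Filter):
    """Attach whichever IDs are current at the moment the record is handled."""

    def filter(self, record: logging.LogRecord) -> bool:
        trace_id, span_id = current_ids.get()
        # Zero-padded hex: 32 characters for trace IDs, 16 for span IDs.
        record.trace_id = format(trace_id, "032x")
        record.span_id = format(span_id, "016x")
        return True


class ListHandler(logging.Handler):
    """Collects formatted lines so the result is easy to inspect."""

    def __init__(self) -> None:
        super().__init__()
        self.lines: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.lines.append(self.format(record))


def run_activity(name: str, trace_id: int, span_id: int, logger: logging.Logger) -> None:
    """Simulate one sync activity running with its own trace context."""
    token = current_ids.set((trace_id, span_id))
    try:
        logger.info("running %s", name)
    finally:
        current_ids.reset(token)


def demo() -> List[str]:
    logger = logging.getLogger("activity_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = ListHandler()
    handler.setFormatter(logging.Formatter("%(trace_id)s %(span_id)s %(message)s"))
    handler.addFilter(TraceContextFilter())
    logger.handlers = [handler]
    # Two activities share one logger, as they would in a reused pool process.
    run_activity("first", 0xA1, 0xB1, logger)
    run_activity("second", 0xA2, 0xB2, logger)
    return handler.lines
```

Calling demo() returns two lines, each carrying the IDs of the activity that logged it. In my process-pool workers the second line would instead repeat the first activity's IDs.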
Describe the solution you'd like
Support for the remaining OpenTelemetry SDKs (metrics and logging) natively in async, thread-pool, and process-pool workers.
Is your feature request related to a problem? Please describe.
Slack discussion