GoogleCloudPlatform / opentelemetry-operations-python

OpenTelemetry Python exporters for Google Cloud Monitoring and Trace
https://google-cloud-opentelemetry.readthedocs.io/en/stable/
Apache License 2.0

Cloud Trace exporter adds seconds of latency to HTTP requests #75

Closed askmeegs closed 3 years ago

askmeegs commented 3 years ago

Hello, I work on the Bank of Anthos sample application. In July, we added the OpenTelemetry exporter and trace propagator to 3 Python services (all using Flask/Gunicorn, all running inside containers on GKE).

Our investigation suggests that the OpenTelemetry exporter is adding multiple seconds of latency per request (findings here). We upgraded OpenTelemetry to the latest version (0.13b0), but we're still seeing the latency in the app.

My hypothesis is that either some step in the exporter is slow and runs on every request, or something in our GKE environment or Trace API authentication is causing the latency.

Wondering if you've seen anything similar, or are doing any performance testing on these libraries. Thank you!

aabmass commented 3 years ago

@askmeegs from the PR, it looks like you are using SimpleExportSpanProcessor? This processor really slows things down because it exports each span synchronously as soon as the span ends, without any batching, so the request waits on every export call.

Try swapping in BatchExportSpanProcessor instead and that should resolve your latency issues:

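from opentelemetry import trace
from opentelemetry.sdk.trace.export import BatchExportSpanProcessor

trace.get_tracer_provider().add_span_processor(
    BatchExportSpanProcessor(exporter)
)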
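
To illustrate why the swap helps, here is a toy model of the two strategies using only the standard library. It is a sketch, not the OpenTelemetry implementation: `slow_export`, its 50 ms delay, and the `SimpleProcessor` and `BatchProcessor` classes are all hypothetical stand-ins. The point is only that a synchronous processor pays the export cost on the request path, while a batching one hands spans to a background thread.

```python
import queue
import threading
import time


def slow_export(spans):
    """Hypothetical stand-in for a network call to a tracing backend."""
    time.sleep(0.05)  # assumed 50 ms per export call, for illustration only


class SimpleProcessor:
    """Toy model: exports every span synchronously on the caller's thread."""

    def __init__(self, export):
        self._export = export

    def on_end(self, span):
        self._export([span])  # the request blocks here until export returns

    def shutdown(self):
        pass


class BatchProcessor:
    """Toy model: queues spans and exports them from a background thread."""

    def __init__(self, export, max_batch=512, delay=0.1):
        self._export = export
        self._max_batch = max_batch
        self._delay = delay
        self._queue = queue.Queue()
        self._done = threading.Event()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def on_end(self, span):
        self._queue.put(span)  # cheap: no export call on the request path

    def _drain(self):
        """Export up to max_batch queued spans; return how many were sent."""
        batch = []
        while len(batch) < self._max_batch:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        if batch:
            self._export(batch)
        return len(batch)

    def _run(self):
        while not self._done.is_set():
            self._done.wait(self._delay)
            self._drain()

    def shutdown(self):
        """Stop the worker, then flush anything still queued."""
        self._done.set()
        self._worker.join()
        while self._drain():
            pass


def time_requests(processor, n=10):
    """Return the seconds spent ending n spans, as a request handler would."""
    start = time.perf_counter()
    for i in range(n):
        processor.on_end(f"span-{i}")
    elapsed = time.perf_counter() - start
    processor.shutdown()
    return elapsed


simple_seconds = time_requests(SimpleProcessor(slow_export))
batch_seconds = time_requests(BatchProcessor(slow_export))
print(f"simple: {simple_seconds:.3f}s  batch: {batch_seconds:.3f}s")
```

In this model the simple processor's time grows with the number of spans times the export delay, while the batch processor's request-path time stays close to the cost of a queue insert. The trade-off is that queued spans are only delivered if the process flushes them before exiting, which is why the sketch drains the queue in `shutdown`.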

askmeegs commented 3 years ago

Thank you for this suggestion, Aaron! I switched the 3 Python services to BatchExportSpanProcessor, and that seems to have fixed the latency issue. Closing this issue. Thanks!