Open dashpole opened 7 months ago
cc @ridwanmsharif @damemi
You would need to configure sampling in the auto-instrumentation agent, rather than in the collector, since the sampling decision is made by the auto-instrumentation agent. See https://opentelemetry.io/docs/languages/java/automatic/configuration/ for the configuration, and https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk-extensions/autoconfigure/README.md#sampler for the sampler config for the agent.
Another suggestion, if the above doesn't fix it, is to remove the batch processor, as that can cause delay between the application and export.
Thanks for the quick response. We are using the Spring Boot OpenTelemetry starter, so we followed this guide: https://opentelemetry.io/docs/languages/java/automatic/spring-boot/
Below is my Gradle config.
implementation("io.opentelemetry.instrumentation:opentelemetry-spring-boot-starter")
implementation("io.opentelemetry.contrib:opentelemetry-samplers:1.35.0-alpha")
implementation("io.opentelemetry.contrib:opentelemetry-gcp-resources:1.35.0-alpha") {
    exclude group: 'com.fasterxml.jackson.core', module: 'jackson-core'
}
Below is my application property config where I am setting up the sampling.
management:
  tracing:
    sampling:
      probability: 1.0
  otlp:
    metrics:
      distribution:
        slo.test:
          timer: "10.0,100.0,500.0,1000.0"
        percentiles:
          test:
            timer: "0.9,0.99"
        percentiles-histogram:
          test:
            timer: true
      export:
        enabled: true
    tracing:
      export:
        timeout: 5s
        compression: gzip
And below is my OTEL config
otel:
  exporter:
    otlp:
      protocol: grpc
  instrumentation:
    spring-webflux:
      enabled: true
    spring-web:
      enabled: true
  propagators:
    - tracecontext
  resource:
    providers:
      gcp:
        enabled: true
    attributes:
      service.name: app-name-here
      development.environment: gcp
Any idea what I am missing here?
I have already tried removing batch from the exporter and that makes no change.
I'm not particularly familiar with the config formats you've shared, but I suspect you need to set the sampler. The OTel default is parent-based, always-on. Since Cloud Run creates a parent span and decides whether or not to sample it, you inherit the sampling decision that Cloud Run made, even if your sampling rate is set to 100%.
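To make the inheritance concrete, here is a minimal stdlib-only sketch of how a parent-based sampler reaches its decision. It is a model of the behavior described above, not the OpenTelemetry SDK API: the class name `ParentBasedSamplingSketch`, the method `shouldSample`, and its parameters are hypothetical and exist only for illustration.

```java
import java.util.Optional;

/** Stdlib-only model of a parent-based sampler; NOT the OpenTelemetry SDK API. */
final class ParentBasedSamplingSketch {

    /**
     * Decide whether a span is sampled.
     *
     * @param parentSampled   empty for a root span, otherwise the parent's sampled flag
     * @param rootProbability probability applied only to root spans (0.0 to 1.0)
     * @param randomValue     value in [0, 1) standing in for the per-trace random number
     */
    static boolean shouldSample(Optional<Boolean> parentSampled,
                                double rootProbability,
                                double randomValue) {
        if (parentSampled.isPresent()) {
            // A parent exists: inherit its decision. rootProbability is never consulted.
            return parentSampled.get();
        }
        // Root span: fall back to the configured probability.
        return randomValue < rootProbability;
    }

    public static void main(String[] args) {
        // Upstream created a parent span and chose not to sample it:
        // the local 100% probability has no effect.
        System.out.println(shouldSample(Optional.of(false), 1.0, 0.42)); // false

        // Upstream sampled the parent: the span is kept even at probability 0.0.
        System.out.println(shouldSample(Optional.of(true), 0.0, 0.42));  // true

        // No parent at all: the configured probability finally applies.
        System.out.println(shouldSample(Optional.empty(), 1.0, 0.42));   // true
    }
}
```

In the SDK autoconfigure properties, `otel.traces.sampler` selects the sampler and defaults to `parentbased_always_on`. A value such as `always_on` ignores the parent's decision, though spans recorded that way may reference a parent span that was never exported.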
Is there a way we can control this setting on the Cloud Run side? I hope you see my point: what if there is a trace that we are interested in for some reason, and it does not show up in Google Cloud Trace?
I don't believe it is configurable on Cloud Run today, although Cloud Run does mostly respect the traceparent header's sampling decision (up to a point, and then it will rate limit), so if requests to the Cloud Run service are already sampled, you should get nearly 100% sampling.
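One way to check whether incoming requests already carry a sampled decision is to look at the last field of the W3C `traceparent` header, whose lowest bit is the sampled flag. The sketch below is stdlib-only; the class name `TraceparentSketch` and the method `isSampled` are hypothetical, and the header values are the illustrative ones from the W3C Trace Context specification, not values observed in this issue.

```java
/** Stdlib-only helper for inspecting a W3C traceparent header; names are illustrative. */
final class TraceparentSketch {

    /** Returns true when the sampled bit (0x01) is set in the trace-flags field. */
    static boolean isSampled(String traceparent) {
        // Expected shape for version 00: version-traceid-parentid-flags, in hex.
        String[] parts = traceparent.trim().split("-");
        if (parts.length < 4 || parts[3].length() != 2) {
            throw new IllegalArgumentException("Malformed traceparent: " + traceparent);
        }
        // Non-hex flags raise NumberFormatException, a subclass of IllegalArgumentException.
        int flags = Integer.parseInt(parts[3], 16);
        return (flags & 0x01) != 0;
    }

    public static void main(String[] args) {
        // Flags 01: the caller sampled this trace.
        System.out.println(isSampled("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")); // true

        // Flags 00: the caller did not sample, so a parent-based sampler would drop the span.
        System.out.println(isSampled("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-00")); // false
    }
}
```

Logging this flag for a few test requests would show whether the missing traces correspond to requests that arrived with the sampled bit unset.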
Our team found the following link, which shows that Cloud Run does not support configuring its sample rate: https://cloud.google.com/run/docs/trace#trace_sampling_rate The same page states that requests per service are sampled at 10 requests per second. In my testing, I am sure that I am not crossing this limit.
Original issue: https://github.com/open-telemetry/community/discussions/2082 by @sdsani
Hi Guys, our team has been asked to deploy an application on Google Cloud Run as a service. Among many other tasks, one action item is setting up tracing and monitoring for the application. We decided to use OpenTelemetry. We have two services: one is a ReactJS app and the other is a Spring Boot app. Both use the OTel collector as a sidecar. Below are a few of the docs that we followed to configure our application.
https://cloud.google.com/run/docs/tutorials/custom-metrics-opentelemetry-sidecar https://opentelemetry.io/docs/languages/java/automatic/spring-boot/ And many other docs.
We got it working in the end. We can see traces showing up in Google Cloud Trace, metrics are arriving too, and so on. However, during testing we found that not all traces reach Google Cloud Trace. Sometimes every second trace gets in, and sometimes only every fourth. We have looked at many different settings and experimented with them, but none of them appears to make any difference. We need a more reliable configuration, because missing traces is not an option for production support.
Below is the image that we are using for the OTel collector sidecar (pulled from Docker Hub): otel/opentelemetry-collector-contrib:0.99.0
This image has been customized with a config file, which is attached (collector-config.yaml.txt). The YAML file for the Cloud Run service config is also attached (cloud-run-service-config.yaml.txt). Please advise.
In our Spring Boot config, our team is already setting the sampling rate to 1.0. We have also tried adding the following to the env for the collector sidecar, and this is not helping either. env: