Closed xihw closed 6 years ago
a. ClientSamplingConfiguration says probabilistic 0.1, and CollectorSamplingConfiguration says probabilistic 0.2
b. ClientSamplingConfiguration says remote, and CollectorSamplingConfiguration says probabilistic 0.2
2) Any span that is reported by the service will be persisted, i.e. the decision is made once. In your example, the ClientSamplingConfiguration will be used instead of the CollectorSamplingConfiguration so the sampling probability will be 0.1. If you instead were to use sampler.type=remote in the ClientSamplingConfiguration, then the client will use the CollectorSamplingConfiguration of 0.2. (The client MUST be configured with sampler.type=remote in order for it to receive sampling rates from the collector, or else it will use the sampling rate provided by the service owner.)
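To make cases (a) and (b) concrete, here is a minimal Python sketch of the rule described above. It is not Jaeger client code: `SamplerConfig` and `effective_probability` are hypothetical names, and the sketch assumes the collector is reachable and serves a probabilistic strategy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SamplerConfig:
    """Illustrative stand-in for a sampling configuration (hypothetical type)."""
    type: str                      # e.g. "probabilistic" or "remote"
    param: Optional[float] = None  # sampling probability, if any


def effective_probability(client: SamplerConfig, collector: SamplerConfig) -> Optional[float]:
    """Return the probability the service ends up sampling with."""
    if client.type == "remote":
        # The client defers to whatever the collector serves.
        return collector.param
    # A locally fixed sampler ignores the collector's configuration.
    return client.param


collector = SamplerConfig("probabilistic", 0.2)
case_a = effective_probability(SamplerConfig("probabilistic", 0.1), collector)
case_b = effective_probability(SamplerConfig("remote"), collector)
print(case_a, case_b)
```

Under these assumptions case (a) resolves to the client's 0.1 and case (b) to the collector's 0.2.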
Any span that is reported by the service will be persisted
You mean persisted into storage?
In your example, the ClientSamplingConfiguration will be used instead of the CollectorSamplingConfiguration so the sampling probability will be 0.1
Sorry, I'm still confused: when is the 0.1 used? service -> agent, agent -> collector, or collector -> storage? Or all of them (if all of them, then finally 0.1 * 0.1 * 0.1 will be stored in the DB, right)?
@black-adder
The sampling rate is only used at the service; 0.1 of traces will be stored in the DB.
Ok! So the sampling only happens in the service before sending spans out. One more question:
configuration (and soon adaptive calculations) come from the collectors, but clients receive them via agent
What is the flow? From the service's standpoint, is it pull or push? And when does it happen: once when the service is up, or periodically?
Services pull from the agent every minute; this is configurable: https://github.com/jaegertracing/jaeger-client-go/blob/master/config/config.go#L86
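The pull model can be sketched roughly as follows. This is an illustration in Python, not the Go client's implementation: `StrategyPoller`, `sampling_url` and `fetch_strategy` are hypothetical names, the one-minute default mirrors the interval mentioned above, and the `/sampling?service=...` path on port 5778 is an assumption about the agent's HTTP endpoint.

```python
import json
import threading
import urllib.parse
import urllib.request


def sampling_url(service, host="localhost", port=5778):
    """Build the agent URL a client would poll (path is an assumption)."""
    query = urllib.parse.urlencode({"service": service})
    return f"http://{host}:{port}/sampling?{query}"


def fetch_strategy(service):
    """Pull the current strategy for `service` from the local agent."""
    with urllib.request.urlopen(sampling_url(service), timeout=2) as resp:
        return json.load(resp)


class StrategyPoller:
    """Hypothetical helper: pulls the strategy on a fixed interval."""

    def __init__(self, service, on_update, fetch=fetch_strategy, interval_s=60.0):
        self._service = service
        self._on_update = on_update
        self._fetch = fetch
        self._interval_s = interval_s
        self._stop = threading.Event()

    def poll_once(self):
        """Pull once; return True if a strategy was received."""
        try:
            strategy = self._fetch(self._service)
        except (OSError, ValueError):
            # Agent unreachable or bad payload: keep the previous strategy.
            return False
        self._on_update(strategy)
        return True

    def run(self):
        """Poll until stop() is called, waiting interval_s between pulls."""
        while not self._stop.is_set():
            self.poll_once()
            self._stop.wait(self._interval_s)

    def stop(self):
        self._stop.set()
```

Running `StrategyPoller("my-service", print).run()` in a background thread would mimic the periodic pull; the `fetch` argument is injectable so the loop can be exercised without a running agent.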
We haven't done this yet but I've always wanted to do push. It's on my personal road map.
Can you also help me understand another sampling propagation question: will a service generate a span for an incoming request before deciding whether or not to sample it? Will an unsampled span be propagated between services?
Sampling and generation of a span happen at roughly the same time. Context is always propagated between services (even if unsampled).
If service B receives a request with context saying something like {"span_a", "unsampled"}, B will still create a span as a child of "span_a" and keep propagating the context, but won't report it. Is that correct?
yes
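A small Python sketch of that behaviour, under the assumption that the sampled flag is simply inherited from the incoming context. `SpanContext`, `start_child` and `Reporter` are hypothetical names for illustration, not Jaeger APIs.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanContext:
    """Illustrative propagated context (hypothetical type)."""
    trace_id: int
    span_id: int
    parent_id: int
    sampled: bool


def start_child(parent: SpanContext) -> SpanContext:
    """Create a child span context that inherits the parent's decision."""
    return SpanContext(
        trace_id=parent.trace_id,
        span_id=random.getrandbits(64),
        parent_id=parent.span_id,
        sampled=parent.sampled,  # the decision was made once, upstream
    )


class Reporter:
    """Collects finished spans, dropping the unsampled ones."""

    def __init__(self):
        self.reported = []

    def finish(self, ctx: SpanContext) -> None:
        if ctx.sampled:
            self.reported.append(ctx)


# Service B receives an unsampled context from service A.
span_a = SpanContext(trace_id=1, span_id=10, parent_id=0, sampled=False)
span_b = start_child(span_a)   # still created, still propagated downstream
reporter = Reporter()
reporter.finish(span_b)        # but never reported
```

The child keeps the same trace id and unsampled flag, so downstream services see a consistent context even though nothing is sent to the backend.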
Ok, so does that mean it's possible to trace every request by putting its span info into the log, even though we do sampling? If so, do you have any resource showing how to do that?
I'm not sure I understand the question. Are you asking if span logs are always persisted even if we do sampling?
I'm asking whether it is possible to use some logging framework like MDC (http://www.baeldung.com/mdc-in-log4j-2-logback) to log the trace id for every single request even if we do sampling.
Yes, you can log the trace id for every request, but since you're sampling, some logs will have trace ids without a persisted trace.
This is a golang example: https://github.com/jaegertracing/jaeger/blob/master/examples/hotrod/pkg/log/spanlogger.go
However, here we're doing more than just logging the trace id: we're dual logging to both the log reporter and into the span.
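For readers who just want the trace id on every log line, the idea can be sketched in Python with the standard `logging` and `contextvars` modules. This is a loose analogue of MDC, not the Java API and not the HotROD logger; `current_trace_id`, `TraceIdFilter`, `make_logger` and `handle_request` are hypothetical names.

```python
import contextvars
import io
import logging

# Per-request storage for the trace id, similar in spirit to MDC.
current_trace_id = contextvars.ContextVar("trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Attach the current trace id to every log record."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True


def make_logger(stream):
    """Build a logger whose format includes the trace id."""
    logger = logging.getLogger("request")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("trace_id=%(trace_id)s %(message)s"))
    handler.addFilter(TraceIdFilter())
    logger.handlers = [handler]
    return logger


def handle_request(logger, trace_id, sampled):
    """Log with the trace id whether or not the trace is sampled."""
    token = current_trace_id.set(trace_id)
    try:
        logger.info("handling request (sampled=%s)", sampled)
    finally:
        current_trace_id.reset(token)


buffer = io.StringIO()
log = make_logger(buffer)
handle_request(log, "abc123", sampled=False)
```

The log line for the unsampled request still carries `trace_id=abc123`, which is exactly the situation described above: a trace id in the logs with no persisted trace behind it.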
Closing the issue; feel free to reopen if you have more questions.
1. agent proxies the requests to the collector, so that the client does not need to know where collectors are located (agent is usually on the localhost)
The agent proxies the config request to the collector through which connection? The TChannel or gRPC, whichever one is connected? These docs don't really explain where the sampling config is sent through: https://www.jaegertracing.io/docs/1.14/getting-started/#all-in-one https://www.jaegertracing.io/docs/1.14/deployment/#collectors Also it would be a nice improvement to know which are encryptable or encrypted by default. This page does not detail which protocol between which components has encryption support: https://github.com/jaegertracing/jaeger/issues/458
Thank you.
The agent proxies the config request to the collector through which connection? The TChannel or gRPC, whichever one is connected?
whichever one you configure on the agent. We recommend gRPC.
Also it would be a nice improvement to know which are encryptable or encrypted by default,
See #1718
These docs don't really explain where the sampling config is sent through:
Can you elaborate what can be improved in the docs? If you're using remote sampler, then the sampling configuration is defined in the collectors, and is pulled by the clients periodically: client<-agent<-collector
Thanks, now it's all coming together through the information from the different referenced GitHub issues.
Can you elaborate what can be improved in the docs? If you're using remote sampler, then the sampling configuration is defined in the collectors, and is pulled by the clients periodically: client<-agent<-collector
For improvements to the doc, here are ideas:
Thanks.
Does 'remote' sampling work with http-sender? In my AKS cluster setup, I haven't configured 'jaeger-agent'.
Sampler has nothing to do with Sender, it's an independent component. It can work with both the agent and the collector.
Thanks Yuri for the quick response. I really appreciate your help on this.
I'm using the Jaeger K8s operator and have the following sampling strategy in the configmap:
apiVersion: v1
data:
  sampling: '{"default_strategy":{"operation_strategies":[{"operation":"/health","param":0,"type":"probabilistic"},{"operation":"/metrics","param":0,"type":"probabilistic"}],"param":0.1,"type":"probabilistic"}}'
kind: ConfigMap
metadata:
  creationTimestamp: "2020-06-15T23:42:14Z"
  labels:
    app: jaeger
    app.kubernetes.io/component: sampling-configuration
    app.kubernetes.io/instance: jaeger
    app.kubernetes.io/managed-by: jaeger-operator
    app.kubernetes.io/name: jaeger-sampling-configuration
    app.kubernetes.io/part-of: jaeger
  name: jaeger-sampling-configuration
  namespace: monitoring
Note: we're using the monitoring namespace instead of observability.
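To show what that strategy JSON expresses, here is a hedged Python sketch that reads the same document and looks up the probability for a given operation. `probability_for` is a hypothetical helper, and the lookup rule (an exact match on the operation name, otherwise the default `param`) is an assumption about how the strategy is meant to be interpreted, not the collector's actual code.

```python
import json

# The "sampling" value from the ConfigMap above, verbatim.
STRATEGIES = json.loads(
    '{"default_strategy":{"operation_strategies":['
    '{"operation":"/health","param":0,"type":"probabilistic"},'
    '{"operation":"/metrics","param":0,"type":"probabilistic"}],'
    '"param":0.1,"type":"probabilistic"}}'
)


def probability_for(strategies, operation):
    """Return the sampling probability that applies to `operation`."""
    default = strategies["default_strategy"]
    for op in default.get("operation_strategies", []):
        if op["operation"] == operation:
            # A per-operation entry overrides the default probability.
            return op["param"]
    return default["param"]


for name in ("/health", "/metrics", "/orders"):
    print(name, probability_for(STRATEGIES, name))
```

Read this way, `/health` and `/metrics` are never sampled and every other operation (the `/orders` name is only an example) is sampled at 0.1.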
The client application has the following properties:
sampler.type=const
sampler.sampling-rate=1
Since these properties are defined in the application's properties file, I'm overriding them using k8s environment variables. I have set sampler.type to remote. As I don't know what value should be given to sampling-rate when sampler.type is set to remote, I set it to 1.
With this, when I created the pod, every sample is being collected. I'm not sure why it is not honoring the remote configuration.
Am I missing anything?
The numeric value of 1 is treated as 100% default probability when the sampler cannot contact the backend. It's possible that in your deployment it cannot reach the backend and never gets the 0.1 probability. The sampler emits metrics about unsuccessful configuration pulls.
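The fallback behaviour described here can be sketched as follows. This is a simplified Python illustration, not the client's code: `RemoteSampler` and its `failed_pulls` counter are hypothetical, with the counter standing in for the metrics the real sampler emits.

```python
class RemoteSampler:
    """Hypothetical sketch: keep the initial probability until a pull succeeds."""

    def __init__(self, initial_param, fetch):
        self.probability = initial_param  # used until the backend answers
        self.failed_pulls = 0             # stand-in for the sampler's metrics
        self._fetch = fetch

    def refresh(self):
        """Try to pull the strategy; on failure keep the current probability."""
        try:
            strategy = self._fetch()
        except OSError:
            self.failed_pulls += 1
            return
        self.probability = strategy["param"]


def unreachable_backend():
    """Simulates a deployment where the backend cannot be reached."""
    raise ConnectionRefusedError("backend unreachable")


sampler = RemoteSampler(initial_param=1.0, fetch=unreachable_backend)
sampler.refresh()
print(sampler.probability, sampler.failed_pulls)
```

With an unreachable backend the sampler stays at the initial value of 1 (every trace sampled) and only the failure counter moves, which matches the symptom reported above.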
1. agent proxies the requests to the collector, so that the client does not need to know where collectors are located (agent is usually on the localhost)
2. yes, configuration (and soon adaptive calculations) come from the collectors, but clients receive them via agent
3. remotely controlled samplers are only supported by Jaeger clients, not Zipkin clients.
4. Not sure which "batch" you are referring to.
Dear Yuri, I have a question: if remotely controlled samplers are only supported via the agent, and the agent pulls config via gRPC + protobuf, then what is sampling.thrift for?
Previously agent was using Thrift to retrieve sampling from collector. Now it uses protobuf, but the clients consume sampling as JSON, and that JSON is still generated from Thrift.
Previously agent was using Thrift to retrieve sampling from collector. Now it uses protobuf, but the clients consume sampling as JSON, and that JSON is still generated from Thrift.
You mean the sampling strategies were previously sent from the collector to agents via Thrift, but are now sent via protobuf + gRPC? I know the client gets the sampling JSON over HTTP on port 5778, so what I care about is how the collector sends them to the agent.
collector to agent is grpc
collector to agent is grpc
Thanks, yuri.
The numeric value of 1 is treated as 100% default probability when the sampler cannot contact the backend. It's possible that in your deployment it cannot reach the backend and never gets the 0.1 probability. The sampler emits metrics about unsuccessful configuration pulls.
I'm not sure this is completely correct, or there is a bug in this code path. I'm setting the sampler type to remote and leaving the param unset, yet the param value is being set to 1 by default even when the remote actually has a param of 0.5. Seems like a bug to me.
" Remote (sampler.type=remote, which is also the default) sampler consults Jaeger agent for the appropriate sampling strategy to use in the current service. This allows controlling the sampling strategies in the services from a central configuration in Jaeger backend, or even dynamically (see Adaptive Sampling). "
This is excerpted from the Jaeger docs and it looks pretty confusing to me. Can you help me understand it with the following questions?
"consults Jaeger agent for the appropriate sampling strategy": as far as I know, there are two places to configure the sampling rate: jaeger-client and jaeger-collector. What role does the Jaeger agent play here?
"This allows controlling the sampling strategies in the services from a central configuration in Jaeger backend": does "a central configuration in Jaeger backend" mean jaeger-collector?
What if we use zipkin-client + Jaeger backend (jaeger-collector + jaeger-ui + storage)? In this case we don't have jaeger-agent running, so how does the "remote consulting" work?
A follow-up question on 3: without jaeger-agent, how is batch handled? According to Zipkin's doc (https://zipkin.io/pages/architecture.html), I interpret "Transport" as "jaeger-agent". In the zipkin-client + Jaeger backend scenario, do we discard "Transport"?