crobertson-conga opened this issue 2 years ago (status: Open)
@aishyandapalli this is an FYI: I think your new feature from https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/12421 has a bug in it. I believe it's stemming from https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/b30f3e9e5242b8f60839a05057bea9317e1caf25/exporter/loadbalancingexporter/trace_exporter.go#L121 consuming all traces instead of just the ones associated with the routing key.
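For context, here is a minimal sketch of the splitting behavior I'd expect, using plain Go types rather than the collector's actual pdata API (`Span`, `splitByService`, and the field names are all illustrative): each batch is partitioned by service name so that every backend receives only its own services' spans, instead of the whole batch going to every endpoint.

```go
package main

import "fmt"

// Span is a simplified stand-in for a span in a trace batch
// (the real exporter works with ptrace.Traces from pdata).
type Span struct {
	Service string
	Name    string
}

// splitByService partitions a batch by service name so each
// service's spans can be routed to a single backend, rather
// than the whole batch being sent to every endpoint.
func splitByService(batch []Span) map[string][]Span {
	out := make(map[string][]Span)
	for _, s := range batch {
		out[s.Service] = append(out[s.Service], s)
	}
	return out
}

func main() {
	batch := []Span{
		{Service: "frontend", Name: "GET /"},
		{Service: "backend", Name: "SELECT"},
		{Service: "frontend", Name: "GET /login"},
	}
	for svc, spans := range splitByService(batch) {
		fmt.Printf("%s -> %d span(s)\n", svc, len(spans))
	}
}
```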
Actually, I'm not sure that's the problem. I set up batching so that it had a max size of one, and all my span metrics collectors are still getting signals across all services:
```yaml
batch/one: # super inefficient data-wise, but it looks like the loadbalancing exporter doesn't split properly
  send_batch_size: 1
  send_batch_max_size: 1
```
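For reference, a sketch of where a batch processor like this would sit in the traces pipeline (the receiver and exporter names are illustrative, not from my actual config):

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch/one]
      exporters: [loadbalancing]
```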
I have a resource processor on the span metrics collector that annotates the traces coming in, hence the aggregator dimension. The collector doing span metrics is fed its data by the loadbalancing exporter.
[Edge collectors] -> [Main central collector] -> [Spanmetrics collector(s)]
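For anyone reproducing this, the load-balancing leg of the topology above would look roughly like the following on the main central collector. This is a sketch based on the loadbalancingexporter's documented options; the hostnames and ports are illustrative, and `routing_key: service` is the option added in PR 12421:

```yaml
exporters:
  loadbalancing:
    routing_key: service
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      static:
        hostnames:
          - spanmetrics-collector-1:4317
          - spanmetrics-collector-2:4317
```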
Doing some more testing leads me to believe it may be due to forcibly closed gRPC connections making the LB move to the next available instance. I will close this if that turns out to be the case.
Okay, so this was due to my configuration, which was interrupting the gRPC connection regularly. Sorry.
Okay, after removing my batching of size one, the issue reappeared. I had two problems: one is resolved by not allowing connections to terminate artificially. The other is that if the traces are in a batch with multiple service names, they get sent to all target collectors by the loadbalancing exporter.
This leads me to believe the original issue, where all the spans are sent to every endpoint regardless of actual service, is correct.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
**Describe the bug**
The new loadbalancingexporter option for grouping traces by service name sends all traces in a batch to every endpoint, instead of splitting the set of traces so that each endpoint receives only the traces that belong to it.
**Steps to reproduce**
Use the new `routing_key: service` option to start splitting up the traces by service. Have at least 2 receiving collectors. In the receiving collectors, use a resource processor to augment the trace payload so you can see which collector is receiving a trace.

**What did you expect to see?**
All traces from a specific service name should arrive at the same receiving collector.
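A minimal way to tag which collector received a trace is a resource processor on each receiving collector, along these lines (the attribute key and value here are hypothetical, not from my actual config):

```yaml
processors:
  resource:
    attributes:
      - key: collector.name
        value: spanmetrics-collector-1
        action: insert
```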
**What did you see instead?**
Traces from a specific service name went to both collectors.
**What version did you use?**
0.59.0
**What config did you use?**
**Environment**
Doesn't matter.