Closed jameshartig closed 6 days ago
We've gone with a limit of 20,000 for now. I'll report back if we have issues.
Thanks for raising the issue. We'll try to figure out good defaults and update the documentation.
PatchTraces
is the v1 cloud trace API while BatchWriteSpans
is v2 which our exporter uses. There doesn't actually seem to be any limit on BatchWriteSpans. I'll work on a summary of @dashpole's performance tuning findings from #456 for the readme
The trace Readme.md doesn't call
WithMaxExportBatchSize
which means the export batch size is the default, 512. This means that every 512 spans the batch span processor will synchronously [1] export the spans during which new spans will be queued up to theMaxQueueSize
which defaults to 2048. If you are regularly doing 1000s of spans per second you will end up callingbatchWriteSpans
several times a second needlessly using up the 4,800/minute quota [2] and if any of those RPC calls takes more than a couple hundred milliseconds you will end up dropping spans.The quotas page [2] says that the "Maximum number of spans per
PatchTraces
call" is 25,000 (assuming thatPatchTraces
is the same asbatchWriteSpans
). Does that mean we can set the batch export size to 25,000? I know that many of Google's RPC methods have internal request size limits and I'm concerned that setting it at 25,000 might trip the limit. Is there a recommended value?There was a discussion [3] a few years ago where it seemed like it was settled on 6k for that particular case mostly because higher batch sizes were slower. Is that still the case now? I also saw that the batching implementation was replaced since then which means there isn't a need for increasing the grpc pool size since there isn't more than 1 outstanding RPC at a time when using the batch span processor, correct?
We have replaced the batch span processor internally with our own implementation to help improve the situation but I still believe the recommendations here are relevant for either one.
[1] https://github.com/open-telemetry/opentelemetry-go/blob/main/sdk/trace/batch_span_processor.go#L317 [2] https://cloud.google.com/trace/docs/quotas [3] #456