Describe the bug
I have deployed opentelemetry-collector with a pipeline that uses the batch processor. This is the official description of the relevant parameters in the batch processor documentation:
send_batch_size (default = 8192): Number of spans, metric data points, or log records after which a batch will be sent regardless of the timeout. send_batch_size acts as a trigger and does not affect the size of the batch. If you need to enforce batch size limits sent to the next component in the pipeline see send_batch_max_size.
timeout (default = 200ms): Time duration after which a batch will be sent regardless of size. If set to zero, send_batch_size is ignored as data will be sent immediately, subject to only send_batch_max_size.
send_batch_max_size (default = 0): The upper limit of the batch size. 0 means no upper limit of the batch size. This property ensures that larger batches are split into smaller units. It must be greater than or equal to send_batch_size.
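For reference, these parameters sit under the batch processor's section of the collector config. A minimal sketch using the documented defaults above (not my deployment's values):

```yaml
processors:
  batch:
    # Trigger: flush once this many spans / metric data points / log records are buffered.
    send_batch_size: 8192
    # Flush whatever is buffered after this duration, regardless of size.
    timeout: 200ms
    # Hard upper limit on batch size; 0 (the default) means no limit and no splitting.
    send_batch_max_size: 0
```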
However, looking at the output files from the pipeline that uses this processor, I'm seeing a few anomalous things:
Overall, file sizes are still far below send_batch_size, with most files containing ~80 records. I don't believe this is due to the timeout, because I have increased the timeout from 10s to 15s with little change in the most common file size.
I also receive some very large files with significantly more records than send_batch_max_size, e.g. 7935 records/file.
Steps to reproduce
Create a high-volume pipeline that passes data through a batch processor with the specifications above.
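An illustrative repro config only: the OTLP receiver, the contrib file exporter, the output path, and the exact send_batch_size are my assumptions; timeout and send_batch_max_size reflect the values discussed in this report.

```yaml
receivers:
  otlp:
    protocols:
      grpc:                      # any high-volume source works; OTLP/gRPC is an assumption

processors:
  batch:
    send_batch_size: 4096        # assumed; the docs require send_batch_max_size >= send_batch_size
    timeout: 15s                 # the increased timeout mentioned above
    send_batch_max_size: 4096    # the expected hard cap on records per batch

exporters:
  file:
    path: /tmp/otel-batch-repro.json   # hypothetical output path

service:
  pipelines:
    logs:                        # logs pipeline assumed; the same applies to traces/metrics
      receivers: [otlp]
      processors: [batch]
      exporters: [file]
```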
What did you expect to see?
All files should have <= 4096 records, since that is the configured maximum batch size (send_batch_max_size).
Most files should contain far more than 80 records, given the high values of timeout and send_batch_size.
What did you see instead?
A small number of extremely large files with > 4096 records.
Most files contain only ~80 records, far below send_batch_size.
What version did you use?
v0.104.0
What config did you use?