awslabs / aws-fluent-plugin-kinesis

Amazon Kinesis output plugin for Fluentd
Apache License 2.0

[Q] How to increase kinesis_firehose PutRecordBatch size? #193

Open azman0101 opened 4 years ago

azman0101 commented 4 years ago

Our average PutRecordBatch record size is about 1 KB.

D, [2020-01-14T10:16:30.981466 #1] DEBUG -- : [Aws::Firehose::Client 200 0.033437 0 retries] put_record_batch(delivery_stream_name:"kxxxGe" ... (1429 bytes)>}])

According to the AWS FAQ: https://aws.amazon.com/kinesis/data-firehose/faqs/#Pricing_and_billing

Q: When I use PutRecordBatch operation to send data to Amazon Kinesis Data Firehose, how is the 5KB roundup calculated?

The 5KB roundup is calculated at the record level rather than the API operation level. For example, if your PutRecordBatch call contains two 1KB records, the data volume from that call is metered as 10KB. (5KB per record)

As a result, our costs are not optimized: we are billed for far more data than we actually send.
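To put rough numbers on the roundup described in the FAQ (a quick illustration only; the record count and sizes below are made up):

```ruby
# Illustration of the 5KB-per-record roundup for ~1 KB records (numbers are examples).
ROUNDUP_BYTES = 5 * 1024                                  # 5KB billing increment per record
record_sizes  = Array.new(500, 1024)                      # e.g. 500 records of ~1 KB each
billed = record_sizes.sum { |s| (s.to_f / ROUNDUP_BYTES).ceil * ROUNDUP_BYTES }
actual = record_sizes.sum
puts "actual: #{actual} bytes, billed: #{billed} bytes"   # billed is roughly 5x actual
```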

I don't see which kinesis_firehose plugin option could help increase the PutRecordBatch record size.

Regards,

JB

simukappu commented 4 years ago

We don't have an option for this now. It is not a matter of PutRecordBatch size but of record aggregation, because the 5KB roundup is calculated at the record level rather than the API operation level. You can use the KPL format with Kinesis Data Firehose, but you need to send records to Kinesis Data Streams in front of Firehose: https://docs.aws.amazon.com/streams/latest/dev/kpl-with-firehose.html
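For reference, routing through a Data Stream with this plugin's kinesis_streams_aggregated output might look roughly like the sketch below (the tag, region, and stream name are placeholders, and the stream would have to be configured as the source of your Firehose delivery stream):

```
<match your_tag.**>
  @type kinesis_streams_aggregated
  region us-east-1
  stream_name your-stream-in-front-of-firehose
</match>
```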

simukappu commented 4 years ago

Closing this issue for now. Please reopen if required.

simukappu commented 4 years ago

Reopened as a feature request for something like the aggregatedRecordSizeBytes configuration in the Kinesis Agent. https://docs.aws.amazon.com/firehose/latest/dev/writing-with-agents.html#agent-config-settings
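The idea, roughly, would be to pack newline-delimited records together into larger Firehose records before each PutRecordBatch call, so each billed record comes closer to the 5KB increment. A minimal sketch of that packing logic (not an existing option in this plugin; the size limit and helper name are illustrative):

```ruby
# Sketch only: pack newline-delimited records into chunks of up to ~5 KB each.
MAX_RECORD_BYTES = 5 * 1024

def pack_records(records, max_bytes = MAX_RECORD_BYTES)
  chunks  = []
  current = +""                                   # mutable string buffer
  records.each do |record|
    line = record.end_with?("\n") ? record : record + "\n"
    if !current.empty? && current.bytesize + line.bytesize > max_bytes
      chunks << current                           # flush the current chunk
      current = +""
    end
    current << line
  end
  chunks << current unless current.empty?
  chunks                                          # each chunk becomes one Firehose record
end
```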

mmerickel commented 4 years ago

I'd really like to see aggregated record support for Firehose. The 5KB rounding for log messages causes real waste in the common case, and aggregated records could drop costs significantly.