Buffer messages before calling AWS.Firehose.putRecord

pkallos / winston-firehose

NodeJS module, winston logging transport which writes to AWS Firehose.

MIT License

13 stars 9 forks source link

Buffer messages before calling AWS.Firehose.putRecord #13

Open stefanodefaria opened 6 years ago

stefanodefaria commented 6 years ago

The problem: From AWS documentation:

Pricing is based on volume of data ingested into Amazon Kinesis Data Firehose, which is calculated as the number of data records you send to the service, times the size of each record rounded up to the nearest 5KB

Log messages are frequently smaller than 5kb. Therefore, we end up paying for more than we are consuming.

Proposition: Buffer up log messages in a way that AWS.Firehose.putRecord is only called with payload size is compatible with the 5kb pricing rule. Also, add a timeout that, when triggered, flushes the buffer to Firehose regardless of the payload size to be sent. This buffering mechanism (and its timeout vlaue) should be configurable by in the 'options' object when constructing the transport.

. Please let me know if you agree with the proposition, and if you need someone to code it (I could help with that).

rash805115 commented 6 years ago

👍 +1 This feature makes sense to me.

pkallos commented 5 years ago

good point makes sense

harsh-sri commented 2 years ago

very good point. Have we implemented this already?

harsh-sri commented 2 years ago

@pkallos maybe i can help in the implementation.

pkallos commented 2 years ago

very good point. Have we implemented this already?

This has not been implemented yet, contributions are appreciated!

pkallos commented 1 year ago

Updating here, I will work an optional implementation into v4.x, an upcoming release.

Some considerations:

currently the submissions to Firehose are unacknowledged, fire and forget so they don't block any threads
when buffering messages there is a consideration for flushing the buffer when the process is exited or the stream is destroyed, it seems winston provides a close() hook into transports so I will use that to flush the buffer.

Proposed additional options would be

interface BufferingOptions {
  bufferSize?: number, // buffer in number of messages
  bufferSizeKb?: number, // buffer hint in KB
  flushTimeout: number = 30_000, // buffer flush interval limit, in ms
}

witchpixels commented 2 months ago

Hey there, @pkallos!

I have a client who is using this for some of their services and I flagged this as a cost optimization, do you mind if I take this on?

Looking at the repository it looks like we need to handle close, as you noted, and I was thinking we'd just have the existing sender and a new buffered sender when your BufferingOptions struct is supplied.

Are there any other implementation considerations I should take into account?

pkallos commented 2 months ago

Hey there, @pkallos!

I have a client who is using this for some of their services and I flagged this as a cost optimization, do you mind if I take this on?

Looking at the repository it looks like we need to handle close, as you noted, and I was thinking we'd just have the existing sender and a new buffered sender when your BufferingOptions struct is supplied.

Are there any other implementation considerations I should take into account?

@witchpixels Thanks, yeah would be great if you could take this on. No other considerations other than those mentioned re: buffer flushing above.