mostfunkyduck closed this issue 3 years ago
Thank you for your feedback. Let me confirm my understanding.
When we turn on 'append_new_line',
Is this correct?
I'm seeing a '¥n' where I'd expect to see '\n' in your response; otherwise, this covers most of the issue. The other part is that there needs to be some kind of delimiter between the zlib streams, since zlib doesn't automatically concatenate. Maybe switch to gzip, which does, and be done with it?
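To illustrate the gzip vs. zlib point, here is a minimal sketch using Python's standard library rather than the plugin itself. The sample records are made up for the example. It shows that back-to-back gzip members decompress as one stream, while a zlib decompressor stops at the end of the first stream and leaves the rest for the consumer to deal with.

```python
import gzip
import zlib

# Hypothetical sample records, not taken from the plugin.
records = [b'{"a":1}\n', b'{"a":2}\n']

# gzip: members written back to back form a valid multi-member stream,
# so one decompress call returns every record.
gz_blob = b"".join(gzip.compress(r) for r in records)
gz_out = gzip.decompress(gz_blob)

# zlib: a decompressor stops at the end of the first stream. Everything
# after it lands in `unused_data`, and the consumer has to start a new
# decompressor by hand to continue.
z_blob = b"".join(zlib.compress(r) for r in records)
d = zlib.decompressobj()
first = d.decompress(z_blob)
leftover = d.unused_data

print(gz_out)
print(first, len(leftover))
```

With gzip, a consumer that understands multi-member streams needs no delimiter at all. With zlib, the consumer has to find the stream boundaries itself.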
Does anyone have the same problem?
Closing this issue for now. Please reopen if required.
I have a few problems with the firehose plugin's compression that I was hoping y'all could shed some light on. My use case is that I'm tailing a file of json data, batching events, then pushing them to s3 via firehose.
With the default settings, each compressed stream is newline delimited in the resulting s3 bucket. The problem is that when I compress strings such as
{"a":1}
, the resulting compressed data can itself contain a newline byte, meaning that there's no way to differentiate records based on a newline alone. If I turn off 'append_new_line', then you end up with a bunch of compressed streams with NO delimiter, which isn't a valid way to concatenate zlib archives, so that fails as well.

The second problem is that the compression algorithm ends up compressing each line of the log file individually. This results in much less efficient compression, especially for small records. It should compress batches of records instead.
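Both problems can be reproduced outside the plugin. The following is a rough sketch using Python's zlib module, which I'm assuming behaves like the plugin's compression for this purpose. The record `{"a":2}` is a hypothetical example chosen because its Adler-32 checksum, which zlib appends to every stream, happens to end in byte 0x0A.

```python
import zlib

# Problem 1: compressed output can contain a newline byte.
# Hypothetical record: the bytes of {"a":2} sum to 521, so the Adler-32
# low word is 522 = 0x020A, and the zlib trailer ends in 0x0A ("\n").
record = b'{"a":2}'
compressed = zlib.compress(record)
ends_with_newline = compressed.endswith(b"\n")

# Problem 2: per-record compression vs. compressing one batch.
# Each zlib stream carries a 2-byte header and a 4-byte checksum, and
# tiny inputs give the compressor almost nothing to exploit.
records = [b'{"a":%d}\n' % i for i in range(100)]
per_record_total = sum(len(zlib.compress(r)) for r in records)
batched_total = len(zlib.compress(b"".join(records)))

print(ends_with_newline)
print(per_record_total, batched_total)
```

Any reader that splits the S3 object on newlines would cut a stream like the first one apart, and the size comparison shows why batching before compression matters for small records.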
Please let me know if there's anything more you need or if you want me to break this into two tickets.