logstash-plugins / logstash-input-s3

Apache License 2.0

gzip: add config for reading whole gzip chunks ("speed"-mode) #126

Closed yaauie closed 6 years ago

yaauie commented 6 years ago

Ruby's Zlib::GzipReader#each_line is notoriously slow: it can maintain a steady memory profile by "streaming" one line at a time, but in practice the overhead is painful.

As a stop-gap measure, this PR introduces a new config parameter, gzip_prefer, which defaults to memory but can be set to speed.
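
To make the trade-off concrete, here is a minimal sketch of the two strategies in plain Ruby. It is not the plugin's actual code: the helper name `each_gzip_line` and its `prefer:` keyword are hypothetical stand-ins for the memory and speed settings described above.

```ruby
require "zlib"
require "stringio"

# Hypothetical helper (not the plugin's implementation): yields each line
# of a gzip stream using one of two strategies.
def each_gzip_line(io, prefer: :memory)
  gz = Zlib::GzipReader.new(io)
  case prefer
  when :memory
    # Stream one line at a time: steady memory use, but per-line overhead.
    gz.each_line { |line| yield line }
  when :speed
    # Inflate the whole stream in one call, then split lines in memory.
    # Memory use grows with the size of the decompressed data.
    gz.read.each_line { |line| yield line }
  else
    raise ArgumentError, "unknown preference: #{prefer.inspect}"
  end
ensure
  # Release the reader without closing the underlying IO.
  gz&.finish
end

# Usage: both strategies should yield the same lines.
data = Zlib.gzip("one\ntwo\nthree\n")
streamed = []
each_gzip_line(StringIO.new(data), prefer: :memory) { |l| streamed << l }
slurped = []
each_gzip_line(StringIO.new(data), prefer: :speed) { |l| slurped << l }
```

The speed branch trades the steady memory profile for fewer calls into the reader, which is why memory stays the safer default for large objects.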

This should alleviate the pain of the following tickets:

Ideally, we would use a combination of java.util.zip.GZIPInputStream and java.io.BufferedReader to do the bulk of the work on the Java side of JRuby, but that change would carry more risk and require more extensive testing.