cherweg / logstash-input-s3-sns-sqs

Logstash input that downloads files from an S3 bucket by ObjectKey received via SNS/SQS

2.0.8 - Gzip validator should return false for small files #31

Closed (nickdw closed this issue 5 years ago)

nickdw commented 5 years ago

Hi,

Unless there is a situation I'm unaware of, I believe the gzip validator's "valid?" method should return false for files that are too small to validate (< 2 bytes), instead of throwing skip delete. I think it is reasonable to assume that any file this small is not a gzip file anyway, given that (based on my tests) a 0 byte file grows to around 22 bytes when it is gzipped.

This change should allow small files to be “processed” instead of recirculating in the queue.
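A minimal sketch of the proposed behaviour, not the plugin's actual validator: the method name `gzip?` and the `StringIO` inputs are made up for illustration. It checks the two-byte gzip magic number (`0x1f 0x8b`, per RFC 1952) and returns false when fewer than two bytes are available, rather than bailing out. A real gzip stream always carries a 10-byte header and an 8-byte trailer, which is why even an empty input ends up well above 2 bytes once compressed.

```ruby
require "zlib"
require "stringio"

# Magic number that starts every gzip stream (RFC 1952).
GZIP_MAGIC = "\x1F\x8B".b

# Hypothetical stand-in for the validator's "valid?" check.
# Returns false (instead of throwing) when the input is too short to inspect.
def gzip?(io)
  header = io.read(2) # nil at EOF, or a shorter string for tiny inputs
  io.rewind           # leave the stream ready for the real reader
  return false if header.nil? || header.bytesize < 2

  header.b == GZIP_MAGIC
end

empty_file   = StringIO.new("")             # e.g. a 0 byte marker file
gzipped_file = StringIO.new(Zlib.gzip(""))  # gzip of empty input

puts "empty file is gzip?   #{gzip?(empty_file)}"
puts "gzipped file is gzip? #{gzip?(gzipped_file)}"
puts "gzip of empty input:  #{Zlib.gzip('').bytesize} bytes"
```

With a check like this, a 0 byte file is simply classified as "not gzip" and can continue down the plain-file path.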

Example of the issue: I currently have AWS Config sending logs to an S3 bucket, with this plugin reading from it. In this bucket, AWS Config frequently creates a 0 byte "ConfigWritabilityCheckFile" file.

In version 2.0.8 this results in throwing skip delete (as the file is < 2 bytes). Once the visibility timeout expires, the plugin tries to process the same event again, and again, in an infinite loop, and the queue grows reasonably quickly as new events are created.
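The loop can be illustrated with a toy in-memory queue. This is a simulation under stated assumptions, not the plugin's code: `poll_once`, `strict_handler`, `lenient_handler` and the message hashes are all hypothetical, and `:skip_delete` is used here only as a plain Ruby `catch`/`throw` tag that mirrors the "skip delete" behaviour described above. A message is removed only when its handler finishes without throwing, so a handler that always throws for tiny files never lets the message leave the queue.

```ruby
# Toy queue poller: the message at the head of the queue is deleted only
# if the handler block completes without throwing :skip_delete.
def poll_once(queue)
  message = queue.first
  return if message.nil?

  deleted = catch(:skip_delete) do
    yield message
    true # reached only when the handler did not throw
  end
  queue.shift if deleted
end

# Behaviour described in the issue: tiny files abort processing.
def strict_handler(message)
  throw :skip_delete if message[:size] < 2
  :processed
end

# Proposed behaviour: tiny files are treated as "not gzip" and processed.
def lenient_handler(message)
  message[:size] < 2 ? :processed_as_plain_file : :processed
end

stuck = [{ key: "ConfigWritabilityCheckFile", size: 0 }]
3.times { poll_once(stuck) { |m| strict_handler(m) } }
puts "after 3 polls with strict handler: #{stuck.size} message(s) left"

drained = [{ key: "ConfigWritabilityCheckFile", size: 0 }]
poll_once(drained) { |m| lenient_handler(m) }
puts "after 1 poll with lenient handler: #{drained.size} message(s) left"
```

In a real SQS queue the retry happens after the visibility timeout rather than immediately, but the effect is the same: the message is redelivered until something deletes it.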

I attempted to make this change myself but ran into some issues. Here is the gist of what I was thinking (I'm not a Ruby guy): https://gist.github.com/nickdw/ec2d97a626f7a5293d819858e6fb401c

Open to other suggestions

Thanks

nickdw commented 5 years ago

Managed to make the changes myself after realising the gem (v2.0.8) is ahead of this repo (v2.0.6) and adding the appropriate changes. So far it seems to be working as expected, however I'll test it for a bit longer.

christianherweg0807 commented 5 years ago

You are right... small files should return false...

nickdw commented 5 years ago

Legend! Thanks for working on this plugin