lukewaite / logstash-input-cloudwatch-logs

Input plugin for Logstash to stream events from CloudWatch Logs

Should use SinceDB time to set start_time parameter for get_log_events call #10

Closed brandond closed 7 years ago

brandond commented 8 years ago

Rather than re-listing the entire stream content and checking if each event is newer than last_read, you should pass the last_read timestamp into the get_log_events call and let AWS do the filtering for you server-side. Doing the filtering client-side is incredibly wasteful, since you have to re-read and discard ALL the old events every polling interval.
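To illustrate the difference, here is a minimal sketch rather than the plugin's actual code. `FakeLogsClient`, the group/stream names, and the two `poll_*` helpers are hypothetical stand-ins so that it runs without AWS credentials. The stub only mimics the `start_time` keyword of `get_log_events`, which is inclusive in the real API (events with a timestamp equal to or later than `start_time` are returned).

```ruby
# Hypothetical stand-in for Aws::CloudWatchLogs::Client so the sketch runs
# offline. It counts how many events each call hands back to the caller.
Event = Struct.new(:timestamp, :message)
Response = Struct.new(:events)

class FakeLogsClient
  attr_reader :events_transferred

  def initialize(events)
    @events = events
    @events_transferred = 0
  end

  # Mimics the start_time semantics of get_log_events: events with a
  # timestamp equal to or later than start_time are returned.
  def get_log_events(log_group_name:, log_stream_name:, start_time: nil)
    matched = @events.select { |e| start_time.nil? || e.timestamp >= start_time }
    @events_transferred += matched.size
    Response.new(matched)
  end
end

# Client-side filtering: fetch the whole stream, then discard old events.
def poll_client_side(client, last_read)
  resp = client.get_log_events(log_group_name: 'app', log_stream_name: 'web-1')
  resp.events.select { |e| e.timestamp > last_read }
end

# Server-side filtering: request only events after last_read.
# start_time is inclusive, so add 1 ms to skip the event already seen.
def poll_server_side(client, last_read)
  resp = client.get_log_events(log_group_name: 'app', log_stream_name: 'web-1',
                               start_time: last_read + 1)
  resp.events
end

stream = (1..1000).map { |i| Event.new(i, "event #{i}") }
last_read = 990

client_a = FakeLogsClient.new(stream)
client_b = FakeLogsClient.new(stream)
new_a = poll_client_side(client_a, last_read)
new_b = poll_server_side(client_b, last_read)

puts "client-side: #{new_a.size} new, #{client_a.events_transferred} transferred"
puts "server-side: #{new_b.size} new, #{client_b.events_transferred} transferred"
```

Both approaches yield the same new events, but the client-side version pulls the entire stream on every poll. One caveat with `last_read + 1`: if several events share the last-read millisecond and some arrive late, they would be skipped, so a real implementation may need to handle that case.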

lukewaite commented 7 years ago

I took a look at this last night, and you're right, we definitely should be passing that in. I'm not sure if it was overlooked when first built, or if it's a parameter that wasn't available on that SDK method at the time. Either way, we should definitely optimize.

In perusing the rest of the calls being made, I think there are a few other spots to be cleaned up. I'm going to start working on a bit of a rewrite to optimize the calls being made, but in the meantime I'd welcome any PRs fixing the issues in isolation as well.

lukewaite commented 7 years ago

I've just released v1.0.0.pre, a pre-release which heavily refactors the ingestion process and resolves this issue.

https://github.com/lukewaite/logstash-input-cloudwatch-logs/releases/tag/v1.0.0.pre