brandond closed 7 years ago
I took a look at this last night, and you're right: we should be passing that in. I'm not sure whether it was overlooked when this was first built, or whether the param wasn't available on that SDK method at the time. Either way, we should optimize this.
In perusing the rest of the calls being made, I think there are a few other spots to be cleaned up. I'm going to start working on a bit of a rewrite to optimize the calls being made, but in the meantime I'd welcome any PRs fixing the issues in isolation as well.
I've just released v1.0.0.pre, a pre-release which heavily refactors the ingestion process and resolves this issue.
https://github.com/lukewaite/logstash-input-cloudwatch-logs/releases/tag/v1.0.0.pre
Rather than re-listing the entire stream content and checking if each event is newer than last_read, you should pass the last_read timestamp into the get_log_events call and let AWS do the filtering for you server-side. Doing the filtering client-side is incredibly wasteful, since you have to re-read and discard ALL the old events every polling interval.
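For illustration, here is a rough sketch of what server-side filtering could look like. It is not the plugin's actual code: `fetch_new_events` is a hypothetical helper, and `client` is assumed to behave like `Aws::CloudWatchLogs::Client` from the AWS SDK for Ruby. It relies on `start_time` being inclusive and on `get_log_events` returning the same forward token once the end of the stream is reached, which is my understanding of the API.

```ruby
# Hypothetical helper (not from the plugin): fetch only events newer than
# last_read by letting CloudWatch Logs filter on start_time server-side.
# `client` is assumed to respond to get_log_events the way
# Aws::CloudWatchLogs::Client does. last_read is a millisecond epoch timestamp.
def fetch_new_events(client, group, stream, last_read)
  events = []
  token = nil
  loop do
    params = {
      log_group_name: group,
      log_stream_name: stream,
      # start_time is inclusive, so add 1 ms to skip the event already seen.
      # Caveat: this can miss other events sharing that exact millisecond.
      start_time: last_read + 1,
      start_from_head: true
    }
    params[:next_token] = token if token
    resp = client.get_log_events(params)
    events.concat(resp.events)
    # Getting back the same forward token we sent means there is nothing more.
    break if resp.next_forward_token == token
    token = resp.next_forward_token
  end
  events
end

# Usage with the real SDK (assumes the aws-sdk-cloudwatchlogs gem and
# configured credentials; group and stream names are placeholders):
#   require 'aws-sdk-cloudwatchlogs'
#   client = Aws::CloudWatchLogs::Client.new
#   fetch_new_events(client, 'my-group', 'my-stream', last_read)
```

With this approach each poll only transfers events at or after `last_read + 1`, instead of re-reading and discarding the whole stream.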