vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

New `aws_cloudwatch_logs` source #2695

Open binarylogic opened 4 years ago

binarylogic commented 4 years ago

The goal of this source is to provide a simple way to start consuming logs from AWS in low-volume environments by pulling them from the AWS CloudWatch Logs APIs (FilterLogEvents and GetLogEvents).

This source will have a relatively low volume limit (< 5 MBps / 50k events in the case of FilterLogEvents), which we should document. We should also take extra care to surface when we are hitting AWS rate limits.

We've also observed that FilterLogEvents does not perform well for groups with a large number of streams. If we go with this API, we should document this restriction (and provide knobs for specifying specific streams or a stream prefix).
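To make the pull-based approach more concrete, here is a minimal Python sketch of a FilterLogEvents polling loop. It is an illustration under stated assumptions, not a proposed implementation for Vector. The request and response field names (`logGroupName`, `logStreamNamePrefix`, `startTime`, `nextToken`, `events`) follow the FilterLogEvents API. Everything else is hypothetical: the `pull_events` function, the `Throttled` exception (a stand-in for whatever throttling error the SDK in use raises), the `on_throttle` callback, and the backoff values.

```python
import time
from typing import Any, Callable, Dict, Iterator, Optional


class Throttled(Exception):
    """Hypothetical stand-in for the SDK's rate-limit error."""


def pull_events(
    client: Any,
    log_group: str,
    start_ms: int,
    stream_prefix: Optional[str] = None,
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    on_throttle: Optional[Callable[[int, float], None]] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield events from one log group, following nextToken pagination.

    `client` is any object exposing filter_log_events(**kwargs) that
    returns a dict shaped like the FilterLogEvents response.
    """
    token: Optional[str] = None
    while True:
        kwargs: Dict[str, Any] = {"logGroupName": log_group, "startTime": start_ms}
        if stream_prefix is not None:
            # Narrowing to a prefix is the "knob" for groups with many streams.
            kwargs["logStreamNamePrefix"] = stream_prefix
        if token is not None:
            kwargs["nextToken"] = token

        for attempt in range(max_retries + 1):
            try:
                page = client.filter_log_events(**kwargs)
                break
            except Throttled:
                if attempt == max_retries:
                    raise
                delay = float(min(2 ** attempt, 30))  # assumed backoff policy
                if on_throttle is not None:
                    # Surface the rate limit instead of retrying silently.
                    on_throttle(attempt + 1, delay)
                sleep(delay)

        yield from page.get("events", [])
        token = page.get("nextToken")
        if token is None:
            return
```

With a real SDK client, the `except` clause would need to match that SDK's own throttling error, and `on_throttle` is where a log line or metric could be emitted so that operators can see when AWS rate limits are being hit.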

I'd probably recommend an RFC just for this source to suss out any decision points and to provide the answers to open questions, including:

Examples of other implementations:

External refs:

jszwedko commented 4 years ago

@binarylogic Rather than creating a new issue, I updated this issue description to be for the naive `aws_cloudwatch_logs` source. I'm not sure how we want to prioritize it vs. the AWS Firehose -> Vector source (https://github.com/timberio/vector/issues/3566).

trevorstr commented 9 months ago

We've got logs being emitted to CloudWatch Logs from other AWS services (e.g., Lambda). We would like to move these logs to Grafana Loki. It looks like Amazon CloudWatch Logs isn't currently supported as a data source.

jszwedko commented 9 months ago

The currently recommended approach for CloudWatch Logs is to use Firehose to forward them.