vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

Ingest existing object in AWS_S3 source #15609

Closed ambroserb3 closed 1 year ago

ambroserb3 commented 1 year ago

Use Cases

Vector is a great alternative to Logstash in the ELK stack, but there doesn't seem to be any way to send existing data from an S3 bucket to Elasticsearch with Vector, which makes migration harder.

This wouldn't just benefit an EVK stack; there are multiple usage patterns where ingesting existing S3 objects would be useful.

Attempted Solutions

I haven't tried this yet, but I was thinking I could tag the existing objects in S3 and change my SQS rules to include tag-creation events.

Proposal

No response

References

No response

Version

vector 0.26.0

jszwedko commented 1 year ago

Hi @stephanrb3 !

Thanks for this feature request! We actually have a parent issue, https://github.com/vectordotdev/vector/issues/11095, where we are collecting ETL-like use cases like this one, so I'll close this one and link it there. Feel free to add any additional details there.

As a workaround, you could publish SQS messages in the same format as the bucket notifications to trigger Vector's reading of the S3 files.
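
For anyone landing here later, a rough sketch of that workaround in Python follows. It lists the objects already in a bucket and publishes one synthetic message per object, shaped like an S3 `ObjectCreated:Put` notification, to the queue that the `aws_s3` source polls. The helper names (`build_s3_notification`, `backfill`) are hypothetical, and the exact set of fields Vector requires is an assumption based on the documented S3 event message structure, so check it against your Vector version before relying on it.

```python
import json
from datetime import datetime, timezone
from urllib.parse import quote_plus


def build_s3_notification(bucket, key, size, region="us-east-1"):
    """Return a JSON body shaped like an S3 ObjectCreated:Put notification.

    Assumption: Vector only needs the fields below; a real notification
    carries more (requestParameters, eTag, sequencer, ...).
    """
    record = {
        "eventVersion": "2.1",
        "eventSource": "aws:s3",
        "awsRegion": region,
        "eventTime": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z"),
        "eventName": "ObjectCreated:Put",
        "s3": {
            "s3SchemaVersion": "1.0",
            "bucket": {"name": bucket, "arn": f"arn:aws:s3:::{bucket}"},
            # S3 URL-encodes object keys in notifications (space becomes "+").
            "object": {"key": quote_plus(key, safe="/"), "size": size},
        },
    }
    return json.dumps({"Records": [record]})


def backfill(bucket, queue_url, prefix="", region="us-east-1"):
    """Send one synthetic notification per existing object; return the count."""
    # Imported here so the message builder above can be used without boto3.
    import boto3

    s3 = boto3.client("s3", region_name=region)
    sqs = boto3.client("sqs", region_name=region)
    sent = 0
    # list_objects_v2 returns at most 1000 keys per call, so paginate.
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            body = build_s3_notification(bucket, obj["Key"], obj["Size"], region)
            sqs.send_message(QueueUrl=queue_url, MessageBody=body)
            sent += 1
    return sent
```

Calling something like `backfill("my-bucket", "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue", prefix="logs/")` (placeholder bucket and queue URL) should enqueue the existing objects, after which Vector would pick them up the same way it does for new uploads. For large buckets it may be worth batching with `send_message_batch` instead of sending messages one at a time.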