elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

Add s3 output to filebeat #18158

Open thenewguy opened 4 years ago

thenewguy commented 4 years ago

It would be very helpful to allow filebeat to output to s3 directly.

Currently, if one wants to store logs on s3, logstash is required.

When using a service like AWS Elastic Beanstalk, it is very handy to push logs to S3 for persistence. I can imagine a hundred other use cases, but this is the one that would have simplified my life right now.

The only alternative is rather complicated: you have to configure the AWS ECS agent to support GELF logging, use GELF logging to ship to logstash, and then push the logs to S3 via logstash's s3 output.

Major downside: there is a race condition with this approach where you lose the initial logs from containers that start before the logstash container.

Another big downside: you can no longer use `docker logs` for quick inspection, because AWS doesn't offer dual logger output. The json-file driver supported by filebeat would work out of the box here. It would certainly be easier to use filebeat when just getting started.

Another issue is that you must run logstash on each application instance, in addition to the ones you need for ingestion into Elasticsearch.
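For reference, the workaround described above can be sketched with logstash's gelf input and s3 output plugins. The port, bucket, and region values are placeholders:

```conf
# Sketch of the logstash-based workaround: receive GELF from containers
# and persist a copy of the logs to S3.
input {
  gelf {
    port => 12201                 # placeholder GELF listener port
  }
}
output {
  s3 {
    bucket => "my-log-archive"    # placeholder bucket name
    region => "us-east-1"         # placeholder region
    codec  => "json_lines"
  }
}
```

This is exactly the extra moving part (a logstash instance per application host) that an s3 output in filebeat would remove.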

elasticmachine commented 4 years ago

Pinging @elastic/integrations (Team:Integrations)

holisticode commented 4 years ago

Any statement from the maintainers? Will this be considered?

ktham commented 4 years ago

I'm hoping an S3 output for Filebeat gets considered soon. We are looking to replace fluentd with fluent-bit as soon as fluent-bit adds S3 output (https://github.com/fluent/fluent-bit/pull/2583), but I would prefer to stay within the Elastic ecosystem if possible.

(We want to be running something lightweight which is why we're looking to move off fluentd. And logstash is not lightweight)

lambda-9 commented 3 years ago

We would also like to consider the various beats, and elastic agent/fleet management, as a replacement for fluentd and fluent-bit, but we are unable to because of the lack of flexibility in outputs. We can use logstash as an output, but that fails in any scenario where elastic agent or the fleet manager is being considered.

We do not consider Elasticsearch a primary data store for logs and events. Rather, it is a secondary data store for analysis and search. We must persist the primary data store for 7 years in most cases and we don't feel we can keep indices around that long on a reliable, performant, or cost-effective basis. It would be great to be able to output events to S3 as the primary data store and have logstash or something else read events from there, or to have beats output to both S3 and Elasticsearch simultaneously.
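A purely hypothetical sketch of what the requested feature might look like in filebeat.yml — `output.s3` does not exist in filebeat, and filebeat currently supports only a single output at a time, so all of the keys under it are imagined:

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log

# Imagined syntax for the feature requested in this issue.
# output.s3 is NOT a real filebeat output today.
output.s3:
  bucket: "my-log-archive"   # placeholder bucket name
  region: "us-east-1"        # placeholder region
```

Supporting something like this alongside (or instead of) `output.elasticsearch` is the crux of the request.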

botelastic[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

thenewguy commented 2 years ago

bleh - auto robots with tags on real issues is obnoxious =/

bryanjacobsos commented 2 years ago

I'm building a pipeline for retrying failed messages when kafka or other components in our ingestion pipelines fail. Having filebeat able to move data from logs to S3 would be extremely helpful.

At this point my only option is fluentd.

ktham commented 2 years ago

At this point, we've given up waiting on filebeat for S3 output and we've adopted fluent-bit (https://github.com/fluent/fluent-bit) for shipping logs to S3. (It's written in C, vs. Ruby for fluentd.)

They added support for S3 output over a year ago in https://github.com/fluent/fluent-bit/pull/2583, and it is working quite well for us.

The project is very active and that would be my recommendation if you need S3 output.
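For anyone landing here, a minimal fluent-bit configuration using the S3 output from the PR above might look like this. Paths, bucket, region, and the size/timeout tuning values are placeholders:

```ini
# Sketch of fluent-bit tailing app logs and shipping them to S3.
[INPUT]
    Name            tail
    Path            /var/log/app/*.log

[OUTPUT]
    Name            s3
    Match           *
    bucket          my-log-archive
    region          us-east-1
    total_file_size 50M
    upload_timeout  10m
```

The `total_file_size` and `upload_timeout` settings control how much is buffered locally before an object is uploaded.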

botelastic[bot] commented 1 year ago

Hi! We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:. Thank you for your contribution!

ktham commented 1 year ago

No problem, we've already migrated to Vector (https://vector.dev/docs/reference/configuration/sinks/aws_s3/), so no need for this anymore 🙁
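The Vector setup referenced above can be sketched with a file source feeding an `aws_s3` sink (see the linked sink docs). Source names, paths, bucket, and region here are placeholders:

```toml
# Sketch: Vector tails app logs and archives them to S3.
[sources.app_logs]
type = "file"
include = ["/var/log/app/*.log"]

[sinks.s3_archive]
type = "aws_s3"
inputs = ["app_logs"]
bucket = "my-log-archive"   # placeholder bucket name
region = "us-east-1"        # placeholder region
encoding.codec = "json"
```

A second sink consuming the same `app_logs` source could ship to Elasticsearch in parallel, which is the dual-destination setup commenters above were asking filebeat for.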

Z4ck404 commented 1 year ago

Having this would be extremely useful. Are there any plans to add an S3/GCS/Azure output to filebeat?

botelastic[bot] commented 6 months ago

Hi! We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:. Thank you for your contribution!

jakauppila commented 6 months ago

This would still be useful