elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

Filebeat Run on Static Files #37014

Open bwhartlove opened 1 year ago

bwhartlove commented 1 year ago

Describe the enhancement: It would be nice if Filebeat accepted a flag that said 'harvest the files in your input, then shut down when they are completely read'.

Describe a specific use case for the enhancement or feature: Currently, I have a pipeline including Zeek. The Zeek container generates its metadata logs, and those logs are mounted into a Filebeat container that uses the Beats module for Zeek to parse the logs and output the data to Kafka. Once these files are mounted inside the Filebeat container, they do not change, so Filebeat doesn't need to 'monitor' them; it only needs to read them through once and shut down. However, the only way I've found to have Filebeat shut down after reading the files is to supply a timeout variable that is a safe estimate of how long Filebeat will take to read and output the metadata.

I'd like to propose adding a flag such as --no-daemon that would tell Filebeat that its inputs are static and that it should shut down once it has no more events to ship to its output. Is this possible, and if not, is there another way to achieve this with Filebeat that I missed? Thanks!

botelastic[bot] commented 1 year ago

This issue doesn't have a Team:<team> label.

mdcrank commented 1 year ago

Commenting for visibility and tracking.

vinit-chauhan commented 1 year ago

Hey @bwhartlove - Have you tried the --once flag with Filebeat? If not, here's the Ref link.

Also, you have to set close_eof in the config of the log input. Ref link

Try it out; this might work for your use case.
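
For reference, here is a minimal sketch of how that suggestion might look for the Zeek module. The `connection` fileset and the `/zeek/logs/conn.log` path are assumptions for illustration only; substitute the filesets and paths that your pipeline actually uses. It relies on the per-fileset `input` override to pass `close_eof` through to the underlying log input.

```yaml
# modules.d/zeek.yml (sketch; fileset choice and path are hypothetical)
- module: zeek
  connection:
    enabled: true
    # Assumed mount point of the static Zeek logs inside the container.
    var.paths: ["/zeek/logs/conn.log"]
    # Override advanced input settings for this fileset:
    # close the harvester as soon as end of file is reached.
    input:
      close_eof: true

# Then run Filebeat so that it exits once all harvesters have reached EOF:
#   filebeat run --once -e
```

Each enabled fileset would presumably need the same `input` override, since the setting applies per fileset rather than to the module as a whole.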

bwhartlove commented 1 year ago

@vinit-chauhan Roger that, I'll give that a shot. Read through the references you sent and it sounds like exactly what I need. I'll test this out and report back with my results. Thanks!

bwhartlove commented 1 year ago

@vinit-chauhan I tried this out by using the --once flag with filebeat run and adding close_eof to the modules section of my config. I'm trying to use the Zeek module specifically, and it appears that it only reads a small fraction of the log records and then shuts down. For comparison, the last time I ran the pipeline, I got 348 records in Elasticsearch, but the file should produce 74k logs. Any ideas on why this may be?

I found this post from a while back that seems to be having the same issue but was never resolved: https://discuss.elastic.co/t/filebeat-zeek-module-not-reading-all-events-with-once-option/299459

botelastic[bot] commented 1 week ago

Hi! We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:. Thank you for your contribution!