elastic / integrations

Migrate uses of the logfile input to filestream #2518

Open kvch opened 2 years ago

kvch commented 2 years ago

Goal

The goal of this issue is to migrate existing packages that rely on the log (logfile) input to filestream. Package updates must be backwards compatible, and the change inside the integration package should be hidden from users.

The only user-visible change should be the value of input.type in the event from log to filestream.

Why migrate?

The new filestream input has replaced the good old log (a.k.a. logfile) input in Beats. The filestream input has been GA since 7.16, and logfile was deprecated at the same time. Over the last few releases we have added numerous bug fixes to the new input, and we are now working on enhancements. It is stable enough for adoption in Integrations.

It comes with several improvements over the old input: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-filestream.html#filebeat-input-filestream

Differences

There are several differences in the configuration of the inputs:

How to migrate integrations?

Some of the changes might be automated, for example renaming close_removed to close.on_state_change.removed. Other options require manual checking and adjustment, e.g. the parsing of lines. There are also new options, such as include_files, the counterpart of exclude_files. These should be reviewed to see whether existing configurations could be improved.
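A minimal sketch of what the automated part could look like. Only the close_removed rename comes from this issue; the exclude_files entry, the function name migrate_options, and the idea of returning a "needs manual review" list are assumptions for illustration and should be checked against the filestream documentation.

```python
# Hypothetical rename table. Only close_removed is taken from the issue text;
# the exclude_files mapping is an assumption to verify against the filestream docs.
RENAMES = {
    "close_removed": "close.on_state_change.removed",
    "exclude_files": "prospector.scanner.exclude_files",
}

# Options assumed to carry over unchanged (illustrative, not exhaustive).
UNCHANGED = {"paths"}


def migrate_options(log_config):
    """Return (migrated_config, keys_needing_manual_review).

    Renames the options listed in RENAMES, switches the input type to
    filestream, and collects every other key so a human can check it
    (for example line parsing options).
    """
    migrated = {}
    needs_review = []
    for key, value in log_config.items():
        if key == "type":
            migrated["type"] = "filestream"
        elif key in RENAMES:
            migrated[RENAMES[key]] = value
        elif key in UNCHANGED:
            migrated[key] = value
        else:
            # Not migrated automatically: left out and flagged for review.
            needs_review.append(key)
    return migrated, needs_review


if __name__ == "__main__":
    old = {
        "type": "log",
        "paths": ["/var/log/apache2/access.log*"],
        "close_removed": True,
        "multiline": {"pattern": "^\\["},
    }
    new, review = migrate_options(old)
    print(new)
    print("manual review:", review)
```

Anything that lands in the review list (here, multiline) is deliberately left out of the migrated config rather than copied across, since the issue notes that such options need manual adjustment.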

General steps for migrating a package

How to migrate on Filebeat side?

If someone has been using, for example, the Apache integration and updates to the new version, the input change must not be visible to them. Upgrading the package must not cause the monitored files to be read again from the beginning. State information from the log input has to be passed to the filestream input so that it can continue where the log input left off. Given that position tracking is similar in both inputs, changing the state ID from log::{id}::{device}-{inode} to filestream::{id}::{device}-{inode} should work.
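The ID rewrite described above can be sketched as a simple prefix swap. This assumes state IDs are plain strings in exactly the format quoted in this issue; the function name migrate_state_id is hypothetical, and how the IDs are stored and updated in the Filebeat registry is not covered here.

```python
LOG_PREFIX = "log::"
FILESTREAM_PREFIX = "filestream::"


def migrate_state_id(state_id):
    """Rewrite log::{id}::{device}-{inode} to filestream::{id}::{device}-{inode}.

    IDs that do not start with the log prefix are returned unchanged, so the
    function is safe to run over a mixed set of entries.
    """
    if not state_id.startswith(LOG_PREFIX):
        return state_id
    return FILESTREAM_PREFIX + state_id[len(LOG_PREFIX):]


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(migrate_state_id("log::apache-access::2049-131"))
```

Only the prefix changes; the {id}, {device}, and {inode} parts are kept as they are, which is what lets the new input pick up at the offset recorded by the old one.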

Tasks

Beats side migration

Packages

This is the list of packages that use logfile input to collect data.

elasticmachine commented 2 years ago

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

cmacknz commented 2 years ago

I've updated the list of integrations using the logfile input type based on usage as of today.

botelastic[bot] commented 1 year ago

Hi! We just realized that we haven't looked into this issue in a while. We're sorry! We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1. Thank you for your contribution!

mbudge commented 1 year ago

Please make sure this is available in the custom logs integration

scan_frequency, ignore_older, close_inactive, harvester_limit, prospector.scanner.include_files