We're experiencing the same issue since upgrading to 8.14.x. I've narrowed it down to the problem being introduced in 8.14.0; the duplicates did not happen on 8.13.4. I suspect #36761 and #38488 to be the cause, but I haven't been able to confirm it yet.

The duplicates happen for us when a logfile isn't updated within the `close_inactive` timeout. By using the settings from "Inode reuse causes Filebeat to skip lines" we could work around the issue, and it's no longer happening, even with 8.14.x.

Our default config, which causes duplicates on 8.14.x:
```yaml
---
- type: filestream
  id: cassandra_system
  paths:
    - /var/log/cassandra/system.log
  encoding: plain
  document_type: log
  prospector:
    scanner:
      check_interval: 10s
  harvester_buffer_size: 16384
  message_max_bytes: 10485760
  parsers:
    - multiline:
        pattern: '^[[:space:]]*[A-Z]+[[:space:]]+\['
        negate: true
        match: after
  tail_files: false
  backoff.init: 1s
  backoff.max: 10s
  backoff_factor: 2
  close_inactive: 5m
  close_renamed: false
  close_removed: true
  close_eof: false
  clean_inactive: 0
  clean_removed: true
  close_timeout: 0
  fields:
    type: cassandra
  fields_under_root: true
```
New config working with 8.14.x:
```yaml
---
- type: filestream
  id: cassandra_system
  paths:
    - /var/log/cassandra/system.log
  encoding: plain
  ignore_older: 48h
  document_type: log
  prospector:
    scanner:
      check_interval: 10s
  harvester_buffer_size: 16384
  message_max_bytes: 10485760
  parsers:
    - multiline:
        pattern: '^[[:space:]]*[A-Z]+[[:space:]]+\['
        negate: true
        match: after
  tail_files: false
  backoff.init: 1s
  backoff.max: 10s
  backoff_factor: 2
  close_inactive: 10m
  close_renamed: false
  close_removed: true
  close_eof: false
  clean_inactive: 72h
  clean_removed: true
  close_timeout: 0
  fields:
    type: cassandra
  fields_under_root: true
```
So the differences between the two configs are:

- `ignore_older: 48h` (vs. not set)
- `close_inactive: 10m` (vs. 5m)
- `clean_inactive: 72h` (vs. 0)

Most of these settings are default values coming from pcfens/filebeat, and after playing around with them, only `clean_inactive` seems to affect the issue. If I set it to anything other than 0, there are no duplicates. If `clean_inactive: 0` is set, the duplicates start appearing.
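Distilled from the two configs above, the workaround boils down to these three settings. A minimal sketch (values copied from the working config; adjust them to your own rotation and retention schedule):

```yaml
# Sketch of the three settings that differed between the broken and working configs
# (values taken from the working config above, not a recommendation for every setup):
ignore_older: 48h     # previously not set
close_inactive: 10m   # previously 5m
clean_inactive: 72h   # previously 0 -- anything other than 0 avoided the duplicates here
```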
Some more investigation revealed that setting `clean_inactive: 0` in 8.14.x has the effect of nulling out the cursor and meta fields in the registry file (/var/lib/filebeat/filebeat/log.json):

`clean_inactive: 0` in 8.13.x:

```json
{"op":"set","id":65254}
{"k":"filestream::cassandra_system::native::262312-2051","v":{"ttl":1800000000000,"updated":[516033235298,1726832301],"cursor":{"offset":2052210},"meta":{"source":"/var/log/cassandra/system.log","identifier_name":"native"}}}
{"op":"set","id":65255}
{"k":"filestream::cassandra_system::native::262312-2051","v":{"ttl":1800000000000,"updated":[516035577754,1726832336],"cursor":{"offset":2052383},"meta":{"source":"/var/log/cassandra/system.log","identifier_name":"native"}}}
```
`clean_inactive: 0` in 8.14.x:

```json
{"op":"set","id":65260}
{"k":"filestream::cassandra_system::native::262312-2051","v":{"cursor":null,"meta":null,"ttl":0,"updated":[515468346993,1726834065]}}
{"op":"set","id":65261}
{"k":"filestream::cassandra_system::native::262312-2051","v":{"ttl":0,"updated":[515468400457,1726834965],"cursor":null,"meta":null}}
```
Shouldn't `cursor.offset` (and `meta`) have a value, even if `clean_inactive` is zero?
@rlueckl 🙏 we will try this as well. Has anyone considered moving to fingerprint mode to avoid these inode issues? And is the performance hit mentioned in https://www.elastic.co/blog/introducing-filestream-fingerprint-mode still valid?
Thanks for the link, I didn't know about fingerprint mode.
In our particular case there are no performance issues to consider, since we're only collecting one file per node (/var/log/cassandra/system.log) and the `<inode>-<device_id>` looks stable from what I can tell from log.json. There's a new inode every day when the file gets rotated and the device_id is unchanged, so the `<inode>-<device_id>` stays the same during the day and is unique every day.
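If we do end up switching, my understanding from the blog post linked above is that a fingerprint-based identity for this single file would look roughly like the sketch below (untested on our side; the `file_identity.fingerprint` and `prospector.scanner.fingerprint` options are the ones described in that post):

```yaml
# Hedged sketch of a fingerprint-based file identity for the cassandra input
# (not tested here; option names as described in the fingerprint blog post):
- type: filestream
  id: cassandra_system
  paths:
    - /var/log/cassandra/system.log
  file_identity.fingerprint: ~
  prospector.scanner:
    fingerprint:
      enabled: true
      offset: 0       # start of the byte range that gets hashed
      length: 1024    # number of bytes hashed to build the identity
```

As far as I understand, with a 1024-byte fingerprint a file is only picked up once it has grown past that size, which should not be a problem for this log.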
Is this issue confirmed by the project as well?
I'm seeing a similar issue with duplicate log entries as well since moving to 8.14.3. I did not have this issue in 8.12.2. I run on Amazon Linux 2.

I have the config below with fingerprint configured, as we previously had issues using filestream with Java Spring Boot logs and inode reuse.
```yaml
- type: filestream
  id: json-logs
  tags: ["filebeat", "json"]
  encoding: utf-8
  ignore_older: 30m
  paths:
    - "/var/opt/logs/*/*-json.log*"
  file_identity.fingerprint: ~
  prospector.scanner:
    exclude_files: ['\.gz$']
    fingerprint:
      enabled: true
      offset: 0
      length: 1024
```
This should be fixed in 8.15.0 according to https://github.com/elastic/beats/pull/40258
We upgraded from 8.12.2 first to 8.14.1 and then to 8.14.3, but on both 8.14.1 and 8.14.3 we have issues with Filebeat that we think occur in the filestream reader. There are no duplications on 8.12.2 if we revert the version and run the same load test.

Is there anything obvious that changed in the configuration between these versions that I have missed, or is there something we should modify in our configuration?