Open adriansr opened 2 years ago
Pinging @elastic/security-external-integrations (Team:Security-External Integrations)
Keep in mind that auditd records can be interleaved ~and out of order~. auditd's auparse library has a good description of the behavior at https://github.com/linux-audit/audit-userspace/blob/c19119fe4bf25d3e755196cfef908f4c160dd7a7/auparse/internal.h#L36-L84.
You could use Filebeat multiline as it exists today (e.g. trigger the start of a group with `type=SYSCALL`), but there are edge cases that multiline cannot handle properly, which would result in incomplete groupings or groupings where the sequence numbers are not all the same. The latter could be mitigated in post-processing by rejecting records that don't match the sequence, but that means dropping some messages.
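As a rough sketch of that approach, a Filebeat input could be configured along these lines (the path and exact option names are assumptions based on the standard multiline settings; this only groups lines, it does not validate sequence numbers):

```yaml
# Sketch of a filebeat.yml input using today's multiline support.
# Assumes every event group starts with a SYSCALL record; groups whose
# records span different sequence numbers would still need post-processing.
filebeat.inputs:
  - type: log
    paths:
      - /var/log/audit/audit.log
    multiline:
      pattern: '^type=SYSCALL '
      negate: true
      match: after
```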
@andrewkroh the linked docs seem to contradict the claim that records "can be [...] out of order":

> The auditd system does guarantee that the records that make up an event will appear in order
My idea was that if the following conditions are met:
... then events can be coalesced with a Beats script processor that observes the source log line by line.
Anyway, I have analyzed the module's test logs: all of them are correctly ordered and I don't observe any interleaving. I understand they are already ordered by auditd or the auparse library? Am I missing something?
> It can be determined that an event is complete when the last record for that event is observed (no need for timeouts/expiration).
If we can reliably detect the last record in a sequence, then that could be used to trigger when to flush the event. One of the end-of-sequence markers mentioned is AUDIT_EOE, but the audit.log that Filebeat consumes doesn't actually contain this record (source ref). Without that EOE message, I think this will be a challenge to accomplish without timeouts.
And as you know, Beats processors are only driven by incoming events, so flushing on a timeout could only be triggered by the arrival of a new event. And since a processor can only output a single event, it would be a problem deciding whether to output the timed-out event or the newly received one...
I like the idea, but I think the edge cases cannot be overcome with the Beats processor model.
I think we should consider implementing the Filebeat parser interface using the reassembler code from go-libaudit; basically, seeing whether https://github.com/elastic/beats/issues/6484 can be implemented as a Filebeat parser. The benefit is that we could converge the parsing logic of the Filebeat auditd module, the Fleet auditd log integration, and Auditbeat.
Hi! We just realized that we haven't looked into this issue in a while. We're sorry!
We're labeling this issue as `Stale` to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!
This feature would be much appreciated.
Pinging @elastic/sec-linux-platform (Team:Security-Linux Platform)
Filebeat's auditd module consumes logs from the Linux auditd daemon. Unlike Auditbeat's auditd module, which consumes data directly from the kernel, Filebeat's module cannot correlate multiple log lines into a single event. As a result, some audit events are split across multiple documents, which makes it harder to craft queries over this data.
For example, a single `execve` event is composed of several related records. Ideally, these should result in a single document enriched with the different PATHs and process information.
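The original example records are not preserved in this thread, but an illustrative (hand-written, not captured) set of records for one `execve` event, all sharing the same sequence number, might look like:

```
type=SYSCALL msg=audit(1590688268.401:3901): arch=c000003e syscall=59 success=yes exit=0 ... comm="ls" exe="/usr/bin/ls"
type=EXECVE msg=audit(1590688268.401:3901): argc=2 a0="ls" a1="-l"
type=CWD msg=audit(1590688268.401:3901): cwd="/home/user"
type=PATH msg=audit(1590688268.401:3901): item=0 name="/usr/bin/ls" ...
type=PATH msg=audit(1590688268.401:3901): item=1 name="/lib64/ld-linux-x86-64.so.2" ...
type=PROCTITLE msg=audit(1590688268.401:3901): proctitle=6C73002D6C
```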
We should investigate a method to aggregate the different log lines and see if it's possible to update the ingest pipeline to parse the new multiline events.