Open mogrein opened 2 years ago
BTW, I would appreciate any idea why we get kill syscalls in audit. It turns out that systemd-journald-audit.socket wasn't stopped on the machines with this crash, but after masking it and rebooting we can still reproduce the issue on a test machine. There seem to be no other audit subscribers on the system.
[...] It turns out that systemd-journald-audit.socket wasn't stopped on the machines with this crash, but after masking it and rebooting we can still reproduce the issue on a test machine. There seem to be no other audit subscribers on the system.
Thanks for this report and the relative PR.
In the issue you talk about having tested with audit_allow_kill_process_events=1 and audit_allow_kill_process_events=1, but they are the same flag. What was the second flag?
BTW I would appreciate any idea why we get kill syscalls in audit.
If you're asking why it has been added to osquery, I would think that it's to see what might be killing critical processes?
Copy-paste mistake. I wanted to say audit_allow_process_events and audit_allow_kill_process_events.
I see. Just as a clarification: I can see the problem when a kill syscall arrives and the record that should follow it is missing, but I wanted to reproduce it and was wondering whether you were running with --audit_allow_config.
If so, and you use --audit_allow_kill_process_events=0, then osquery shouldn't install a kill syscall rule and audit shouldn't generate any events for it.
Is that rule manually installed?
What are the rules listed by sudo auditctl -l?
Something else I noticed (EDIT: that you actually mentioned in your PR!) is that the original PR did not actually expose the new columns needed for the kill syscall, namely https://github.com/osquery/osquery/blob/3795ab0785c067fd09164fab8ddbd3a0d73c256c/osquery/tables/events/linux/process_events.cpp#L234-L239
Those columns aren't in the table spec: https://github.com/osquery/osquery/blob/master/specs/posix/process_events.table
Bug report
What operating system and version are you using?
version = 16.04.6 LTS (Xenial Xerus) build = platform = ubuntu
What version of osquery are you using?
version = 4.8.0.0-yandex
But the crash is reproducible up to 5.0.1.
What steps did you take to reproduce the issue?
Running osquery with --audit_allow_process_events=1 and --audit_allow_kill_process_events=0 and parsing a kill syscall in the audit log.
What did you expect to see?
Osquery works
What did you see instead?
It crashes with the following stacktrace SIGSEGV
Further investigation of crash dumps, and of osquery running with --audit_debug=1, showed us that for some reason osquery gets a syscall=62 event from audit without the following AUDIT_OBJ_PID record.
Here is a snippet with --audit_allow_kill_process_events=1 and --audit_allow_kill_process_events=1
As you can see, when audit_allow_kill_process_events is enabled, syscall=62 is followed by a 1318 record.
But when osquery is launched with --audit_allow_process_events=1 and --audit_allow_kill_process_events=0 we get the following log
As you can see, the related 1318 record is absent from the log. The process events subscriber doesn't handle the case where AUDIT_OBJ_PID is absent, and osquery crashes trying to dereference null. https://github.com/osquery/osquery/blob/3795ab0785c067fd09164fab8ddbd3a0d73c256c/osquery/tables/events/linux/process_events.cpp#L232
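To illustrate the kind of guard that seems to be missing, here is a minimal, self-contained sketch. It is not osquery's code: `AuditRecord`, `KillRow`, `findRecord` and `buildKillRow` are hypothetical names made up for this example. Only the record type numbers (1300 for AUDIT_SYSCALL, 1318 for AUDIT_OBJ_PID) and the `opid` field name come from the Linux audit subsystem. The idea is to treat a kill event without its AUDIT_OBJ_PID record as incomplete and skip it, instead of dereferencing a null pointer.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Record type numbers as defined in linux/audit.h.
constexpr int kAuditSyscall = 1300; // AUDIT_SYSCALL
constexpr int kAuditObjPid = 1318;  // AUDIT_OBJ_PID

// Hypothetical, simplified stand-in for one parsed audit record.
struct AuditRecord {
  int type;
  std::map<std::string, std::string> fields;
};

// Hypothetical output row for a kill syscall event.
struct KillRow {
  std::string syscall;
  std::string target_pid;
};

// Returns the first record of the given type, or nullptr if the event
// does not contain one. Callers must check the result before using it.
const AuditRecord* findRecord(const std::vector<AuditRecord>& event,
                              int type) {
  for (const auto& record : event) {
    if (record.type == type) {
      return &record;
    }
  }
  return nullptr;
}

// Builds a row only when every record the row depends on is present.
std::optional<KillRow> buildKillRow(const std::vector<AuditRecord>& event) {
  const AuditRecord* syscall_record = findRecord(event, kAuditSyscall);
  if (syscall_record == nullptr) {
    return std::nullopt;
  }

  const AuditRecord* obj_pid_record = findRecord(event, kAuditObjPid);
  if (obj_pid_record == nullptr) {
    // The 1318 record did not arrive: skip the event rather than crash.
    return std::nullopt;
  }

  auto opid = obj_pid_record->fields.find("opid");
  if (opid == obj_pid_record->fields.end()) {
    return std::nullopt;
  }
  return KillRow{"kill", opid->second};
}
```

With this shape, an event holding both a 1300 and a 1318 record yields a row, while an event holding only the 1300 record yields an empty optional that the caller can log and drop. Whether osquery should drop such events or emit a row with an empty target is a design decision for the maintainers.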