Open SolitudePy opened 10 months ago
Without sample data, there is not much I can do.
What version of Laurel are you using? Did laurel log anything unusual to syslog?
@SolitudePy Can you provide data or instructions on how to reproduce the issue?
@SolitudePy Incidentally, I stumbled upon a bug today that affected EXECVE events for very long command lines (> 2^16 arguments). (This has been fixed in d89c80cbcd12d88278ba1f99e4ad87358d55422e.) Does this look like the symptom you observed?
@hillu Hello, we are using Laurel v0.5.3, and I did not see anything peculiar logged by Laurel. The command line was certainly not that long. Also, from what I observed, the EXECVE field was dropped entirely from the Laurel log even though SYSCALL.syscall is indeed execve. I am not sure how you could reproduce that yourself, but you could try ingesting a lot of logs into a solution like Splunk and then searching for events where SYSCALL.syscall equals execve but EXECVE is null.
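A search like the one described above can also be run directly against Laurel's output, without a log platform. The following is a minimal sketch, assuming one JSON object per line and the key names used in this thread (`ID`, `SYSCALL.syscall`, `EXECVE`). Depending on the configuration, the syscall may be logged as a number or under a different key, so the comparison may need adjusting. The function name is made up for illustration.

```python
import json

def find_execve_without_record(lines):
    """Return IDs of events whose SYSCALL is execve but which have no EXECVE key."""
    missing = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or otherwise unparsable lines
        if not isinstance(event, dict):
            continue
        syscall = event.get("SYSCALL")
        if not isinstance(syscall, dict):
            continue
        # Assumption: the name "execve" as used in this thread;
        # 59 is the execve syscall number on x86_64.
        if syscall.get("syscall") in ("execve", 59) and "EXECVE" not in event:
            missing.append(event.get("ID"))
    return missing
```

Passing an open file object for one of the log files under /var/log/laurel as `lines` yields the IDs that can then be looked up in the original audit.log.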
@SolitudePy Does Laurel or auditd log anything strange or meaningful around the time where you are missing data in the log?
Yes, I forgot to mention it, but we checked on multiple servers and it seems the correlated event came from auditd: dispatch err (pipe full) event lost
dispatch err (pipe full) event lost
This basically means that auditd (or audispd if you are using auditd < 3.0) is trying to write lines faster than Laurel consumes them.
The file descriptor that gets passed to Laurel as STDIN is actually one end of an AF_LOCAL socket, so there's an associated buffer whose size can be increased (SO_SNDBUF). IIRC, there's no setting in auditd, though.
Reducing the number of events generated using a tweaked audit ruleset should help.
@hillu Yes, I thought so. It's quite surprising that a flood of events causes the dispatcher to miss full EXECVE lines and therefore has Laurel miss them. Also, as I stated before, our ruleset is quite basic and we planned to make it more verbose. It would be sad if Laurel could not handle that, since the original audit.log does log all of the events :\
I'm sorry; as far as I know there isn't anything Laurel can do here until we put reading from input into a separate thread.
If we do the equivalent of a setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &newsize, sizeof(newsize)) on Laurel's stdin, this should change the size of the wrong buffer. According to unix(7)…
The SO_SNDBUF socket option does have an effect for UNIX domain sockets, but the SO_RCVBUF option does not.
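As a rough illustration of why the buffer has to be enlarged on the sending side, the sketch below creates a connected AF_UNIX socket pair as a stand-in for the auditd-to-Laurel connection and requests a larger SO_SNDBUF on the sending end. This is not auditd's or Laurel's code; the helper name and the requested size are assumptions for the example. On Linux the kernel may clamp the request (see net.core.wmem_max) and reports a doubled value, so no specific number should be expected.

```python
import socket

def grow_send_buffer(sock, new_size):
    """Request a larger send buffer; return (before, after) as reported by the kernel."""
    before = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, new_size)
    after = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return before, after

# Stand-in for the auditd -> Laurel connection: a connected AF_UNIX pair.
sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
before, after = grow_send_buffer(sender, 1 << 20)
print(f"SO_SNDBUF on the sending end: {before} -> {after}")
sender.close()
receiver.close()
```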
Do you think you might be able to run a patched version of auditd?
No, I'm sorry. Are you saying there can't be a fix in Laurel? Also, if that speculation is correct, I should see more events per second during that gap than on servers that do not have this bug, right?
Are you saying there can't be a fix in Laurel?
Not quite. The communication between auditd and laurel is buffered – and the cause of lines getting lost is most likely intermittent bursts of lines and overflowing that buffer before Laurel can catch up. The natural solution would be increasing the size of that buffer, but that can only be done on the sending side, i.e. not by Laurel.
Another solution would be to switch input handling on Laurel's side to a separate thread. I am open to pursuing this path, but this won't be done by the end of the week and I'd need to rely on you to test stuff for me.
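The idea of moving input handling to a separate thread can be sketched as follows. This is only an illustration of the general technique, not Laurel's implementation: a background thread drains the input as fast as it can into a bounded in-process queue, so that slow processing does not leave data sitting in the socket buffer. The function names and the queue size are assumptions.

```python
import queue
import threading

def start_reader(stream, max_lines=100_000):
    """Drain `stream` in a background thread into a bounded in-process queue."""
    lines = queue.Queue(maxsize=max_lines)

    def pump():
        for line in stream:
            lines.put(line)  # blocks only when the in-process queue is full
        lines.put(None)      # sentinel: input is exhausted

    threading.Thread(target=pump, daemon=True).start()
    return lines

def consume(lines, handle):
    """Process queued lines until the sentinel arrives."""
    while (line := lines.get()) is not None:
        handle(line)
```

In a real consumer, `stream` would be standard input and `handle` the parsing and writing step. The queue only moves the buffering into user space, so sustained overload would still fill it eventually.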
We don't observe this problem frequently enough to consider it a major problem.
Can you give me ballpark numbers about the number of events (unique message IDs) per second? What kind of hardware are you running on?
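One way to get such ballpark numbers is to count unique event serials per second in the raw audit log. The sketch below assumes the usual record header of the form `msg=audit(1364481363.243:24287):` and groups by the whole-second part of the timestamp; the function name is made up for illustration.

```python
import re
from collections import defaultdict

# Matches the header of an audit record, e.g. "msg=audit(1364481363.243:24287):"
AUDIT_ID = re.compile(r"msg=audit\((\d+)\.\d+:(\d+)\)")

def events_per_second(lines):
    """Map each epoch second to the number of unique event serials seen in it."""
    seen = defaultdict(set)
    for line in lines:
        m = AUDIT_ID.search(line)
        if m:
            seen[int(m.group(1))].add(m.group(2))
    return {second: len(ids) for second, ids in seen.items()}
```

Taking `max(...)` over the returned values gives the peak rate, which is the figure most relevant for burst-related buffer overflows.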
No, I'm sorry. Are you saying there can't be a fix in Laurel? Also, if that speculation is correct, I should see more events per second during that gap than on servers that do not have this bug, right?
Yes, pretty much. Another explanation would be that something is slowing down Laurel in processing or writing its log files considerably.
@hillu We are also seeing SELinux messages about Laurel trying to get RPM info for many random files, for example. It doesn't seem to affect Laurel, though... I will be able to give you the exact numbers next week.
We are also seeing SELinux messages about Laurel trying to get RPM info for many random files
Those are AVC messages, right? It would be really helpful if you could post some of those.
yes they appear in avc and also selinux troubleshoot, I will post them next week
Hello @hillu, we looked into changing q_depth for audispd (RHEL 7), and it might fix the pipe full error, but afterwards we still encountered Laurel logs where SYSCALL.syscall = execve and the EXECVE record does not exist. About SELinux: we logged a lot of errors in permissive mode; some of them were:
denied write
denied unlink
denied sys_ptrace for /proc/
In general, it seems laurel is working only if its selinux type is permissive.
Oh… are you not using the SELinux policy from contrib/selinux?
Regarding q_depth and other settings … I think that I found a way to add an I/O thread that may fix the problem, but I'd need somebody to test that before releasing it. Could you do that?
I'll get back to you with an answer. Regarding q_depth, doesn't it fix the buffer size you mentioned before?
I am using the SELinux policy from the git repository; the permissive type is included there with a comment saying to remove it only if there are no AVCs.
Regarding q_depth, doesn't it fix the buffer size you mentioned before?
Apparently, q_depth means that messages are buffered in user-space. Yes, this should help!
Hi, while doing our work we noticed what is probably a minor bug in Laurel: for some events it generates JSON without the EXECVE/PROCTITLE key. We checked /var/log/audit and filtered based on msg, and we saw multiple records for the execve event (SYSCALL, EXECVE, PROCTITLE, CWD, etc.). Then we checked the matching Laurel JSON event (/var/log/laurel, based on ID) and it only had a SYSCALL key, missing the EXECVE key. We checked and it happens on multiple servers, without any correlation to event size or heavy buffering. Our current auditd configuration is not verbose for the other syscall types, so we only encountered this for EXECVE. I can't provide a sample of the events that show this bug. I would appreciate any help, and I will help as much as I can. Thanks!
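The manual comparison described above can be automated to find affected events without sharing their contents. The sketch below is a hedged example: it assumes audit.log records start with `type=... msg=audit(<timestamp>:<serial>)` and that the `ID` value in Laurel's JSON is the same `<timestamp>:<serial>` string. All function names are made up for illustration.

```python
import json
import re

AUDIT_REC = re.compile(r"type=(\w+) msg=audit\((\d+\.\d+:\d+)\)")

def execve_ids_from_audit(lines):
    """IDs (timestamp:serial) of events that have an EXECVE record in audit.log."""
    ids = set()
    for line in lines:
        m = AUDIT_REC.search(line)
        if m and m.group(1) == "EXECVE":
            ids.add(m.group(2))
    return ids

def execve_ids_from_laurel(lines):
    """IDs of Laurel JSON events that carry an EXECVE key."""
    ids = set()
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip unparsable lines
        if isinstance(event, dict) and "EXECVE" in event:
            ids.add(event.get("ID"))
    return ids

def lost_execve(audit_lines, laurel_lines):
    """Event IDs with EXECVE in audit.log but no EXECVE key in Laurel's output."""
    return sorted(execve_ids_from_audit(audit_lines) - execve_ids_from_laurel(laurel_lines))
```

Both inputs should cover the same time window; otherwise events that were rotated out of one log will show up as false positives.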