grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

EventLogMessage generating spurious warnings when parsing Auditing log entries #9648

Open kelnage opened 1 year ago

kelnage commented 1 year ago

@wz2b raised an issue they identified when using the EventLogMessage parser on logs generated by the Cryptographic operations event log. The parser is generating warnings for each log entry, as the title of the event does not include a colon.

Based on my testing, the remainder of the message data should be parsed correctly - but the first line (Cryptographic operation.) is being dropped (due to this condition) and the warning may be causing issues.

@wz2b, have I described the issue you are encountering correctly? If so, what do you think the preferred behaviour should be in this case? Two potential (not mutually exclusive) solutions I can envisage would be:

wz2b commented 1 year ago

That is what I believe is the case. May I propose another solution, though, that might work better? If I understand your suggestion, you would throw away the title, but it would be better to keep it. I like the way telegraf does this:

  # Process EventData XML to fields, if this node exists in Event XML
  # process_eventdata = true

  ## Get only first line of Message field. For most events first line is
  ## usually more than enough
  # only_first_line_of_message = true

This gives you the option of keeping the first line (what you called the "title") as the message, and getting the rest of the information from event_data. Near as I can tell event_data contains the same info that's in message but as lines of xml so it's more parsable. It doesn't need a full XML parser, the format looks simple - it is one line but I added newlines in my example for clarity:

<Data Name='NewProcessId'>0x2548</Data>
<Data Name='NewProcessName'>C:\\promtail\\promtail-windows-amd64.exe</Data>
<Data Name='TokenElevationType'>%%1937</Data>
...

I already updated my fork of loki; I would be willing to fix this if you are willing to guide me a little (I have never contributed to loki or promtail) - if you want me to, and you don't have time. My change would be to give the ability to parse only the title out of Message, and to parse the remaining fields out of event_data.
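
To make the event_data idea above concrete, here is a minimal Go sketch. It assumes event_data is a flat run of <Data Name='...'>value</Data> elements, as in the example. The names parseEventData and eventData are hypothetical and not part of Loki or promtail, and the sketch uses Go's standard encoding/xml package for simplicity, even though a lighter approach might also work.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// eventData is a hypothetical type for this sketch; it mirrors a flat
// list of <Data Name='...'>value</Data> elements.
type eventData struct {
	Data []struct {
		Name  string `xml:"Name,attr"`
		Value string `xml:",chardata"`
	} `xml:"Data"`
}

// parseEventData is a hypothetical helper, not an existing Loki or
// promtail function. It returns the named Data elements as a map.
func parseEventData(raw string) (map[string]string, error) {
	// The elements are siblings with no common parent, so wrap them in
	// a synthetic root to get a well-formed document.
	var ed eventData
	if err := xml.Unmarshal([]byte("<EventData>"+raw+"</EventData>"), &ed); err != nil {
		return nil, err
	}
	fields := make(map[string]string, len(ed.Data))
	for _, d := range ed.Data {
		if d.Name == "" {
			continue // unnamed Data elements are skipped in this sketch
		}
		fields[d.Name] = d.Value
	}
	return fields, nil
}

func main() {
	raw := `<Data Name='NewProcessId'>0x2548</Data>` +
		`<Data Name='TokenElevationType'>%%1937</Data>`
	fields, err := parseEventData(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for name, value := range fields {
		fmt.Printf("%s=%s\n", name, value)
	}
}
```

Each resulting key could then be attached to the log entry as an extracted field. How name collisions with existing labels should be handled is left open here.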

kelnage commented 1 year ago

I'd definitely be interested in seeing your approach @wz2b (I tried looking at your loki fork but I wasn't able to identify the relevant branch sadly). I have thoughts on how to do what you're asking for (I believe what you're suggesting aligns nicely with my first proposed solution, probably also incorporating the second one too) - but if you've already gone ahead and done it, it might be easier to take it as is!

wz2b commented 1 year ago

I didn't create the branch yet, I was just planning out in my head how I would do it. But I can get started. I'm on loki-dev on slack if you want to chat about it, meanwhile I'm happy to take a stab at it myself if you're short on time.

kelnage commented 1 year ago

Oh, no worries - I misunderstood your comment about "I already updated my fork of loki"! :smile: Not a problem on my side - let me have a stab at it today and we can see if it meets your use case better.

kelnage commented 1 year ago

I've taken a stab at achieving functionality similar to what you described in telegraf, adding a first_line_only argument, which extracts the first line into the fields with a key of Description. You can see my changes and unit tests here - any thoughts? I was thinking it would make sense to still want to extract the first line regardless going forward, so if this looks good so far, I'll work on that next.

wz2b commented 1 year ago

Extracting that to a field Description sounds good if there's not another one named Description. I think that's a good start. The parsing of event_data (the lines of XML) is probably more important to me though. One of the things I'm trying to do this for is filesystem auditing, and in an Audit event the object (file) path is one of those fields :)

I am off site today but can check on all of this tonight.

kelnage commented 1 year ago

I have just updated my implementation to always pull the first line (if it isn't a valid key-value) into the Description field (or Description_extracted, if Description is already set), regardless of whether the first_line_only flag is set.
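
The behaviour described above can be sketched roughly as follows. This is an illustrative Go sketch under stated assumptions, not the actual Loki implementation: extractFields is a hypothetical name, a line counts as a key-value pair simply if it contains a colon, and the key sanitisation and first_line_only handling of the real parser are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// extractFields is a hypothetical helper for illustration only. It
// copies the existing fields, adds "key: value" lines from the message,
// and stores a first line without a colon under Description, or under
// Description_extracted if Description is already taken.
func extractFields(message string, existing map[string]string) map[string]string {
	fields := make(map[string]string, len(existing))
	for k, v := range existing {
		fields[k] = v
	}

	var title string
	lines := strings.Split(strings.ReplaceAll(message, "\r\n", "\n"), "\n")
	for i, line := range lines {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		key, value, ok := strings.Cut(line, ":")
		if ok {
			fields[strings.TrimSpace(key)] = strings.TrimSpace(value)
			continue
		}
		if i == 0 {
			title = line // keep the title instead of dropping it
		}
		// Later lines without a colon are ignored in this sketch.
	}

	if title != "" {
		target := "Description"
		if _, taken := fields[target]; taken {
			target = "Description_extracted"
		}
		fields[target] = title
	}
	return fields
}

func main() {
	msg := "Cryptographic operation.\r\nOperation: Open Key.\r\nReturn Code: 0x0"
	for k, v := range extractFields(msg, nil) {
		fmt.Printf("%s=%q\n", k, v)
	}
}
```

Deciding where the title goes only after the key-value lines are processed means a Description key inside the message itself also pushes the title to Description_extracted.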

With regard to the event_data XML requirement: right now, this parser extracts all its fields from the message field - and in all the examples of Windows event log data that I've seen, the content of that message field aligns exactly with the content of the event_data field (the only difference being the format the data appears in). Do you have any examples where that isn't true?

Parsing the contents of event_data is of course very doable; it would probably sit better in the windows_events scraper than in the eventlogmessage parser - but I wanted to understand better what was motivating this particular need.