grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

Unable to parse Windows Server 2022 because of extra line #9643

Open wz2b opened 1 year ago

wz2b commented 1 year ago

This problem is in the eventlogmessage pipeline stage, on the current release version, v2.8.2. The failure occurs here:

https://github.com/grafana/loki/blob/f43dd58459239057cf8ca0dbe53b5e7ef89c7ae9/clients/pkg/logentry/stages/eventlogmessage.go#LL105C3-L105C40

The 'Message' field is not formatted exactly the way this pipeline stage expects, because it begins with a line that is not a key-value pair (KVP). Here is an example of a 'message' field:

The state of a transaction has changed.

Subject:
    Security ID:        SYSTEM
    Account Name:       GIS-XXXXXXXX$
    Account Domain:     AD
    Logon ID:       0x3E7

Transaction Information:
    RM Transaction ID:  {30ca6707-0477-11ee-8a9e-xxxxxxxxxx}
    New State:      48
    Resource Manager:   {ff849118-b891-11ed-b4c1-xxxxxxxxxx}

Process Information:
    Process ID:     0xb54
    Process Name:       C:\Program Files\xxxx

It's possible that I'm feeding the data in the wrong way. My scrape config is:

- job_name: windows-security
  windows_events:
    use_incoming_timestamp: true
    bookmark_path: "/promtail/bookmark-security.xml"
    eventlog_name: "Security"
    xpath_query: '*'
    labels:
      job: windows
      log: security
  relabel_configs:
    - source_labels: ['computer']
      target_label: 'host'

  pipeline_stages:
  - json:
      expressions:
        message:
        level: levelText
  - eventlogmessage:
      source: message
      overwrite_existing: true

My guess is that the format differs between Windows versions. A better thing to parse might be event_data, which is XML-like. It doesn't contain a complete XML document; it's more like JSONL, a sequence of separate small XML snippets:

<Data Name='SubjectUserSid'\u003eS-1-5-18</Data>
<Data Name='SubjectUserName'>GIS-XXXX$</Data>
<Data Name='SubjectDomainName'>AD</Data>
...

I pasted that as if it had a \n or \r\n between lines, but it does not. Promtail is also doing some kind of Unicode substitution on this content (note the \u003e in the first snippet), which is another problem. Still, parsing those snippets seems more promising than trying to parse the message field.

Any thoughts on all this?

wz2b commented 1 year ago

This is what we need: https://github.com/influxdata/telegraf/blob/6d1da80ebb543b039f7572d4c539a37b535685cd/plugins/inputs/win_eventlog/util.go#L92

I think some of it is actually there but commented out: https://github.com/grafana/loki/blob/f43dd58459239057cf8ca0dbe53b5e7ef89c7ae9/clients/pkg/promtail/targets/windows/win_eventlog/win_eventlog.go#L274

wz2b commented 1 year ago

Should these changes go into https://github.com/grafana/agent/blob/main/component/loki/source/windowsevent instead of being made directly in loki/promtail?