grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

Promtail crashing when using custom windows event channels #12492

Open bloodmc opened 5 months ago

bloodmc commented 5 months ago

Describe the bug

Platform: Windows Server 2022
Promtail: 2.8.11

I'm using custom windows event channels from https://github.com/palantir/windows-event-forwarding/tree/master/windows-event-channels

Promtail randomly crashes while pushing logs, which seems to be related to the custom event channels I am using. The only way to get the Promtail service started again is to delete all the bookmark XML files. Note: custom event logs are successfully pushed into Loki at first, but over time Promtail stops working and crashes.
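The bookmark reset described above can be scripted. The following is a minimal sketch in Python, not part of Promtail: the function name `clear_bookmarks` is hypothetical, and it assumes the bookmark files follow the `bookmark_*.xml` naming used in the config in this report and that the Promtail service is stopped before the files are removed.

```python
from pathlib import Path


def clear_bookmarks(directory, pattern="bookmark_*.xml", dry_run=True):
    """Find Promtail bookmark files in `directory` and optionally delete them.

    Returns the sorted list of matching paths. With dry_run=True (the
    default) nothing is deleted, so the list can be reviewed first.
    """
    matches = sorted(Path(directory).glob(pattern))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


# Example call (directory taken from the bookmark_path entries in the config):
#   clear_bookmarks(r"C:\PROGRA~1\windows_exporter\promtail", dry_run=False)
```

Deleting the bookmarks discards Promtail's saved position in each event log, so this only gets the service running again; it does not address the crash itself.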

To Reproduce

Steps to reproduce the behavior:

  1. Install custom WEC following https://github.com/palantir/windows-event-forwarding/tree/master/windows-event-channels
  2. Update scrape_configs to use custom event channels.
  3. Start Promtail (2.8.11) to tail '...'
  4. Wait some time and promtail will crash.

Expected behavior

Promtail should not crash while pushing custom event logs to Loki.

Environment:

Screenshots, Promtail config, or terminal output

Here is the promtail.yml config

server:
  http_listen_port: 9080
  http_tls_config:
    cert_file: node_exporter.crt
    key_file: node_exporter.key
  grpc_listen_port: 0

positions:
  filename: C:\PROGRA~1\windows_exporter\promtail\positions.yaml

clients:
- url: https://loki.domain.com/loki/api/v1/push

scrape_configs:
- job_name: wec_account_lockout
  windows_events:
    use_incoming_timestamp: true
    exclude_event_data: false
    exclude_user_data: false
    bookmark_path: C:\PROGRA~1\windows_exporter\promtail\bookmark_wec_account_lockout.xml
    eventlog_name: "WEC3-Account-Management"
    labels:
      channel: WEC3-Account-Management
      os: windows
      job: eventlog
  relabel_configs:
    - source_labels: ['computer']
      target_label: 'host'
- job_name: wec_authentication
  windows_events:
    use_incoming_timestamp: true
    exclude_event_data: false
    exclude_user_data: false
    bookmark_path: C:\PROGRA~1\windows_exporter\promtail\bookmark_wec_authentication.xml
    eventlog_name: "WEC-Authentication"
    labels:
      channel: WEC-Authentication
      os: windows
      job: eventlog
  relabel_configs:
    - source_labels: ['computer']
      target_label: 'host'
- job_name: wec_powershell
  windows_events:
    use_incoming_timestamp: true
    exclude_event_data: false
    exclude_user_data: false
    bookmark_path: C:\PROGRA~1\windows_exporter\promtail\bookmark_wec_powershell.xml
    eventlog_name: "WEC-Powershell"
    labels:
      channel: WEC-Powershell
      os: windows
      job: eventlog
- job_name: wec_process_execution
  windows_events:
    use_incoming_timestamp: true
    exclude_event_data: false
    exclude_user_data: false
    bookmark_path: C:\PROGRA~1\windows_exporter\promtail\bookmark_wec_process_execution.xml
    eventlog_name: "WEC-Process-Execution"
    labels:
      channel: WEC-Process-Execution
      os: windows
      job: eventlog
  relabel_configs:
    - source_labels: ['computer']
      target_label: 'host'
- job_name: wec_registry
  windows_events:
    use_incoming_timestamp: true
    exclude_event_data: false
    exclude_user_data: false
    bookmark_path: C:\PROGRA~1\windows_exporter\promtail\bookmark_wec_registry.xml
    eventlog_name: "WEC2-Registry"
    labels:
      channel: WEC2-Registry
      os: windows
      job: eventlog
  relabel_configs:
    - source_labels: ['computer']
      target_label: 'host'
- job_name: winevent_app
  windows_events:
    use_incoming_timestamp: true
    exclude_event_data: false
    exclude_user_data: false
    bookmark_path: C:\PROGRA~1\windows_exporter\promtail\bookmark_app.xml
    eventlog_name: "Application"
    xpath_query: 'Event[System[(Level=1 or Level=2 or Level=3)]]'
    labels:
      os: windows
      job: eventlog
  relabel_configs:
    - source_labels: ['computer']
      target_label: 'host'
- job_name: winevent_sys
  windows_events:
    use_incoming_timestamp: true
    exclude_event_data: false
    exclude_user_data: false
    bookmark_path: C:\PROGRA~1\windows_exporter\promtail\bookmark_sys.xml
    eventlog_name: "System"
    xpath_query: 'Event[System[(Level=1 or Level=2 or Level=3)]]'
    labels:
      os: windows
      job: eventlog
  relabel_configs:
    - source_labels: ['computer']
      target_label: 'host'
- job_name: winevent_sec
  windows_events:
    use_incoming_timestamp: true
    exclude_event_data: false
    exclude_user_data: false
    bookmark_path: C:\PROGRA~1\windows_exporter\promtail\bookmark_sec.xml
    eventlog_name: "Security"
    # Collect "Audit Failure" only
    xpath_query: 'Event[System[Keywords="0x8010000000000000"]]'
    labels:
      os: windows
      job: eventlog
  relabel_configs:
    - source_labels: ['computer']
      target_label: 'host'
- job_name: winevent_taskscheduler
  windows_events:
    use_incoming_timestamp: true
    exclude_event_data: false
    exclude_user_data: false
    bookmark_path: C:\PROGRA~1\windows_exporter\promtail\bookmark_tasksched.xml
    eventlog_name: "Microsoft-Windows-TaskScheduler/Operational"
    xpath_query: 'Event[System[(Level=1 or Level=2 or Level=3)]]'
    labels:
      os: windows
      job: eventlog
- job_name: winevent_powershell
  windows_events:
    use_incoming_timestamp: true
    exclude_event_data: false
    exclude_user_data: false
    bookmark_path: C:\PROGRA~1\windows_exporter\promtail\bookmark_powershell.xml
    eventlog_name: "Microsoft-Windows-PowerShell/Operational"
    labels:
      os: windows
      job: eventlog

Here is the error when promtail crashes

https://gist.github.com/bloodmc/ac39021d6342da61a9f7f53acb93ed48

l-freund commented 4 months ago

Hello!

I can confirm that. We want to forward the logs from our Windows Server to a Windows logging server. From there, the logs should be sent to Loki using Promtail.

We have also used the custom WECs from https://github.com/palantir/windows-event-forwarding/tree/master/windows-event-channels, and Promtail crashes with the above-mentioned error message as soon as a forwarded event is read.

However, this happens regardless of the event channel. It doesn't matter whether the forwarded logs are written to the custom WECs, the default ForwardedEvents channel, or directly to the system log of the host. As soon as Promtail encounters an entry originating from a machine other than the host, it crashes.
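To check whether a given channel actually contains entries from other machines, the event XML can be inspected. The following is a rough diagnostic sketch in Python, not Promtail code: the helper names `event_computer` and `is_forwarded` are hypothetical, and it assumes each event's XML carries the usual `System/Computer` element in the standard Windows event namespace.

```python
import socket
import xml.etree.ElementTree as ET

# XML namespace used by Windows event records.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def event_computer(event_xml):
    """Return the text of System/Computer from one event's XML, or None."""
    root = ET.fromstring(event_xml)
    node = root.find("e:System/e:Computer", NS)
    return node.text if node is not None else None


def is_forwarded(event_xml, local_name=None):
    """True if the event names a computer other than the local host.

    Only the short host name is compared, case-insensitively, so
    "SRV01.domain.com" and "srv01" count as the same machine.
    """
    local = (local_name or socket.gethostname()).split(".")[0].lower()
    computer = event_computer(event_xml)
    if computer is None:
        return False
    return computer.split(".")[0].lower() != local
```

A single event's XML can be obtained with, for example, `wevtutil qe ForwardedEvents /c:1 /f:xml` and passed to `is_forwarded` as a string.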

l-freund commented 4 months ago

Switched to the new Grafana Alloy. It works with forwarded events.