fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0

Duplicate events get ingested in winevtlog input plugin for fluent-bit 3.0.2. #8747

Open Hardik-Parikh opened 6 months ago

Hardik-Parikh commented 6 months ago

Bug Report

Describe the bug

To Reproduce

  1. Configure a winevtlog input plugin.
  2. Make sure multiple channels are configured in the config.
  3. Also, set event_query to a query that filters events across multiple channels.
  4. For example:
    - name: winevtlog
      tag: some-tag
      alias: WIndows alias
      storage.type: filesystem
      channels: application,security,system
      interval_sec: 5
      read_existing_events: true
      db: C:\Program Files\checkpoint.db
      render_event_as_xml: true
      read_limit_per_cycle: 2m
      event_query: <QueryList><Query Id="0" Path="Application"><Select Path="Application">*[System[TimeCreated[@SystemTime&gt;='2024-04-22T07:30:22.000Z' and @SystemTime&lt;='2024-04-22T09:30:22.999Z']]]</Select><Select Path="Security">*[System[TimeCreated[@SystemTime&gt;='2024-04-22T07:30:22.000Z' and @SystemTime&lt;='2024-04-22T09:30:22.999Z']]]</Select><Select Path="System">*[System[TimeCreated[@SystemTime&gt;='2024-04-22T07:30:22.000Z' and @SystemTime&lt;='2024-04-22T09:30:22.999Z']]]</Select></Query></QueryList>

Expected behavior

  • There should be no duplication of events.
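
One way to confirm whether events are duplicated is to count how often each (Channel, EventRecordID) pair appears in the collected output. The following is a minimal sketch, assuming each event's XML string has already been extracted from the output and that it is rooted at the standard Windows `<Event>` element; `event_key` and `find_duplicates` are hypothetical helper names, not part of Fluent Bit.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by the Windows event schema.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def event_key(event_xml):
    """Return (Channel, EventRecordID) for one rendered event XML string."""
    root = ET.fromstring(event_xml)
    system = root.find("e:System", NS)
    channel = system.findtext("e:Channel", namespaces=NS)
    record_id = system.findtext("e:EventRecordID", namespaces=NS)
    return channel, record_id


def find_duplicates(event_xmls):
    """Return a dict of keys that occur more than once, with their counts."""
    counts = Counter(event_key(xml) for xml in event_xmls)
    return {key: n for key, n in counts.items() if n > 1}


def sample_event(channel, record_id):
    """Build a tiny stand-in event for illustration only."""
    return (
        '<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">'
        f"<System><EventRecordID>{record_id}</EventRecordID>"
        f"<Channel>{channel}</Channel></System></Event>"
    )


if __name__ == "__main__":
    events = [
        sample_event("Application", 1),
        sample_event("Application", 1),
        sample_event("Security", 1),
    ]
    print(find_duplicates(events))
```

An empty result would match the expected behavior; any entry in the result points at an event that was ingested more than once.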

Screenshots

Your Environment

cosmo0920 commented 6 months ago

I reproduced your issue and found a workaround for this case:

pipeline:
  inputs:
    - name: winevtlog
      tag: some-tag
      alias: WIndows alias
      channels: application
      interval_sec: 5
      read_existing_events: true
      db: .\checkpoint.db
      render_event_as_xml: true
      read_limit_per_cycle: 2m
      event_query: |
        <QueryList>
          <Query Id="0" Path="Application">
            <Select Path="Application">*[System[TimeCreated[@SystemTime&gt;='2024-04-22T07:30:22.000Z' and @SystemTime&lt;='2024-04-22T09:30:22.999Z']]]</Select>
          </Query>
        </QueryList>

    - name: winevtlog
      tag: some-tag
      alias: WIndows alias
      channels: security
      interval_sec: 5
      read_existing_events: true
      db: .\checkpoint.db
      render_event_as_xml: true
      read_limit_per_cycle: 2m
      event_query: |
        <QueryList>
          <Query Id="0" Path="Application">
            <Select Path="Security">*[System[TimeCreated[@SystemTime&gt;='2024-04-22T07:30:22.000Z' and @SystemTime&lt;='2024-04-22T09:30:22.999Z']]]</Select>
          </Query>
        </QueryList>

    - name: winevtlog
      tag: some-tag
      alias: WIndows alias
      channels: system
      interval_sec: 5
      read_existing_events: true
      db: .\checkpoint.db
      render_event_as_xml: true
      read_limit_per_cycle: 2m
      event_query: |
        <QueryList>
          <Query Id="0" Path="Application">
            <Select Path="System">*[System[TimeCreated[@SystemTime&gt;='2024-04-22T07:30:22.000Z' and @SystemTime&lt;='2024-04-22T09:30:22.999Z']]]</Select>
          </Query>
        </QueryList>

For now, event_query needs to be defined per channel. This is because restoring the bookmark forcibly restores the information about which channels to subscribe to, which is not the intended behavior. Defining the inputs one by one keeps the conditions used to filter and collect Windows EventLogs from getting mixed up.
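
If writing one stanza per channel by hand gets tedious, the stanzas could be generated. The following is a rough sketch only: `build_query`, `build_inputs`, and `render_yaml` are hypothetical helper names, not part of Fluent Bit, and the output covers just a few of the options from the workaround above (the others, such as `db` or `interval_sec`, would be added the same way).

```python
from xml.sax.saxutils import escape


def build_query(channel, start, end):
    """Return a QueryList with a single Select for one channel and time window."""
    condition = (
        f"*[System[TimeCreated[@SystemTime>='{start}' "
        f"and @SystemTime<='{end}']]]"
    )
    # escape() turns > and < into &gt; and &lt;, as in the configs above.
    return (
        "<QueryList>"
        f'<Query Id="0" Path="{channel}">'
        f'<Select Path="{channel}">{escape(condition)}</Select>'
        "</Query>"
        "</QueryList>"
    )


def build_inputs(channels, start, end, tag="some-tag"):
    """Return one winevtlog input definition per channel."""
    return [
        {
            "name": "winevtlog",
            "tag": tag,
            "channels": channel.lower(),
            "event_query": build_query(channel, start, end),
        }
        for channel in channels
    ]


def render_yaml(inputs):
    """Render the inputs as a YAML pipeline section, one stanza per input."""
    lines = ["pipeline:", "  inputs:"]
    for item in inputs:
        for index, (key, value) in enumerate(item.items()):
            prefix = "    - " if index == 0 else "      "
            if key == "event_query":
                # Use a block scalar so the XML needs no YAML quoting.
                lines.append(f"{prefix}{key}: |")
                lines.append(f"        {value}")
            else:
                lines.append(f"{prefix}{key}: {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    inputs = build_inputs(
        ["Application", "Security", "System"],
        "2024-04-22T07:30:22.000Z",
        "2024-04-22T09:30:22.999Z",
    )
    print(render_yaml(inputs))
```

Each generated stanza subscribes to exactly one channel and carries a query with exactly one Select for that same channel, which is the one-by-one style described above.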

harshnasitcrest commented 6 months ago

If channels accepts multiple values, Fluent Bit should ideally have a separate stanza, with its own query, for each channel in the configuration file. Is that the correct understanding?

The above workaround might work, but users who already use this kind of configuration would already be facing this issue. Should this workaround be documented until the bug is fixed?

cosmo0920 commented 6 months ago

> If channels accepts multiple values, Fluent Bit should ideally have a separate stanza, with its own query, for each channel in the configuration file. Is that the correct understanding?

Ideally, yes. However, Fluent Bit does not have that capability for now.

> The above workaround might work, but users who already use this kind of configuration would already be facing this issue. Should this workaround be documented until the bug is fixed?

TBH, I have never heard of anyone struggling with this, because a QueryList in its XML representation is difficult to put into Fluent Bit configurations in the first place. Most users probably use simpler configurations than yours.