zmoog / public-notes

Figure out how the Filebeat registry stores entries #57

Open zmoog opened 8 months ago

zmoog commented 8 months ago

I am investigating a user problem where Filebeat re-processes the same S3 objects after a restart.

I suspect this happens because Filebeat did not properly track the state of each S3 object, so I want to learn how and where the registry stores its content.

zmoog commented 8 months ago

On an 8.8.2 cluster:

elastic-package build && elastic-package stack up -d -v --version 8.8.2

I set up an agent to collect CloudTrail logs from an S3 bucket. The aws-s3 input is configured to poll the bucket.

Here's the relevant input from the agent policy:

  - id: aws-s3-cloudtrail-a10aca5c-590b-4643-ad4e-48c6fb65ddbf
    name: aws-14
    revision: 3
    type: aws-s3
    use_output: default
    meta:
      package:
        name: aws
        version: 1.51.1
    data_stream:
      namespace: default
    package_policy_id: a10aca5c-590b-4643-ad4e-48c6fb65ddbf
    streams:
      - id: aws-s3-aws.cloudtrail-a10aca5c-590b-4643-ad4e-48c6fb65ddbf
        data_stream:
          dataset: aws.cloudtrail
          type: logs
        file_selectors:
          - regex: /CloudTrail/
            expand_event_list_from_field: Records
          - regex: /CloudTrail-Digest/
          - regex: /CloudTrail-Insight/
            expand_event_list_from_field: Records
        access_key_id: <REDACTED>
        content_type: application/json
        expand_event_list_from_field: Records
        secret_access_key: <REDACTED>
        max_number_of_messages: 5
        tags:
          - forwarded
          - aws-cloudtrail
        publisher_pipeline.disable_host: true

zmoog commented 8 months ago

Open a root shell in the agent container:

docker exec -u 0 -it elastic-package-stack-elastic-agent-1 /bin/bash

Then install some basic tools:

apt install jq tree

zmoog commented 8 months ago

Search and inspect the registry persistence store on the file system:


# search for registry folder

$ find . -iname registry
./state/data/run/filestream-monitoring/registry
./state/data/run/aws-s3-default/registry
./.node/node/lib/node_modules/@elastic/synthetics/node_modules/playwright-core/lib/server/registry

# inspecting ./state/data/run/aws-s3-default/registry
$ tree ./state/data/run/aws-s3-default/registry
./state/data/run/aws-s3-default/registry
`-- filebeat
    |-- 1332300.json
    |-- active.dat
    |-- log.json
    `-- meta.json

1 directory, 4 files

$ du -sh ./state/data/run/aws-s3-default/registry/filebeat/*
22M ./state/data/run/aws-s3-default/registry/filebeat/1332300.json
4.0K    ./state/data/run/aws-s3-default/registry/filebeat/active.dat
7.9M    ./state/data/run/aws-s3-default/registry/filebeat/log.json
4.0K    ./state/data/run/aws-s3-default/registry/filebeat/meta.json

# what's inside the largest file?
$ cat ./state/data/run/aws-s3-default/registry/filebeat/1332300.json | jq | more
[
  {
    "_key": "filebeat::aws-s3::state::<REDACTED>",
    "id": "<REDACTED>",
    "bucket": "<REDACTED>2",
    "key": "AWSLogs/<REDACTED>/CloudTrail/<REDACTED>",
    "etag": "\"b8570636942919ae3b7c0c693c78ceee\"",
    "last_modified": [
      281470681743360,
      1696581008
    ],
    "list_prefix": "",
    "stored": true,
    "error": false
  },

# How many elements are there in the list?
$ cat ./state/data/run/aws-s3-default/registry/filebeat/1332300.json | jq '. | length'
25092

# So the snapshot file holds 25092 entries, probably one for each S3 object processed by Filebeat
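The `last_modified` pair in each entry can be turned into a readable timestamp, which makes it easier to compare registry entries against the S3 object listing. A minimal Python sketch, assuming the second element is Unix epoch seconds in UTC; the first element is not interpreted here, and the helper name `decode_registry_time` is hypothetical:

```python
from datetime import datetime, timezone


def decode_registry_time(pair):
    """Convert a [bits, seconds] registry timestamp into a UTC datetime.

    Assumption: pair[1] is Unix epoch seconds; pair[0] is ignored because
    its meaning has not been established in these notes.
    """
    _bits, seconds = pair
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Values copied from the entry shown above.
print(decode_registry_time([281470681743360, 1696581008]).isoformat())
```

For the entry above this prints `2023-10-06T08:30:08+00:00`, a few days before these notes were taken, which is plausible for a CloudTrail object.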

zmoog commented 8 months ago

Observations:

The aws-s3-default/registry/filebeat/log.json file is updated regularly.

If I run:

$ tail -f ./state/data/run/aws-s3-default/registry/filebeat/log.json
{"op":"set","id":1344226}
{"k":"filebeat::aws-s3::writeCommit::<REDACTED>-aws-cloudtrail-logs-<REDACTED>","v":{"time":[281470681743360,1697011044]}}

I can see new content added regularly.
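To see what kind of entries are being appended, the log can be summarized by operation and key type. A rough Python sketch, assuming every operation is written as two consecutive lines (a header like `{"op":"set","id":N}` followed by a `{"k":...,"v":...}` payload), as in the `tail` output above; the function names `iter_log_ops` and `summarize` are hypothetical:

```python
import json
from collections import Counter


def iter_log_ops(lines):
    """Yield (op, id, key, value) tuples from log.json lines.

    Assumption: each operation is a header line with "op" and "id",
    followed by one payload line with "k" and (optionally) "v".
    """
    pending = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        record = json.loads(raw)
        if "op" in record:
            pending = record
        elif pending is not None:
            yield pending["op"], pending["id"], record["k"], record.get("v")
            pending = None


def summarize(lines):
    """Count operations per (op, kind); kind is the third '::' field of the key."""
    counts = Counter()
    for op, _id, key, _value in iter_log_ops(lines):
        parts = key.split("::")
        kind = parts[2] if len(parts) > 2 else key
        counts[(op, kind)] += 1
    return counts


# Two lines in the same shape as the tail output above (key shortened).
sample = [
    '{"op":"set","id":1344226}',
    '{"k":"filebeat::aws-s3::writeCommit::example","v":{"time":[281470681743360,1697011044]}}',
]
print(summarize(sample))
```

Passing an open file handle for `log.json` instead of `sample` should show how many `state` entries versus `writeCommit` entries were written since the last snapshot.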

The 1332300.json file was last updated about three hours ago.

$ ls -ltr ./state/data/run/aws-s3-default/registry/filebeat/
total 31896
-rw------- 1 elastic-agent elastic-agent       15 Oct 10 23:25 meta.json
-rw------- 1 elastic-agent elastic-agent 22393719 Oct 11 05:13 1332300.json
-rw------- 1 elastic-agent elastic-agent       85 Oct 11 05:13 active.dat
-rw------- 1 elastic-agent elastic-agent 10244757 Oct 11 08:02 log.json

And here's the content of active.dat:

$ cat ./state/data/run/aws-s3-default/registry/filebeat/active.dat
/usr/share/elastic-agent/state/data/run/aws-s3-default/registry/filebeat/1332300.json
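So active.dat appears to point at the current snapshot file. A small Python sketch of how that could be used to check whether a given S3 object is tracked: it resolves the snapshot through active.dat, indexes the entries by `_key`, and looks up one bucket/key pair. The function names `load_snapshot` and `find_object` are hypothetical, and the entry fields (`bucket`, `key`, `stored`) are taken from the sample entry earlier in this thread:

```python
import json
from pathlib import Path


def load_snapshot(registry_dir):
    """Return the active snapshot as a dict keyed by each entry's `_key`."""
    registry_dir = Path(registry_dir)
    # active.dat holds an absolute path from inside the container, so only
    # its file name is used and resolved against registry_dir.
    active = (registry_dir / "active.dat").read_text(encoding="utf-8").strip()
    snapshot_path = registry_dir / Path(active).name
    with snapshot_path.open(encoding="utf-8") as fh:
        entries = json.load(fh)
    return {entry["_key"]: entry for entry in entries}


def find_object(snapshot, bucket, key):
    """Return the snapshot entries recorded for one S3 object (bucket + key)."""
    return [
        entry
        for entry in snapshot.values()
        if entry.get("bucket") == bucket and entry.get("key") == key
    ]
```

Calling `find_object(load_snapshot("./state/data/run/aws-s3-default/registry/filebeat"), bucket, key)` returns the matching entries, whose `stored` and `error` flags can then be inspected. A miss is not conclusive: entries written after the snapshot presumably exist only in log.json until the next snapshot is taken.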