elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

filebeat.config.inputs to load external configuration produces error #34613

Open FireBurn opened 1 year ago

FireBurn commented 1 year ago

We're migrating from Filebeat 6.8 to 7.17, and we're seeing an unexpected error when loading inputs from external files with:

filebeat.config.inputs:
  enabled: true
  path: conf.d/*.yml

conf.d/httpd.yml

- type: filestream
  id: "httpd-request-log"
  paths:
    - /apps/was/httpd/logs/request.log

  fields:
    type: httpd-logs
    grouping_name: httpd-grouping
    platform: prd
    environment: prd
    workgroup: platform-crl
    workload: "httpd"

  tail_files: false
  ignore_older: 24h

This gives the following log:

2023-02-20T10:50:03.613Z        INFO    instance/beat.go:697    Home path: [/apps/was/monitoring/filebeat] Config path: [/apps/was/monitoring/filebeat] Data path: [/apps/was/monitoring/filebeat/data] Logs path: [/apps/was/monitoring/filebeat/logs] Hostfs Path: [/]
2023-02-20T10:50:03.616Z        INFO    instance/beat.go:705    Beat ID: e6b32ed9-8726-49ed-90d5-b9a4a44a7099
2023-02-20T10:50:03.619Z        INFO    [seccomp]       seccomp/seccomp.go:124  Syscall filter successfully installed
2023-02-20T10:50:03.619Z        INFO    [beat]  instance/beat.go:1051   Beat info       {"system_info": {"beat": {"path": {"config": "/apps/was/monitoring/filebeat", "data": "/apps/was/monitoring/filebeat/data", "home": "/apps/was/monitoring/filebeat", "logs": "/apps/was/monitoring/filebeat/logs"}, "type": "filebeat", "uuid": "e6b32ed9-8726-49ed-90d5-b9a4a44a7099"}}}
2023-02-20T10:50:03.619Z        INFO    [beat]  instance/beat.go:1060   Build info      {"system_info": {"build": {"commit": "unknown", "libbeat": "7.17.9", "time": "0001-01-01T00:00:00.000Z", "version": "7.17.9"}}}
2023-02-20T10:50:03.619Z        INFO    [beat]  instance/beat.go:1063   Go runtime info {"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":2,"version":"go1.20"}}}
2023-02-20T10:50:03.623Z        INFO    [beat]  instance/beat.go:1067   Host info       {"system_info": {"host": {"architecture":"x86_64","boot_time":"2023-02-01T01:12:56Z","containerized":false,"name":"server","ip":["127.0.0.1/8","10.78.1.1/24"],"kernel_version":"4.18.0-425.10.1.el8_7.x86_64","mac":["00:50:00:00:00:00"],"os":{"type":"linux","family":"redhat","platform":"rhel","name":"Red Hat Enterprise Linux","version":"8.7 (Ootpa)","major":8,"minor":7,"patch":0,"codename":"Ootpa"},"timezone":"GMT","timezone_offset_sec":0,"id":"bd6f5453b88f4346b22c0ff09d38f422"}}}
2023-02-20T10:50:03.624Z        INFO    [beat]  instance/beat.go:1096   Process info    {"system_info": {"process": {"capabilities": {"inheritable":null,"permitted":null,"effective":null,"bounding":["chown","dac_override","dac_read_search","fowner","fsetid","kill","setgid","setuid","setpcap","linux_immutable","net_bind_service","net_broadcast","net_admin","net_raw","ipc_lock","ipc_owner","sys_module","sys_rawio","sys_chroot","sys_ptrace","sys_pacct","sys_admin","sys_boot","sys_nice","sys_resource","sys_time","sys_tty_config","mknod","lease","audit_write","audit_control","setfcap","mac_override","mac_admin","syslog","wake_alarm","block_suspend","audit_read","38","39","40"],"ambient":null}, "cwd": "/apps/was/monitoring/filebeat", "exe": "/apps/was/monitoring/filebeat/filebeat", "name": "filebeat", "pid": 2005922, "ppid": 2005921, "seccomp": {"mode":"filter","no_new_privs":true}, "start_time": "2023-02-20T10:50:02.950Z"}}}
2023-02-20T10:50:03.624Z        INFO    instance/beat.go:291    Setup Beat: filebeat; Version: 7.17.9
2023-02-20T10:50:03.627Z        INFO    [publisher]     pipeline/module.go:113  Beat name: server
2023-02-20T10:50:03.628Z        WARN    beater/filebeat.go:202  Filebeat is unable to load the ingest pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the ingest pipelines or are using Logstash pipelines, you can ignore this warning.
2023-02-20T10:50:03.632Z        INFO    [monitoring]    log/log.go:142  Starting metrics logging every 30s
2023-02-20T10:50:03.632Z        INFO    instance/beat.go:456    filebeat start running.
2023-02-20T10:50:03.633Z        INFO    memlog/store.go:119     Loading data file of '/apps/was/monitoring/filebeat/data/registry/filebeat' succeeded. Active transaction id=0
2023-02-20T10:50:03.633Z        INFO    memlog/store.go:124     Finished loading transaction log file for '/apps/was/monitoring/filebeat/data/registry/filebeat'. Active transaction id=0
2023-02-20T10:50:03.634Z        WARN    beater/filebeat.go:411  Filebeat is unable to load the ingest pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the ingest pipelines or are using Logstash pipelines, you can ignore this warning.
2023-02-20T10:50:03.634Z        INFO    [registrar]     registrar/registrar.go:109      States Loaded from registrar: 0
2023-02-20T10:50:03.634Z        INFO    [crawler]       beater/crawler.go:71    Loading Inputs: 0
2023-02-20T10:50:03.634Z        INFO    [crawler]       beater/crawler.go:106   Loading and starting Inputs completed. Enabled inputs: 0
2023-02-20T10:50:03.634Z        INFO    cfgfile/reload.go:164   Config reloader started
2023-02-20T10:50:03.635Z        ERROR   [input] input-logfile/manager.go:183    filestream input with ID 'httpd-request-log' already exists, this will lead to data duplication, please use a different ID
2023-02-20T10:50:03.636Z        INFO    cfgfile/reload.go:224   Loading of config files completed.
2023-02-20T10:50:03.636Z        INFO    [input.filestream]      compat/compat.go:113    Input 'filestream' starting     {"id": "httpd-request-log"}

The main concern is this line:

2023-02-20T10:50:03.635Z ERROR [input] input-logfile/manager.go:183 filestream input with ID 'httpd-request-log' already exists, this will lead to data duplication, please use a different ID

There is only one input, so this error shouldn't be displayed. If the same contents are loaded directly rather than through filebeat.config.inputs, we don't see this message.
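For comparison, here is a minimal sketch of what "directly loaded" presumably looks like: the same input declared inline under `filebeat.inputs` in the main configuration file. The reporter's actual main config is not shown in this issue, so the file name and layout below are assumptions, and the `fields` block is trimmed for brevity.

```yaml
# filebeat.yml (hypothetical main config; the reporter's real file is not shown)
# The input is declared inline instead of being picked up from conf.d/*.yml.
filebeat.inputs:
  - type: filestream
    id: "httpd-request-log"        # same ID as in conf.d/httpd.yml
    paths:
      - /apps/was/httpd/logs/request.log
    fields:
      type: httpd-logs             # remaining fields omitted for brevity
    ignore_older: 24h
```

According to the report, this inline form starts without the "already exists" error, while the identical input loaded through `filebeat.config.inputs` triggers it.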

botelastic[bot] commented 1 year ago

This issue doesn't have a Team:<team> label.

FireBurn commented 1 year ago

If I keep using "log" rather than "filestream", I see:

2023-02-20T11:18:39.029Z        WARN    [cfgwarn]       log/input.go:89 DEPRECATED: Log input. Use Filestream input instead.
2023-02-20T11:18:39.030Z        INFO    [input] log/input.go:171        Configured paths: [/apps/was/httpd/logs/request.log]      {"input_id": "690947f6-d989-4c2e-bc23-c0ea773bed79"}
2023-02-20T11:18:39.030Z        INFO    [crawler]       beater/crawler.go:106   Loading and starting Inputs completed. Enabled inputs: 0
2023-02-20T11:18:39.033Z        INFO    cfgfile/reload.go:164   Config reloader started
2023-02-20T11:18:39.034Z        INFO    [input] log/input.go:171        Configured paths: [/apps/was/httpd/logs/request.log]      {"input_id": "cecbd7f0-524c-4107-8cf4-dd0881cb4b14"}
2023-02-20T11:18:39.034Z        INFO    cfgfile/reload.go:224   Loading of config files completed.

keenborder786 commented 1 year ago

I am also facing a similar error.

The error is:

{"log.level":"error","@timestamp":"2023-03-22T17:11:22.014Z","log.logger":"input","log.origin":{"file.name":"input-logfile/manager.go","file.line":182},"message":"filestream input with ID 'spark-executors-file-stream-e40cd871-dabd-46b5-a245-decac624262a' already exists, this will lead to data duplication, please use a different ID","service.name":"filebeat","ecs.version":"1.6.0"}

My filebeat configuration is:

  filebeat.autodiscover:
    providers:
    - type: kubernetes
      templates:
        - condition:
            equals:
              kubernetes.namespace: "spark"
          config:
            - type: container
              paths:
              - /var/log/containers/*-${data.kubernetes.container.id}.log
              multiline.type: pattern
              multiline.pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:|^java'
              multiline.negate: false
              multiline.match: after
            - type: log
              id: spark-executors-file-stream-${data.kubernetes.node.uid}
              paths:
              - /var/log/spark/work/*/*/stdout
              - /var/log/spark/work/*/*/stderr
  processors:
    - add_cloud_metadata:
    - add_host_metadata:
  output.logstash:
    hosts: ["logstash:5044"]

I am using the latest version of Filebeat, i.e. 8.6.2. I thought the bug had been fixed in #31512.
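One possible reading of this configuration, not confirmed anywhere in this thread: the autodiscover template is instantiated once per matching container, while `${data.kubernetes.node.uid}` is the same for every container on a node, so each instantiation would produce an input with an identical `id`. Under that assumption, a sketch of an alternative layout is to declare the node-level Spark input once as a static input and keep only the per-container input in the template. The placement under `filebeat.inputs` is an illustration, not a fix verified against 8.6.2.

```yaml
# Hypothetical restructuring: the node-level input is declared once, statically.
filebeat.inputs:
  - type: log
    paths:
      - /var/log/spark/work/*/*/stdout
      - /var/log/spark/work/*/*/stderr

filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        - condition:
            equals:
              kubernetes.namespace: "spark"
          config:
            # Only the per-container input stays in the template;
            # multiline settings would be carried over unchanged from the original.
            - type: container
              paths:
                - /var/log/containers/*-${data.kubernetes.container.id}.log
```

The static input is created once when Filebeat starts, regardless of how many containers in the `spark` namespace are discovered on the node.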

lgatellier commented 1 year ago

Hi, I'm facing the same problem with filebeat 8.7.0.

yakhatape commented 1 year ago

Same problem on my side. Any solutions?

ajax-shmyrko-o commented 1 year ago

The same issue with Filebeat 8.6.1.

Might be related to the issue https://github.com/elastic/beats/issues/31767 (fixed in 8.9.0) and to https://github.com/elastic/beats/issues/36379.

ajax-shmyrko-o commented 1 year ago

Upgraded to Filebeat 8.9.1, and the error disappeared.

botelastic[bot] commented 2 weeks ago

Hi! We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:. Thank you for your contribution!