open-telemetry / opentelemetry-collector-contrib


no files are being scraped when `exclude_older_than` is enabled #32681

Closed newly12 closed 5 months ago

newly12 commented 6 months ago

Component(s)

pkg/stanza

What happened?

Description

When exclude_older_than is enabled in the filelog receiver without any ordering_criteria configured, no files are scraped, regardless of whether they are new or old.

Steps to Reproduce

  1. Enable exclude_older_than in the filelog receiver config and do not configure any ordering_criteria.
  2. Create a new log file matching the included path.

Expected Result

Files newer than the exclude_older_than threshold should be scraped.

Actual Result

No files are scraped.

Collector version

v0.99.0

Environment information

Environment

OS: macOS
Compiler (if manually compiled): go 1.21

OpenTelemetry Collector configuration

exporters:
  logging:
    loglevel: debug

receivers:
  filelog:
    include: [ /tmp/1.log ]
    exclude_older_than: 5m

service:
  telemetry:
    logs:
      level: "debug"
  pipelines:
    logs:
      receivers:
        - filelog
      exporters:
        - logging

Log output

2024-04-25T10:11:03.828+0800    info    service@v0.99.0/service.go:99   Setting up own telemetry...
2024-04-25T10:11:03.829+0800    info    service@v0.99.0/telemetry.go:103        Serving metrics {"address": ":8888", "level": "Normal"}
2024-04-25T10:11:03.829+0800    info    exporter@v0.99.0/exporter.go:275        Deprecated component. Will be removed in future releases.       {"kind": "exporter", "data_type": "logs", "name": "logging"}
2024-04-25T10:11:03.829+0800    warn    common/factory.go:68    'loglevel' option is deprecated in favor of 'verbosity'. Set 'verbosity' to equivalent value to preserve behavior.      {"kind": "exporter", "data_type": "logs", "name": "logging", "loglevel": "debug", "equivalent verbosity level": "Detailed"}
2024-04-25T10:11:03.829+0800    debug   receiver@v0.99.0/receiver.go:308        Beta component. May change in the future.       {"kind": "receiver", "name": "filelog", "data_type": "logs"}
2024-04-25T10:11:03.830+0800    info    service@v0.99.0/service.go:166  Starting otelcontribcol...      {"Version": "v0.1.41-histogram-workaround-and-examplars", "NumCPU": 10}
2024-04-25T10:11:03.830+0800    info    extensions/extensions.go:34     Starting extensions...
2024-04-25T10:11:03.830+0800    info    adapter/receiver.go:45  Starting stanza receiver        {"kind": "receiver", "name": "filelog", "data_type": "logs"}
2024-04-25T10:11:03.830+0800    debug   pipeline/directed.go:59 Starting operator       {"kind": "receiver", "name": "filelog", "data_type": "logs"}
2024-04-25T10:11:03.830+0800    debug   pipeline/directed.go:63 Started operator        {"kind": "receiver", "name": "filelog", "data_type": "logs"}
2024-04-25T10:11:03.830+0800    debug   pipeline/directed.go:59 Starting operator       {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "file_input", "operator_type": "file_input"}
2024-04-25T10:11:03.830+0800    debug   pipeline/directed.go:63 Started operator        {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "file_input", "operator_type": "file_input"}
2024-04-25T10:11:03.830+0800    debug   adapter/converter.go:109        Starting log converter  {"kind": "receiver", "name": "filelog", "data_type": "logs", "worker_count": 2}
2024-04-25T10:11:03.830+0800    info    service@v0.99.0/service.go:192  Everything is ready. Begin running and processing data.
2024-04-25T10:11:03.830+0800    warn    localhostgate/featuregate.go:63 The default endpoints for all servers in components will change to use localhost instead of 0.0.0.0 in a future version. Use the feature gate to preview the new default.       {"feature gate ID": "component.UseLocalHostAsDefaultHost"}
2024-04-25T10:11:04.031+0800    debug   fileconsumer/file.go:112        matched files   {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "paths": []}
2024-04-25T10:11:04.031+0800    debug   fileconsumer/file.go:144        Consuming files{paths 1 0  []}  {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer"}

Additional context

The log file was created just before starting the collector:

$ ls -l 1.log
-rw-r--r--@ 1 xxx  xxx  4 Apr 25 10:10 1.log

newly12 commented 6 months ago

When no ordering_criteria is configured, the matcher is initialized with TopN set to 0: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.99.0/pkg/stanza/fileconsumer/matcher/matcher.go#L79-L85

When matching happens, it then returns result[:0], which is always an empty slice: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.99.0/pkg/stanza/fileconsumer/matcher/matcher.go#L186
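
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the code from matcher.go: the function names `selectTopN` and `selectTopNFixed` are hypothetical, and the guard in the second function is only one possible way to avoid the problem, not necessarily what the eventual fix does.

```go
package main

import "fmt"

// selectTopN is a hypothetical stand-in for the truncation step:
// it always cuts the result down to topN entries, so an unset
// (zero) topN discards every matched file.
func selectTopN(files []string, topN int) []string {
	if len(files) <= topN {
		return files
	}
	return files[:topN]
}

// selectTopNFixed treats a non-positive topN as "no limit",
// which is one possible guard (an assumption, not the actual fix).
func selectTopNFixed(files []string, topN int) []string {
	if topN <= 0 || len(files) <= topN {
		return files
	}
	return files[:topN]
}

func main() {
	matched := []string{"/tmp/1.log"}
	fmt.Println(len(selectTopN(matched, 0)))      // prints 0
	fmt.Println(len(selectTopNFixed(matched, 0))) // prints 1
}
```

With TopN left at its zero value, the first function drops the single matched file, which is consistent with the empty "paths" list in the debug log above.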

newly12 commented 6 months ago

I think ordering_criteria and exclude_older_than should work independently. For example, I have cases where I only want to filter out old logs, and all new logs should still be read.

crobert-1 commented 6 months ago

Removing needs triage based on code owner's approval of the PR.

ChrsMark commented 5 months ago

Hey @newly12, since https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/32683 is merged, can we consider this one fixed?

crobert-1 commented 5 months ago

Resolved by #32683