open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

filelog receiver not reading logs from new files automatically using the poll interval #34395

Open pr0PM opened 1 month ago

pr0PM commented 1 month ago

Component(s)

receiver/filelog

What happened?

Description

The filelog receiver does not pick up new matching log files when pods matching the pattern get scheduled on the k8s node, unless the collector is restarted.

I'll try to explain this with a detailed example. Let's say I have a 5-node cluster and my filelog config matches files belonging to a deployment with 3 replicas (each on a unique node). In this case 3 OTEL collector pods start processing logs, while the other 2 OTEL collector pods stay idle. Now suppose a rollout restart of the target deployment schedules 2 of its pods onto the nodes where the OTEL collector pods were idle. In my case those idle OTEL collector pods never pick up the logs from the new files and stay dormant. I read that the filelog receiver has a poll_interval config, but it doesn't seem to be working here in our case.
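
For reference, poll_interval is set directly on the filelog receiver. A minimal sketch (the 200ms shown is just the documented default, not a value from our config):

receivers:
  filelog:
    include:
      - /var/log/pods/*keycloak*/keycloak/*.log
    start_at: beginning
    poll_interval: 200ms   # how often the receiver re-globs the include pattern looking for new files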

When otelcol starts reading logs from a file, its logs look something like this if matching pods are present on the node:

2024-08-02T09:39:52.406Z    info    service@v0.101.0/service.go:102    Setting up own telemetry...
<redacted>
2024-08-02T09:39:52.614Z    info    fileconsumer/file.go:235    Started watching file    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "path": "/var/log/pods/keycloak_keycloak-0_e13e7264-f7d9-4877-b6d2-7a87533ab742/keycloak/0.log"}
2024-08-02T09:39:52.614Z    info    fileconsumer/file.go:235    Started watching file    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "path": "/var/log/pods/keycloak_keycloak-0_e13e7264-f7d9-4877-b6d2-7a87533ab742/keycloak/0.log"}

and for the idle otelcol DaemonSet pods it looks something like this:

2024-08-02T09:39:58.833Z    info    service@v0.101.0/service.go:102    Setting up own telemetry...
...
2024-08-02T09:39:58.843Z    info    adapter/receiver.go:46    Starting stanza receiver    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs"}
2024-08-02T09:39:58.845Z    warn    fileconsumer/file.go:47    finding files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-02T09:39:58.847Z    info    kube/client.go:113    k8s filtering    {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/es", "labelSelector": "", "fieldSelector": ""}
2024-08-02T09:39:58.847Z    info    adapter/receiver.go:46    Starting stanza receiver    {"kind": "receiver", "name": "filelog", "data_type": "logs"}
2024-08-02T09:39:58.847Z    warn    fileconsumer/file.go:47    finding files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-02T09:39:58.847Z    info    service@v0.101.0/service.go:195    Everything is ready. Begin running and processing data.

Now if a workload pod moves to a node where the OTEL collector pod was dormant, that otelcol pod doesn't start processing.

Steps to Reproduce

  1. Make sure you have a deployment/statefulset running and more than 1 node in k8s
  2. Deploy an OTEL collector whose filelog config matches pods from that specific deployment/statefulset
  3. Make sure at least 1 OTEL pod remains idle (it has no logs to process) by adjusting the replica count of the deployment/statefulset (it should log "no files match the configured criteria" in that case)
  4. Now remove the deployment pod while ensuring the new one that comes up gets scheduled on the other node, where the OTEL collector pod is idle (see the sketch after this list)
  5. Observe that logs generated on the node with the idle otelcol pod are not processed, even though they match the filelog config
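
To force step 4 deterministically, the workload can be pinned to the node hosting the idle collector with a nodeSelector. A minimal sketch (node name, labels, and image are placeholders, not our actual spec):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: keycloak
spec:
  serviceName: keycloak
  replicas: 1
  selector:
    matchLabels:
      app: keycloak
  template:
    metadata:
      labels:
        app: keycloak
    spec:
      nodeSelector:
        kubernetes.io/hostname: <node-with-idle-collector>   # pin the pod to the node with the idle collector
      containers:
      - name: keycloak
        image: quay.io/keycloak/keycloak:latest   # placeholder image
        args: ["start-dev"]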

Expected Result

The OTEL collector pod should keep polling for log files and pick up any new files as they arrive.

Actual Result

The OTEL collector doesn't start processing the log files automatically as they arrive; I have to forcefully restart the OTEL collector pod on the node where previously no matching log files were present.

Collector version

v0.101.0

Environment information

Environment

OS: Amazon Linux 2 EKS v1.26.12-eks Node

OpenTelemetry Collector configuration

exporters:
  awss3:
    s3uploader:
      compression: gzip
      region: '{{.Values.s3_bucket_region}}'
      s3_bucket: '{{.Values.s3_bucket}}'
      s3_partition: hour
      s3_prefix: logs/keycloak-events
  elasticsearch/log:
    endpoints:
    - http://logging-master.logging.svc.cluster.local:9200
    logs_index: keycloak-logs
    logstash_format:
      enabled: true
      prefix_separator: '-'
    sending_queue:
      enabled: true
      num_consumers: 5
      queue_size: 500
processors:
  batch: {}
  batch/s3:
    timeout: 3s
  transform/awss3:
    log_statements:
      - context: log
        statements:
          - set(attributes) = {}
  filter:
    logs:
      include:
        match_type: regexp
  k8sattributes:
    auth_type: serviceAccount
    extract:
      metadata:
      - k8s.pod.name
      - k8s.pod.uid
      - k8s.deployment.name
      - k8s.namespace.name
      - k8s.node.name
      - k8s.pod.start_time
    filter:
      node_from_env_var: KUBE_NODE_NAME
    passthrough: false
    pod_association:
    - sources:
      - from: resource_attribute
        name: k8s.pod.uid
  resource:
    attributes:
    - action: insert
      key: k8s.cluster.name
      value: beta-vc
    - action: insert
      key: k8s.domain.name
      value: beta.atlan.dev
    - action: upsert
      from_attribute: host.name
      key: host.id
  resourcedetection:
    detectors:
    - env
receivers:
  filelog:
    include:
    - /var/log/pods/*keycloak*/keycloak/*.log
    include_file_name: false
    include_file_path: true
    operators:
    - id: get-format
      routes:
      - expr: body matches "^\\{"
        output: parser-docker
      - expr: body matches "^[^ Z]+ "
        output: parser-crio
      - expr: body matches "^[^ Z]+Z.*org.keycloak.events"
        output: parser-containerd
      type: router
    - id: parser-crio
      output: extract_metadata_from_filepath
      regex: ^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*org.keycloak.events\].*)$
      timestamp:
        layout: 2006-01-02T15:04:05.999999999Z07:00
        layout_type: gotime
        parse_from: attributes.time
      type: regex_parser
    - id: parser-containerd
      output: extract_metadata_from_filepath
      regex: ^(?P<time>[^ Z]+)Z (?P<stream>stdout|stderr) (?P<logtag>[^ ]) (?:[\d\-\s:,]+)(?P<severity>[^ ]+)(?:.*org\.keycloak\.events\].*\))( type=(?P<type>[^,]+))?,?( realmId=(?P<realmId>[^,]+))?,?( clientId=(?P<clientId>[^,]+))?,?( userId=(?P<userId>[^,]+))?,?( ipAddress=(?P<ipAddress>[^,]+))?,?( auth_method=(?P<auth_method>[^,]+))?,?( audience=(?P<audience>[^,]+))?,?( token_id=(?P<token_id>[^,]+))?,?( grant_type=(?P<grant_type>[^,]+))?,?( refresh_token_type=(?P<refresh_token_type>[^,]+))?,?( scope=(?P<scope>[^,]+))?,?( impersonator=(?P<impersonator>[^,]+))?,?( refresh_token_id=(?P<refresh_token_id>[^,]+))?,?( requested_subject=(?P<requested_subject>[^,]+))?,?( client_auth_method=(?P<client_auth_method>[^,]+))?,?( authSessionParentId=(?P<authSessionParentId>[^,]+))?,?( authSessionTabId=(?P<authSessionTabId>[^\s]+))?,?( auth_type=(?P<auth_type>[^,]+))?( code_id=(?P<code_id>[^,]+))?,?( response_type=(?P<response_type>[^,]+))?,?( redirect_uri=(?P<redirect_uri>[^,]+))?,?( consent=(?P<consent>[^,]+))?,?( username=(?P<username>[^,]+))?,?( operationType=(?P<operationType>[^,]+))?
      timestamp:
        layout: '%Y-%m-%dT%H:%M:%S.%L.%f'
        parse_from: attributes.time
      type: regex_parser
    - id: parser-docker
      output: extract_metadata_from_filepath
      timestamp:
        layout: '%Y-%m-%dT%H:%M:%S.%LZ'
        parse_from: attributes.time
      type: json_parser
    - from: attributes.log
      to: body
      type: move
    - cache:
        size: 128
      id: extract_metadata_from_filepath
      parse_from: attributes["log.file.path"]
      regex: ^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]{36})\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$
      type: regex_parser
    - from: attributes.stream
      to: attributes["log.iostream"]
      type: move
    - from: attributes.container_name
      to: resource["k8s.container.name"]
      type: move
    - from: attributes.namespace
      to: resource["k8s.namespace.name"]
      type: move
    - from: attributes.pod_name
      to: resource["k8s.pod.name"]
      type: move
    - from: attributes.restart_count
      to: resource["k8s.container.restart_count"]
      type: move
    - from: attributes.uid
      to: resource["k8s.pod.uid"]
      type: move
    start_at: beginning
service:
  pipelines:
    s3events:
      receivers:
      - filelog
      processors:
      - k8sattributes
      - transform/awss3
      - resource
      - batch/s3
      exporters:
      - awss3
    esevent:
      exporters:
      - elasticsearch/log
      processors:
      - k8sattributes
      - resource
      - batch
      receivers:
      - filelog

Log output

No response

Additional context

No response

github-actions[bot] commented 1 month ago

Pinging code owners:

djaglowski commented 1 month ago

I'm curious how you concluded that this is related to the poll interval. I can't see how that would be the case. The poll interval just defines how often we look for files. If you use any reasonable value then it's effectively just checking for the files repeatedly. If it can't find them, that's because they don't exist or because the collector doesn't have access to them (in which case they might as well not exist as far as the collector is concerned).

pr0PM commented 1 month ago

Not a conclusion; I assumed it might be an issue there since that was the relevant setting I could find in the docs, but it could be something else altogether. The problem is that the collector does not pick up new matching log files that appear after it failed to find any at startup (because none matching the pattern existed yet), but it does process them if I force the collector to restart.

djaglowski commented 1 month ago

@pr0PM Based on the config it appears there is no storage extension involved. Can you confirm that is the case?

pr0PM commented 1 month ago

@djaglowski that's right, no storage extensions. Would using one help us here? I've never tried any of them. I'm parsing Keycloak (2 pods) events from logs and pushing them to S3 and Elasticsearch with all the metadata.

djaglowski commented 1 month ago

Using a storage extension is helpful in many cases but for the sake of diagnosing this issue it would only complicate things.
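
(For readers unfamiliar with it, wiring a storage extension to the filelog receiver would look roughly like the sketch below; the directory path is a placeholder, and again it is not needed to diagnose this issue.)

extensions:
  file_storage:
    directory: /var/lib/otelcol/file_storage   # placeholder path; must be writable and persistent
receivers:
  filelog:
    include:
      - /var/log/pods/*keycloak*/keycloak/*.log
    storage: file_storage   # reference the extension so file-reader checkpoints survive restarts
service:
  extensions: [file_storage]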

Given that there is no extension, you can be sure there is no state shared between the collectors. Therefore, I think you should look at this issue from the perspective of one node that is not behaving as expected.

If I understand correctly, that means for a "dormant" node:

  1. The collector pod starts up, the collector finds no files.
  2. Another pod is deployed to the node.
  3. The collector does not pick up any log files.
  4. The collector pod is destroyed.
  5. A new collector pod is deployed.

IMO, the behavior you described is very unlikely to be caused by this sequence of events. Much more likely, there is some incorrect assumption about what is actually happening.

  1. Are you 100% certain that the other pod is running on the expected node?
  2. Are you 100% certain that the log file was created at the specified location?
  3. Are you 100% certain that the new collector pod is redeployed to the same node?

If possible, I would consider simplifying this to a 1 node cluster until you get it working. I don't see any reason why having multiple nodes can explain any behavior here unless you're misunderstanding which pods are being deployed to which nodes.

pr0PM commented 1 month ago

[diagram from the original comment illustrating the three-step sequence described below]

I'm trying to capture the process in this image for clarity, to explain what I meant by a dormant node and to answer the questions better.

Here the collector is deployed on k8s as a DaemonSet, so each node runs a collector pod by default. My workload is a deployment or statefulset with 2 replicas.

First box:

Second box:

Third Box:

Now for the last 3 questions:

Are you 100% certain that the other pod is running on the expected node?

Yes, it is, as explained above.

Are you 100% certain that the log file was created at the specified location?

Yes, I even SSH'd into the node to confirm this.

Are you 100% certain that the new collector pod is redeployed to the same node?

Yes, the k8s DaemonSet makes sure it's deployed on each node.
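
For context, each collector pod sees the node's pod logs through the usual hostPath mount in the DaemonSet pod template. A trimmed sketch of that wiring (names are placeholders, not our exact manifest):

# fragment of the DaemonSet pod template spec
containers:
- name: otel-collector
  volumeMounts:
  - name: varlogpods
    mountPath: /var/log/pods
    readOnly: true
volumes:
- name: varlogpods
  hostPath:
    path: /var/log/pods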

djaglowski commented 1 month ago

Thanks for the diagram and detailed answers.

Can you try enabling debug logging for the collector and sharing a more complete log?

service:
  pipelines:
    logs: ...
  telemetry:
    logs:
      level: DEBUG
      encoding: console

pr0PM commented 1 month ago

There were too many logs here, so I redacted the most redundant parts; please let me know if I should share more detailed ones.

node-1 while target workload is on it

2024-08-05T19:22:32.697Z    debug    fileconsumer/file.go:148    Consuming files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_65eff8a6-7e51-4437-9cf8-dfbd5b836040/keycloak/0.log"]}
2024-08-05T19:22:32.699Z    debug    fileconsumer/file.go:116    matched files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_65eff8a6-7e51-4437-9cf8-dfbd5b836040/keycloak/0.log"]}
2024-08-05T19:22:32.699Z    debug    fileconsumer/file.go:148    Consuming files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_65eff8a6-7e51-4437-9cf8-dfbd5b836040/keycloak/0.log"]}
2024-08-05T19:22:32.897Z    debug    fileconsumer/file.go:116    matched files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_65eff8a6-7e51-4437-9cf8-dfbd5b836040/keycloak/0.log"]}
2024-08-05T19:22:32.897Z    debug    fileconsumer/file.go:148    Consuming files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_65eff8a6-7e51-4437-9cf8-dfbd5b836040/keycloak/0.log"]}
2024-08-05T19:22:32.899Z    debug    fileconsumer/file.go:116    matched files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_65eff8a6-7e51-4437-9cf8-dfbd5b836040/keycloak/0.log"]}
2024-08-05T19:22:32.899Z    debug    fileconsumer/file.go:148    Consuming files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_65eff8a6-7e51-4437-9cf8-dfbd5b836040/keycloak/0.log"]}

node-1 after removal of the target service

2024-08-05T19:19:57.968Z    debug    fileconsumer/file.go:114    finding files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:57.968Z    debug    fileconsumer/file.go:114    finding files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.168Z    debug    fileconsumer/file.go:114    finding files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.169Z    debug    fileconsumer/file.go:114    finding files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.368Z    debug    fileconsumer/file.go:114    finding files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.368Z    debug    fileconsumer/file.go:114    finding files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.568Z    debug    fileconsumer/file.go:114    finding files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.568Z    debug    fileconsumer/file.go:114    finding files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.769Z    debug    fileconsumer/file.go:114    finding files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.769Z    debug    fileconsumer/file.go:114    finding files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:59.769Z    debug    fileconsumer/file.go:116    matched files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}
2024-08-05T19:19:59.769Z    debug    fileconsumer/file.go:148    Consuming files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}
2024-08-05T19:19:59.769Z    debug    fileconsumer/file.go:116    matched files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}
2024-08-05T19:19:59.769Z    debug    fileconsumer/file.go:148    Consuming files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}

node-2: it logs the same thing continuously since startup; nothing new here even when the new workload pod gets scheduled on it (2nd box)

2024-08-05T19:18:29.932Z    debug    fileconsumer/file.go:114    finding files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:18:29.933Z    debug    fileconsumer/file.go:116    matched files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}
2024-08-05T19:18:29.933Z    debug    fileconsumer/file.go:148    Consuming files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}
2024-08-05T19:18:29.932Z    debug    fileconsumer/file.go:114    finding files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:18:29.933Z    debug    fileconsumer/file.go:116    matched files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}
2024-08-05T19:18:29.933Z    debug    fileconsumer/file.go:148    Consuming files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}

node-2 after a restart, with logging working again (3rd box)

2024-08-05T19:24:32.545Z    info    service@v0.101.0/service.go:102    Setting up own telemetry...
2024-08-05T19:24:32.546Z    info    service@v0.101.0/telemetry.go:103    Serving metrics    {"address": ":8888", "level": "Normal"}
2024-08-05T19:24:32.546Z    debug    exporter@v0.101.0/exporter.go:273    Alpha component. May change in the future.    {"kind": "exporter", "data_type": "logs", "name": "awss3"}
2024-08-05T19:24:32.546Z    debug    exporter@v0.101.0/exporter.go:273    Beta component. May change in the future.    {"kind": "exporter", "data_type": "logs", "name": "elasticsearch/log"}
2024-08-05T19:24:32.546Z    debug    processor@v0.101.0/processor.go:301    Beta component. May change in the future.    {"kind": "processor", "name": "batch", "pipeline": "logs/es"}
2024-08-05T19:24:32.546Z    debug    processor@v0.101.0/processor.go:301    Beta component. May change in the future.    {"kind": "processor", "name": "resource", "pipeline": "logs/es"}
2024-08-05T19:24:32.546Z    debug    processor@v0.101.0/processor.go:301    Beta component. May change in the future.    {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/es"}
2024-08-05T19:24:32.547Z    debug    processor@v0.101.0/processor.go:301    Beta component. May change in the future.    {"kind": "processor", "name": "batch/s3", "pipeline": "logs/s3"}
2024-08-05T19:24:32.547Z    debug    processor@v0.101.0/processor.go:301    Beta component. May change in the future.    {"kind": "processor", "name": "resource", "pipeline": "logs/s3"}
2024-08-05T19:24:32.547Z    debug    processor@v0.101.0/processor.go:301    Beta component. May change in the future.    {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/s3"}
2024-08-05T19:24:32.547Z    debug    receiver@v0.101.0/receiver.go:308    Beta component. May change in the future.    {"kind": "receiver", "name": "filelog", "data_type": "logs"}
2024-08-05T19:24:32.548Z    debug    regex/config.go:83    configured memory cache    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "extract_metadata_from_filepath", "size": 128}
2024-08-05T19:24:32.548Z    debug    receiver@v0.101.0/receiver.go:308    Beta component. May change in the future.    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs"}
2024-08-05T19:24:32.548Z    debug    regex/config.go:83    configured memory cache    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "operator_id": "continue_s3_input", "size": 128}
2024-08-05T19:24:32.550Z    info    service@v0.101.0/service.go:169    Starting otelcol-contrib...    {"Version": "0.101.0", "NumCPU": 8}
2024-08-05T19:24:32.550Z    info    extensions/extensions.go:34    Starting extensions...
2024-08-05T19:24:32.551Z    info    kube/client.go:113    k8s filtering    {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/s3", "labelSelector": "", "fieldSelector": ""}
2024-08-05T19:24:32.553Z    info    kube/client.go:113    k8s filtering    {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/es", "labelSelector": "", "fieldSelector": ""}
2024-08-05T19:24:32.555Z    info    adapter/receiver.go:46    Starting stanza receiver    {"kind": "receiver", "name": "filelog", "data_type": "logs"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:60    Starting operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "log_emitter", "operator_type": "log_emitter"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:64    Started operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "log_emitter", "operator_type": "log_emitter"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:60    Starting operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move6", "operator_type": "move"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:64    Started operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move6", "operator_type": "move"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:60    Starting operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move5", "operator_type": "move"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:64    Started operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move5", "operator_type": "move"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:60    Starting operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move4", "operator_type": "move"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:64    Started operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move4", "operator_type": "move"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:60    Starting operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move3", "operator_type": "move"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:64    Started operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move3", "operator_type": "move"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:60    Starting operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move2", "operator_type": "move"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:64    Started operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move2", "operator_type": "move"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:60    Starting operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move1", "operator_type": "move"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:64    Started operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move1", "operator_type": "move"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:60    Starting operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "extract_metadata_from_filepath", "operator_type": "regex_parser"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:64    Started operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "extract_metadata_from_filepath", "operator_type": "regex_parser"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:60    Starting operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move", "operator_type": "move"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:64    Started operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move", "operator_type": "move"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:60    Starting operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "parser-docker", "operator_type": "json_parser"}
2024-08-05T19:24:32.555Z    debug    pipeline/directed.go:64    Started operator    {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "parser-docker", "operator_type": "json_parser"}
2024-08-05T19:24:32.556Z    debug    adapter/converter.go:111    Starting log converter    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "converter", "worker_count": 2}
2024-08-05T19:24:32.556Z    info    adapter/receiver.go:46    Starting stanza receiver    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs"}
2024-08-05T19:24:32.556Z    debug    adapter/converter.go:111    Starting log converter    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "converter", "worker_count": 2}
2024-08-05T19:24:32.556Z    info    service@v0.101.0/service.go:195    Everything is ready. Begin running and processing data.
2024-08-05T19:24:32.556Z    warn    localhostgate/featuregate.go:63    The default endpoints for all servers in components will change to use localhost instead of 0.0.0.0 in a future version. Use the feature gate to preview the new default.    {"feature gate ID": "component.UseLocalHostAsDefaultHost"}
2024-08-05T19:24:32.757Z    debug    fileconsumer/file.go:116    matched files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
2024-08-05T19:24:32.757Z    debug    fileconsumer/file.go:148    Consuming files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
2024-08-05T19:24:32.757Z    debug    fileconsumer/file.go:116    matched files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
2024-08-05T19:24:32.757Z    info    fileconsumer/file.go:235    Started watching file    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "path": "/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"}
2024-08-05T19:24:32.757Z    debug    fileconsumer/file.go:148    Consuming files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
2024-08-05T19:24:32.757Z    info    fileconsumer/file.go:235    Started watching file    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "path": "/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"}
2024-08-05T19:24:32.857Z    debug    k8sattributesprocessor@v0.101.0/processor.go:122    evaluating pod identifier    {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/es", "value": [{"Source":{"From":"resource_attribute","Name":"k8s.pod.uid"},"Value":"ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a"},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2024-08-05T19:24:32.857Z    debug    k8sattributesprocessor@v0.101.0/processor.go:140    getting the pod    {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/es", "pod": {"Name":"keycloak-0","Address":"10.192.118.132","PodUID":"ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a","Attributes":{"k8s.namespace.name":"keycloak","k8s.node.name":"ip-10-192-171-234.ap-south-1.compute.internal","k8s.pod.name":"keycloak-0","k8s.pod.start_time":"2024-08-05T19:18:07Z","k8s.pod.uid":"ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a"},"StartTime":"2024-08-05T19:18:07Z","Ignore":false,"Namespace":"keycloak","NodeName":"ip-10-192-171-234.ap-south-1.compute.internal","HostNetwork":false,"Containers":{"ByID":null,"ByName":null},"DeletedAt":"0001-01-01T00:00:00Z"}}
2024-08-05T19:24:32.857Z    debug    k8sattributesprocessor@v0.101.0/processor.go:122    evaluating pod identifier    {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/s3", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2024-08-05T19:24:32.956Z    debug    fileconsumer/file.go:116    matched files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
2024-08-05T19:24:32.956Z    debug    fileconsumer/file.go:148    Consuming files    {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
2024-08-05T19:24:32.956Z    debug    fileconsumer/file.go:116    matched files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
2024-08-05T19:24:32.956Z    debug    fileconsumer/file.go:148    Consuming files    {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
djaglowski commented 1 month ago

I am unable to reproduce this on a single node.

rpsadarangani commented 1 month ago

@djaglowski can you try reproducing this with a StatefulSet if possible? This is where we saw the issue happening.

rpsadarangani commented 1 month ago

@djaglowski the above keycloak pod runs as a StatefulSet, not a Deployment.

djaglowski commented 1 month ago

@rpsadarangani I tested with a StatefulSet and got the same result.


Not sure if related, but do you know why your logs show a malformed list of paths, but only when the list is empty? ("paths": ] vs "paths": ["/var/log/pods/.../keycloak/0.log"])

djaglowski commented 1 month ago

Since I'm unable to reproduce the issue and cannot come up with any theory that explains the described behavior, I'll have to reiterate my request that you reduce the complexity of the scenario. If you can provide a concrete set of k8s specs and commands that demonstrate the issue, I can look into it further.