Open pr0PM opened 1 month ago
Pinging code owners:
receiver/filelog: @djaglowski
See Adding Labels via Comments if you do not have permissions to add labels yourself.
I'm curious how you concluded that this is related to the poll interval. I can't see how that would be the case. The poll interval just defines how often we look for files. If you use any reasonable value then it's effectively just checking for the files repeatedly. If it can't find them, that's because they don't exist or because the collector doesn't have access to them (in which case they might as well not exist as far as the collector is concerned).
Not a conclusion, but I assumed it might be an issue there since that was something I was able to find in the docs; it could be something else altogether. The problem is that the collector is not picking up new matching log files that appear after it failed to find any during startup (because none matching the pattern existed at the time). It then processes them if I force the collector to restart.
@pr0PM Based on the config it appears there is no storage extension involved. Can you confirm that is the case?
@djaglowski that's right, no storage extensions. Would using one help us here? I've never tried any of them. I'm parsing Keycloak (2 pods) events from logs and pushing them to S3 and Elasticsearch with all the metadata.
Using a storage extension is helpful in many cases but for the sake of diagnosing this issue it would only complicate things.
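(For reference, if persistence across collector restarts is wanted later, a typical `file_storage` setup looks roughly like the sketch below. This is illustrative only, not the reporter's config; the directory path is an assumption and must exist and be writable by the collector.)

```yaml
extensions:
  file_storage:
    directory: /var/lib/otelcol/file_storage  # assumed path; must exist and be writable

receivers:
  filelog:
    include: [ /var/log/pods/*/*/*.log ]  # illustrative glob
    storage: file_storage                 # point the receiver at the extension

service:
  extensions: [file_storage]
```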
Given that there is no extension, you can be sure there is no state shared between the collectors. Therefore, I think you should look at this issue from the perspective of one node that is not behaving as expected.
If I understand correctly, that means for a "dormant" node:
IMO, the behavior you described is very unlikely to be caused by this sequence of events. Much more likely, there is some incorrect assumption about what is actually happening.
If possible, I would consider simplifying this to a 1 node cluster until you get it working. I don't see any reason why having multiple nodes can explain any behavior here unless you're misunderstanding which pods are being deployed to which nodes.
Trying to capture the process in this image for clarity to explain what I meant by dormant node and answer the questions better.
Here the collector is deployed on k8s as a DaemonSet, so each node runs a collector pod by default. My workload is a Deployment or StatefulSet with 2 replicas.
First box: nothing matches `/var/log/pods/....`, so we see `"error": "no files match the configured criteria"` in the collector logs.
Second box: the collector logs `info fileconsumer/file.go:235 Started watching file` and starts processing the file.
Third box:
Now for the last 3 questions:
Are you 100% certain that the other pod is running on the expected node?
yes it is as explained above
Are you 100% certain that the log file was created at the specified location?
yes, I even ssh'd into the node to confirm this
Are you 100% certain that the new collector pod is redeployed to the same node?
yes k8s daemonset makes sure it's deployed on each node
Thanks for the diagram and detailed answers.
Can you try enabling debug logging for the collector and sharing a more complete log?
service:
pipelines:
logs: ...
telemetry:
logs:
level: DEBUG
encoding: console
There were too many logs here, so I redacted the most redundant parts; please let me know if I should share more detailed ones.
node-1 while target workload is on it
2024-08-05T19:22:32.697Z debug fileconsumer/file.go:148 Consuming files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_65eff8a6-7e51-4437-9cf8-dfbd5b836040/keycloak/0.log"]}
2024-08-05T19:22:32.699Z debug fileconsumer/file.go:116 matched files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_65eff8a6-7e51-4437-9cf8-dfbd5b836040/keycloak/0.log"]}
2024-08-05T19:22:32.699Z debug fileconsumer/file.go:148 Consuming files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_65eff8a6-7e51-4437-9cf8-dfbd5b836040/keycloak/0.log"]}
2024-08-05T19:22:32.897Z debug fileconsumer/file.go:116 matched files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_65eff8a6-7e51-4437-9cf8-dfbd5b836040/keycloak/0.log"]}
2024-08-05T19:22:32.897Z debug fileconsumer/file.go:148 Consuming files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_65eff8a6-7e51-4437-9cf8-dfbd5b836040/keycloak/0.log"]}
2024-08-05T19:22:32.899Z debug fileconsumer/file.go:116 matched files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_65eff8a6-7e51-4437-9cf8-dfbd5b836040/keycloak/0.log"]}
2024-08-05T19:22:32.899Z debug fileconsumer/file.go:148 Consuming files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_65eff8a6-7e51-4437-9cf8-dfbd5b836040/keycloak/0.log"]}
node-1 after removal of the target service
2024-08-05T19:19:57.968Z debug fileconsumer/file.go:114 finding files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:57.968Z debug fileconsumer/file.go:114 finding files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.168Z debug fileconsumer/file.go:114 finding files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.169Z debug fileconsumer/file.go:114 finding files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.368Z debug fileconsumer/file.go:114 finding files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.368Z debug fileconsumer/file.go:114 finding files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.568Z debug fileconsumer/file.go:114 finding files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.568Z debug fileconsumer/file.go:114 finding files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.769Z debug fileconsumer/file.go:114 finding files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:58.769Z debug fileconsumer/file.go:114 finding files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:19:59.769Z debug fileconsumer/file.go:116 matched files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}
2024-08-05T19:19:59.769Z debug fileconsumer/file.go:148 Consuming files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}
2024-08-05T19:19:59.769Z debug fileconsumer/file.go:116 matched files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}
2024-08-05T19:19:59.769Z debug fileconsumer/file.go:148 Consuming files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}
node-2: it has logged the same thing continuously since startup; nothing new appears even when the new workload pod gets scheduled on it (2nd box)
2024-08-05T19:18:29.932Z debug fileconsumer/file.go:114 finding files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:18:29.933Z debug fileconsumer/file.go:116 matched files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}
2024-08-05T19:18:29.933Z debug fileconsumer/file.go:148 Consuming files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}
2024-08-05T19:18:29.932Z debug fileconsumer/file.go:114 finding files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "error": "no files match the configured criteria"}
2024-08-05T19:18:29.933Z debug fileconsumer/file.go:116 matched files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}
2024-08-05T19:18:29.933Z debug fileconsumer/file.go:148 Consuming files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ]}
node-2 after restart and logging working again (3rd box)
2024-08-05T19:24:32.545Z info service@v0.101.0/service.go:102 Setting up own telemetry...
2024-08-05T19:24:32.546Z info service@v0.101.0/telemetry.go:103 Serving metrics {"address": ":8888", "level": "Normal"}
2024-08-05T19:24:32.546Z debug exporter@v0.101.0/exporter.go:273 Alpha component. May change in the future. {"kind": "exporter", "data_type": "logs", "name": "awss3"}
2024-08-05T19:24:32.546Z debug exporter@v0.101.0/exporter.go:273 Beta component. May change in the future. {"kind": "exporter", "data_type": "logs", "name": "elasticsearch/log"}
2024-08-05T19:24:32.546Z debug processor@v0.101.0/processor.go:301 Beta component. May change in the future. {"kind": "processor", "name": "batch", "pipeline": "logs/es"}
2024-08-05T19:24:32.546Z debug processor@v0.101.0/processor.go:301 Beta component. May change in the future. {"kind": "processor", "name": "resource", "pipeline": "logs/es"}
2024-08-05T19:24:32.546Z debug processor@v0.101.0/processor.go:301 Beta component. May change in the future. {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/es"}
2024-08-05T19:24:32.547Z debug processor@v0.101.0/processor.go:301 Beta component. May change in the future. {"kind": "processor", "name": "batch/s3", "pipeline": "logs/s3"}
2024-08-05T19:24:32.547Z debug processor@v0.101.0/processor.go:301 Beta component. May change in the future. {"kind": "processor", "name": "resource", "pipeline": "logs/s3"}
2024-08-05T19:24:32.547Z debug processor@v0.101.0/processor.go:301 Beta component. May change in the future. {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/s3"}
2024-08-05T19:24:32.547Z debug receiver@v0.101.0/receiver.go:308 Beta component. May change in the future. {"kind": "receiver", "name": "filelog", "data_type": "logs"}
2024-08-05T19:24:32.548Z debug regex/config.go:83 configured memory cache {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "extract_metadata_from_filepath", "size": 128}
2024-08-05T19:24:32.548Z debug receiver@v0.101.0/receiver.go:308 Beta component. May change in the future. {"kind": "receiver", "name": "filelog/s3", "data_type": "logs"}
2024-08-05T19:24:32.548Z debug regex/config.go:83 configured memory cache {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "operator_id": "continue_s3_input", "size": 128}
2024-08-05T19:24:32.550Z info service@v0.101.0/service.go:169 Starting otelcol-contrib... {"Version": "0.101.0", "NumCPU": 8}
2024-08-05T19:24:32.550Z info extensions/extensions.go:34 Starting extensions...
2024-08-05T19:24:32.551Z info kube/client.go:113 k8s filtering {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/s3", "labelSelector": "", "fieldSelector": ""}
2024-08-05T19:24:32.553Z info kube/client.go:113 k8s filtering {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/es", "labelSelector": "", "fieldSelector": ""}
2024-08-05T19:24:32.555Z info adapter/receiver.go:46 Starting stanza receiver {"kind": "receiver", "name": "filelog", "data_type": "logs"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:60 Starting operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "log_emitter", "operator_type": "log_emitter"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:64 Started operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "log_emitter", "operator_type": "log_emitter"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:60 Starting operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move6", "operator_type": "move"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:64 Started operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move6", "operator_type": "move"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:60 Starting operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move5", "operator_type": "move"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:64 Started operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move5", "operator_type": "move"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:60 Starting operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move4", "operator_type": "move"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:64 Started operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move4", "operator_type": "move"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:60 Starting operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move3", "operator_type": "move"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:64 Started operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move3", "operator_type": "move"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:60 Starting operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move2", "operator_type": "move"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:64 Started operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move2", "operator_type": "move"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:60 Starting operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move1", "operator_type": "move"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:64 Started operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move1", "operator_type": "move"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:60 Starting operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "extract_metadata_from_filepath", "operator_type": "regex_parser"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:64 Started operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "extract_metadata_from_filepath", "operator_type": "regex_parser"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:60 Starting operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move", "operator_type": "move"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:64 Started operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "move", "operator_type": "move"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:60 Starting operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "parser-docker", "operator_type": "json_parser"}
2024-08-05T19:24:32.555Z debug pipeline/directed.go:64 Started operator {"kind": "receiver", "name": "filelog", "data_type": "logs", "operator_id": "parser-docker", "operator_type": "json_parser"}
2024-08-05T19:24:32.556Z debug adapter/converter.go:111 Starting log converter {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "converter", "worker_count": 2}
2024-08-05T19:24:32.556Z info adapter/receiver.go:46 Starting stanza receiver {"kind": "receiver", "name": "filelog/s3", "data_type": "logs"}
2024-08-05T19:24:32.556Z debug adapter/converter.go:111 Starting log converter {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "converter", "worker_count": 2}
2024-08-05T19:24:32.556Z info service@v0.101.0/service.go:195 Everything is ready. Begin running and processing data.
2024-08-05T19:24:32.556Z warn localhostgate/featuregate.go:63 The default endpoints for all servers in components will change to use localhost instead of 0.0.0.0 in a future version. Use the feature gate to preview the new default. {"feature gate ID": "component.UseLocalHostAsDefaultHost"}
2024-08-05T19:24:32.757Z debug fileconsumer/file.go:116 matched files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
2024-08-05T19:24:32.757Z debug fileconsumer/file.go:148 Consuming files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
2024-08-05T19:24:32.757Z debug fileconsumer/file.go:116 matched files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
2024-08-05T19:24:32.757Z info fileconsumer/file.go:235 Started watching file {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "path": "/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"}
2024-08-05T19:24:32.757Z debug fileconsumer/file.go:148 Consuming files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
2024-08-05T19:24:32.757Z info fileconsumer/file.go:235 Started watching file {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "path": "/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"}
2024-08-05T19:24:32.857Z debug k8sattributesprocessor@v0.101.0/processor.go:122 evaluating pod identifier {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/es", "value": [{"Source":{"From":"resource_attribute","Name":"k8s.pod.uid"},"Value":"ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a"},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2024-08-05T19:24:32.857Z debug k8sattributesprocessor@v0.101.0/processor.go:140 getting the pod {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/es", "pod": {"Name":"keycloak-0","Address":"10.192.118.132","PodUID":"ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a","Attributes":{"k8s.namespace.name":"keycloak","k8s.node.name":"ip-10-192-171-234.ap-south-1.compute.internal","k8s.pod.name":"keycloak-0","k8s.pod.start_time":"2024-08-05T19:18:07Z","k8s.pod.uid":"ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a"},"StartTime":"2024-08-05T19:18:07Z","Ignore":false,"Namespace":"keycloak","NodeName":"ip-10-192-171-234.ap-south-1.compute.internal","HostNetwork":false,"Containers":{"ByID":null,"ByName":null},"DeletedAt":"0001-01-01T00:00:00Z"}}
2024-08-05T19:24:32.857Z debug k8sattributesprocessor@v0.101.0/processor.go:122 evaluating pod identifier {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/s3", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2024-08-05T19:24:32.956Z debug fileconsumer/file.go:116 matched files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
2024-08-05T19:24:32.956Z debug fileconsumer/file.go:148 Consuming files {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
2024-08-05T19:24:32.956Z debug fileconsumer/file.go:116 matched files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
2024-08-05T19:24:32.956Z debug fileconsumer/file.go:148 Consuming files {"kind": "receiver", "name": "filelog/s3", "data_type": "logs", "component": "fileconsumer", "component": "fileconsumer", "paths": ["/var/log/pods/keycloak_keycloak-0_ee75b8d6-5630-4be3-84fd-d0a4fe6cd30a/keycloak/0.log"]}
I am unable to reproduce this on a single node.
@djaglowski can you try reproducing this with a StatefulSet if possible? This is where we saw the issue happening.
@djaglowski the above Keycloak pod runs as a StatefulSet, not a Deployment.
@rpsadarangani I tested with a stateful set and get the same result.
Not sure if related, but do you know why your logs show a malformed list of paths, but only when the list is empty? (`"paths": ]` vs `"paths": ["/var/log/pods/.../keycloak/0.log"]`)
Since I'm unable to reproduce the issue and cannot come up with any theory that explains the described behavior, I'll have to reiterate my request that you reduce the complexity of the scenario. If you can provide a concrete set of k8s specs and commands that demonstrate the issue, I can look into it further.
Component(s)
receiver/filelog
What happened?
Description
The filelog receiver is not picking up matching log files, unless restarted, when new pods with a matching pattern get scheduled on the k8s node.
I'll try to explain this with a detailed example. Let's say I have a 5-node cluster, while my filelog config matches files corresponding to a deployment with 3 replicas (each on a unique node). In this case 3 pods of the OTEL collector will start processing the logs, while the other 2 OTEL collector pods will be idle. Now say there is a rollout restart of the target deployment and 2 pods get scheduled on the nodes where the OTEL pods were idle. In my case those idle OTEL pods don't pick up the logs from the new files and stay dormant. I read that the filelog receiver has a `poll_interval` config, which doesn't seem to be working here in our case.
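For context, a minimal filelog receiver config showing `poll_interval` might look like the sketch below. The include pattern is illustrative, not the actual config from this cluster; 200ms is the documented default interval at which the receiver re-globs for matching files.

```yaml
receivers:
  filelog:
    include:
      - /var/log/pods/keycloak_*/*/*.log  # illustrative pattern for the workload's pods
    start_at: beginning
    poll_interval: 200ms  # how often the receiver re-globs for matching files (default 200ms)
```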
When otelcol starts reading logs from a file, its logs look something like this if matching pods are present on the node:
and for idle otelcol DaemonSet pods it looks something like this:
Now, if a workload pod moves to a node where the OTEL collector pod was in a dormant state, that collector pod doesn't start processing.
Steps to Reproduce
(the collector logs `no files match the configured criteria` in that case)
Expected Result
The OTEL collector pod should poll for log files and pick up any new files as they arrive.
Actual Result
The OTEL collector doesn't start processing the log files automatically as they arrive unless I forcefully restart the OTEL collector pod on the node where previously no matching log files were present.
Collector version
v0.101.0
Environment information
Environment
OS: Amazon Linux 2 EKS v1.26.12-eks Node
OpenTelemetry Collector configuration
Log output
No response
Additional context
No response