Closed clintkev251 closed 1 year ago
@clintkev251: Thanks for opening an issue, it is currently awaiting triage.
@clintkev251 Did you ever find a solution to this problem?
I found a suitable workaround, at least. I closed this issue mostly because further research led me to believe that it has more to do with how the helm chart is set up than with how Crowdsec actually watches logs. My workaround for the moment, which has been rock solid, was to add poll_without_inotify: true to each acquisition file source. It is noted that this can increase CPU usage, but I didn't notice much of an impact after some study, so I'm happy enough with it. This appears to be a newer option that the helm chart does not currently support, so I've opened a pull request over there to add it to the schema for additionalAcquisition file types: https://github.com/crowdsecurity/helm-charts/pull/109. Ideally it can also be added as an option to the automatically configured acquisitions.
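For anyone applying the same workaround by hand, a file acquisition entry with the option set might look roughly like the sketch below. The filename pattern and the label value are placeholders I made up for illustration, not values from my cluster, so adjust them to your own pods and container runtime.

```yaml
# Sketch of a file acquisition entry using the workaround.
# The path pattern and label below are hypothetical placeholders.
source: file
filenames:
  - /var/log/containers/my-app-*_my-namespace_*.log
# Poll the files for changes instead of relying on inotify events.
# As noted above, this may increase CPU usage.
poll_without_inotify: true
labels:
  type: containerd
```

The option goes on each file source separately, so every acquisition entry that reads rotated container logs needs its own poll_without_inotify line.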
@clintkev251 Thanks for the response! Looking forward to your PR being merged.
What happened?
I recently migrated to the official Crowdsec helm chart for my Crowdsec deployment in k3s, and everything worked great for the first 12 hours or so. Then I noticed that, one by one, each of my agent pods stopped recording any acquisitions. Restarting the pods got them working again, but after some time they failed again. Digging deeper, I found that this occurs when k3s rotates the container log. When that happens, the agent pod emits the following logs:
I can still exec into the pod and manually tail that log and see all the new lines coming in, but Crowdsec is never able to pick back up. The logs contained in /var/log/containers are symlinks to the actual log files which are in /var/log/pods/// so it's possible this is part of the issue.
What did you expect to happen?
Crowdsec should be able to reopen the log file after the log rotation has completed.
How can we reproduce it (as minimally and precisely as possible)?
Use the official helm chart and add a pod to the acquisition config. Wait until the log reaches the maximum size configured for your cluster (10 MB by default) and k3s rotates it. After the log has been rotated, check whether the Crowdsec pod on that node is still picking up acquisitions.
Anything else we need to know?
No response