Open chrono2002 opened 4 months ago
reload only once when all operations in ns finishes
how does fluent-operator know when your operations are done?
in my opinion, you can control the create/delete order in your CI system, and that will resolve this problem.
how exactly are you suggesting we control it? we have a helm chart that simply installs parsers, filters and outputs. we've tried placing the parsers section before filters, and the filters section before parsers, with no luck.
@cw-Guo we use gitops to deploy, and when we deploy a bigger application, many fluent-operator CRs get created, which seems to trigger many reloads on the fluent-bit pods.
This causes trouble for us, as fluent-bit starts hanging from time to time (https://github.com/fluent/fluent-operator/issues/1332).
It seems fluent-bit has some issues with hot reload: https://github.com/fluent/fluent-bit/issues/9354
While these are most probably fluent-bit bugs, being a bit more "kind" with the reload requests could help.
How about this: instead of reloading immediately on every CR change, fluent-operator would "collect" the changes for a configurable period (like 1 minute) and trigger a single reload at the end of that period, and only if any change happened during it.
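To make the proposal concrete, here is a minimal Python sketch of that "collect, then reload once" behaviour. It is not fluent-operator code: `ReloadCoalescer`, `notify_change`, `tick` and the `reload` callback are hypothetical names, and the 60-second window is just the example value from the comment above.

```python
import time
from typing import Callable, Optional


class ReloadCoalescer:
    """Collects change notifications and fires at most one reload per window.

    Hypothetical illustration of the proposal, not part of fluent-operator.
    """

    def __init__(
        self,
        reload: Callable[[], None],
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._reload = reload
        self._window = window_seconds
        self._clock = clock
        # Time of the first change in the current window; None means no pending change.
        self._window_start: Optional[float] = None

    def notify_change(self) -> None:
        """Record a CR change. Only the first change opens a window."""
        if self._window_start is None:
            self._window_start = self._clock()

    def tick(self) -> bool:
        """Call periodically. Reloads once if a window is open and has elapsed."""
        if self._window_start is None:
            return False  # nothing changed, nothing to reload
        if self._clock() - self._window_start < self._window:
            return False  # still collecting changes
        self._window_start = None
        self._reload()
        return True
```

With this shape, deleting or creating dozens of CRs within one window results in a single call to `reload`, and no call at all if nothing changed. The trade-off is that a configuration change can take up to one window to reach fluent-bit.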
ping @markusthoemmes
I'm not really active in this project right now, but I did solve this internally eventually. Essentially, I've created a script that gets the current reload count (GET "http://0.0.0.0:2020/api/v2/reload") and then runs a hot reload. Afterwards it gets the reload count again. If it is the same as before, the script retries the reload. The need for that was supposed to be fixed via https://github.com/fluent/fluent-bit/issues/8457 though, so now we should be able to handle the return value of the reload and retry on error.
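The original script is not shown, so the following is only a rough Python sketch of the check-reload-check loop described above. It assumes that the GET response is JSON with a `hot_reload_count` field and that a POST to the same URL triggers the hot reload. The function names, retry count and delay are hypothetical.

```python
import json
import time
import urllib.request

# URL taken from the comment above; adjust host and port for your setup.
RELOAD_URL = "http://0.0.0.0:2020/api/v2/reload"


def get_reload_count(url: str = RELOAD_URL) -> int:
    """Read the current reload counter (assumed field: hot_reload_count)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)["hot_reload_count"]


def trigger_reload(url: str = RELOAD_URL) -> None:
    """Ask fluent-bit to hot reload (assumed: POST with an empty JSON body)."""
    req = urllib.request.Request(
        url,
        data=b"{}",
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        resp.read()


def reload_with_retry(
    get_count=get_reload_count,
    trigger=trigger_reload,
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep=time.sleep,
) -> bool:
    """Trigger a reload and retry until the counter changes or attempts run out."""
    for _ in range(attempts):
        before = get_count()
        trigger()
        sleep(delay_seconds)  # give fluent-bit time to apply the reload
        if get_count() != before:
            return True  # counter moved, so the reload took effect
    return False
```

Calling `reload_with_retry()` returns `True` as soon as the counter changes and `False` if it never does, which leaves the caller free to decide whether to restart the pod instead.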
Describe the issue
We've got CI which deploys filters, parsers and outputs into several namespaces. It works like this: before deployment, it deletes everything in the namespace.
Starting with version 2.7.0, we get the following errors:
It looks like fluent-bit is reloading on every object deletion. And when parsers are deleted before filters, it gets stuck and crashes, then restarts normally.
To Reproduce
Expected behavior
Your Environment
How did you install fluent operator?
helm
Additional context
No response