Closed gedimin45 closed 7 years ago
So the `undefined method '[]' for nil:NilClass` error from fluentd is a known upstream bug whose cause has not been identified yet: https://github.com/fluent/fluentd/issues/1248
As for the memory consumption increase, given that fluentd and nsq need to process all messages, I'd imagine memory consumption would stay high until the messages have been processed. Your throughput is simply lower than your input rate. Is there some form of enhancement you'd like to see from that?
The memory consumption did not go down one bit, even a day after the spike.
I think that when the `undefined method` error occurs, the object is not removed from memory, but that is just a guess.
Would it be possible to predict a memory limit for the fluentd pod? Perhaps set a sane default in the chart?
The limit is commented out by default, for clusters that want to run the monitoring stack with unlimited resources, but it is there. "Sane defaults" differ from cluster to cluster, so there's no silver bullet.
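For illustration only, a memory limit on the fluentd container would look roughly like the standard Kubernetes `resources` block below. The numbers are placeholders, not recommendations, and where exactly this block sits in the chart's values file is an assumption; check the commented-out section in the chart itself. Keep in mind that a container exceeding its memory limit gets OOM-killed and restarted, so a limit bounds the leak rather than fixing it.

```yaml
# Hypothetical values; tune them for your own cluster and log volume.
resources:
  requests:
    memory: 200Mi   # what the scheduler reserves for the pod
  limits:
    memory: 500Mi   # the container is OOM-killed above this
```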
Got it. Pretty sure this is a bug in fluentd but I do not feel like debugging Ruby code that might have a bug under high load 😄 Thanks for the suggestions!
I deployed an app that output logs at a rapid pace (tens of messages per second) and the fluentd pod's memory consumption increased. That makes sense, but after I killed the app that was spamming logs, the memory consumption stayed high. It seems like some messages just stay in memory. I have also encountered this in the logs: