Closed NicoledeGreef closed 2 months ago
Zookeeper does not have any memory requests or limits defined in the Helm chart values for Solr or the MFIN data catalogue. As seen in the capture, Zookeeper keeps restarting after reaching the default memory limits. Because Apache Solr depends on Zookeeper, it keeps restarting as well.
A possible remediation would be to define memory requests and limits for Zookeeper in each of the affected namespaces.
@chrislaick As discussed, I have changed the Zookeeper pod memory limit from 256Mi to 512Mi. The Zookeeper and Apache Solr pods will be monitored for any restarts and for memory utilization patterns.
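For reference, a minimal sketch of what this change might look like in a Helm values file. The `zookeeper.resources` key path and the request value are assumptions for illustration and are not taken from the actual charts; only the 512Mi limit comes from this thread.

```yaml
# Hypothetical values.yaml fragment; the `zookeeper.resources` key path is an assumption
zookeeper:
  resources:
    requests:
      memory: 256Mi   # assumed placeholder; the request value is not stated in this thread
    limits:
      memory: 512Mi   # raised from 256Mi, as described in the comment above
```

Setting an explicit request alongside the limit gives the scheduler a known baseline for the pod instead of relying on namespace defaults.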
@kardamk Looking at Sysdig after the change, the Zookeeper pod has stabilized with memory usage around 400MiB. Amazing to see that the restarts were happening every 20 minutes. Let's make the same change in TEST and PROD if that hasn't been done already.
@chrislaick I monitored the memory usage of the Zookeeper pod and, once it had stabilized, synced the changes to the TEST and PROD environments as well.
Looks like we can close this one? :)
Yes, the crashlooping of the Zookeeper and Solr pods has been resolved since the change.
Thanks very much @kardamk
Namespace contacts are receiving emails from Platform Services re: "Your action required: Crashlooping pods in Silver/ea352d-test" (for dev as well).
This ticket is for taking a deeper dive into the relationship between Zookeeper and Solr. When the issue was looked into, Zookeeper had already cleaned up any errors, so nothing was evident.