ocf / puppet

Puppet config for OCF servers and lab machines
https://www.ocf.berkeley.edu/

fix disk pressure with /run/ symlink to work around string comparison #1301

Closed nikhiljha closed 2 years ago

nikhiljha commented 2 years ago

As explained in https://github.com/kubernetes/kubernetes/issues/106957#issuecomment-1167388634, the CRI-O team found performance issues in the CRI stats provider, so kubelet contains a hack that falls back to cadvisor when CRI-O is in use.

This fallback is currently broken and fills our logs with "Unable to fetch pod log stats" messages, which causes some nodes (like jaws, with a 64G disk) to become NotReady due to disk pressure.

This commit works around the issue by sidestepping the string check: /var/run and /run refer to the same directory via a symlink, so pointing kubelet at unix:///run/crio/crio.sock reaches the same socket without matching the string that kubelet compares against.
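To illustrate why changing only the path spelling is enough, here is a minimal Python sketch of the idea. It is not the kubelet source: the function names and the set of matched strings are assumptions made for illustration, and the symlink demo builds a throwaway directory tree rather than touching the real /run.

```python
import os
import tempfile

# Assumption for illustration: the endpoint spellings the fallback check
# matches. The real list lives in kubelet and may differ.
LEGACY_CRIO_ENDPOINTS = {
    "/var/run/crio/crio.sock",
    "unix:///var/run/crio/crio.sock",
}


def uses_cadvisor_fallback(runtime_endpoint: str) -> bool:
    """Hypothetical stand-in for the check: a plain string match,
    with no path normalization and no symlink resolution."""
    return runtime_endpoint in LEGACY_CRIO_ENDPOINTS


def same_target(path_a: str, path_b: str) -> bool:
    """True if both paths resolve to the same file once symlinks are followed."""
    return os.path.realpath(path_a) == os.path.realpath(path_b)


def demo_symlink_equivalence() -> bool:
    """Recreate the run / var/run layout in a temporary directory and
    check that both spellings reach the same socket file."""
    with tempfile.TemporaryDirectory() as root:
        run_dir = os.path.join(root, "run")
        os.makedirs(os.path.join(run_dir, "crio"))
        os.makedirs(os.path.join(root, "var"))
        # var/run is a symlink pointing at run
        os.symlink(run_dir, os.path.join(root, "var", "run"))
        # An empty regular file stands in for the CRI-O socket
        open(os.path.join(run_dir, "crio", "crio.sock"), "w").close()
        return same_target(
            os.path.join(root, "run", "crio", "crio.sock"),
            os.path.join(root, "var", "run", "crio", "crio.sock"),
        )
```

Under these assumptions, `uses_cadvisor_fallback("unix:///var/run/crio/crio.sock")` matches while `uses_cadvisor_fallback("unix:///run/crio/crio.sock")` does not, even though `demo_symlink_equivalence()` shows that both spellings resolve to the same file.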

ocfjenkins[bot] commented 2 years ago

Errored hosts (0)

Changed hosts (4)

Unaffected hosts (42)


Changed hosts
diff for hozer-74.ocf.berkeley.edu, jaws.ocf.berkeley.edu, lockdown.ocf.berkeley.edu, pandemic.ocf.berkeley.edu

```diff
*******************************************
File[/etc/systemd/system/kubelet.service] =>
 parameters =>
  content =>
@@ -8,5 +8,5 @@
 ExecStart=/usr/bin/kubelet --config /etc/kubernetes/kubelet.yaml \
   --container-runtime=remote \
-  --container-runtime-endpoint=unix:///var/run/crio/crio.sock \
+  --container-runtime-endpoint=unix:///run/crio/crio.sock \
   --register-node=true \
   --kubeconfig=/etc/kubernetes/kubelet.conf
*******************************************
Ocf::Systemd::Service[kubelet] =>
 parameters =>
  content =>
@@ -8,5 +8,5 @@
 ExecStart=/usr/bin/kubelet --config /etc/kubernetes/kubelet.yaml \
   --container-runtime=remote \
-  --container-runtime-endpoint=unix:///var/run/crio/crio.sock \
+  --container-runtime-endpoint=unix:///run/crio/crio.sock \
   --register-node=true \
   --kubeconfig=/etc/kubernetes/kubelet.conf
*******************************************
```
Unaffected hosts

```
afterhours.ocf.berkeley.edu
anthrax.ocf.berkeley.edu
autocrat.ocf.berkeley.edu
bedbugs.ocf.berkeley.edu
bigrip.ocf.berkeley.edu
biohazard.ocf.berkeley.edu
blizzard.ocf.berkeley.edu
corruption.ocf.berkeley.edu
coup.ocf.berkeley.edu
dataloss.ocf.berkeley.edu
deadlock.ocf.berkeley.edu
death.ocf.berkeley.edu
dementors.ocf.berkeley.edu
democracy.ocf.berkeley.edu
fallingrocks.ocf.berkeley.edu
falsevacuum.ocf.berkeley.edu
fire.ocf.berkeley.edu
firestorm.ocf.berkeley.edu
flood.ocf.berkeley.edu
fraud.ocf.berkeley.edu
gridlock.ocf.berkeley.edu
hellfire.ocf.berkeley.edu
lethe.ocf.berkeley.edu
lightning.ocf.berkeley.edu
maelstrom.ocf.berkeley.edu
nuke.ocf.berkeley.edu
pestilence.ocf.berkeley.edu
pileup.ocf.berkeley.edu
pox.ocf.berkeley.edu
quarantine.ocf.berkeley.edu
reaper.ocf.berkeley.edu
riptide.ocf.berkeley.edu
scurvy.ocf.berkeley.edu
segfault.ocf.berkeley.edu
shipwreck.ocf.berkeley.edu
solarflare.ocf.berkeley.edu
supernova.ocf.berkeley.edu
tornado.ocf.berkeley.edu
tsunami.ocf.berkeley.edu
vampires.ocf.berkeley.edu
war.ocf.berkeley.edu
worm.ocf.berkeley.edu
```
