kiwix / operations

Kiwix Kubernetes Cluster
http://charts.k8s.kiwix.org/

Problem of slowness on `storage` is back #284

Closed benoit74 closed 4 weeks ago

benoit74 commented 1 month ago

These are only very early observations so far, but all jobs monitored on the storage node (update of mirrorbrain DB, library and dev library generation) have been taking longer since 13 Oct. 2024, around 5am UTC.

This is close to becoming an issue for library generation, which now takes about 1h.

benoit74 commented 1 month ago

Probably not an issue indeed: the checkarray utility of mdadm started at 4:30am UTC that same day and has been running ever since. It is currently at 75% completion. It moved the big /dev/md3 array into checking status, which is probably consuming a lot of IO. Let's wait for it to complete before gathering more data. It would be interesting to check whether the move to SSD cache (#246) can "hide" this performance degradation.
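
For reference, a minimal sketch of how the check progress could be read programmatically, assuming the usual `/proc/mdstat` layout on Linux (an array stanza followed by a progress line such as `check = 75.0%`). The function name `parse_mdstat_progress` and the sample text are hypothetical and for illustration only; they are not taken from the storage node.

```python
import re
from typing import Dict, Optional

# Start of an array stanza in /proc/mdstat, e.g. "md3 : active raid6 ..."
_ARRAY_RE = re.compile(r"^(md\w+)\s*:")
# Progress line of a running operation, e.g. "[===>..]  check = 75.0% (...)"
_PROGRESS_RE = re.compile(r"\b(check|resync|recovery|reshape)\s*=\s*([\d.]+)%")

# Hypothetical sample in the usual /proc/mdstat layout (not real data from the node).
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid6]
md2 : active raid1 sda2[0] sdb2[1]
      523264 blocks super 1.2 [2/2] [UU]

md3 : active raid6 sda3[0] sdb3[1] sdc3[2] sdd3[3]
      1000000 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/4] [UUUU]
      [===============>.....]  check = 75.0% (750000/1000000) finish=100.0min speed=1000K/sec

unused devices: <none>
"""


def parse_mdstat_progress(text: str) -> Dict[str, Dict[str, object]]:
    """Map each array with an operation in progress to its action and percentage."""
    result: Dict[str, Dict[str, object]] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        array = _ARRAY_RE.match(line)
        if array:
            # Remember which array the following indented lines belong to.
            current = array.group(1)
            continue
        progress = _PROGRESS_RE.search(line)
        if progress and current is not None:
            result[current] = {
                "action": progress.group(1),
                "percent": float(progress.group(2)),
            }
    return result


if __name__ == "__main__":
    # On a real host, the text would come from open("/proc/mdstat").read().
    for name, info in parse_mdstat_progress(SAMPLE_MDSTAT).items():
        print(f"/dev/{name}: {info['action']} at {info['percent']}%")
```

Logging this alongside job durations would make it easier to correlate a slowdown with a running check, rather than spotting it by hand.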

benoit74 commented 4 weeks ago

The problem is gone; it was indeed only linked to RAID maintenance operations. Good to know that a check takes days on this array. It would be very interesting to see the impact of the SSD cache on the next maintenance run once it is in place.