kiwix / operations

Kiwix Kubernetes Cluster
http://charts.k8s.kiwix.org/

Week 37 2024 routine #251

Closed kiwixbot closed 2 months ago

kiwixbot commented 2 months ago

Check nodes free space

df -h / && df -h /data

Nodes system upgrades

apt update && apt upgrade

(Regular updates of worker nodes are done separately, on a monthly basis, so as not to impact production.)

Backups

k8s cluster

Stats

matomo - stats.kiwix.org

Grafana

Projects

Security

Note: this is an automatic reminder intended for the assignee(s).

benoit74 commented 2 months ago

Storage

Machine        Filesystem  Size  Used  Avail  Use%  Use change
bastion        /           37G   15G   21G    42%   +1G, +2%
stats          /           233G  109G  113G   50%   +1G
services       /           456G  311G  122G   72%   +2G
storage        /           147G  18G   122G   13%   NEW
storage        /data       30T   18T   12T    62%   NEW
imager-worker  /           1.9T  452G  1.4T   26%   don't care
sisyphus       /           233G  23G   198G   11%   don't care
ondemand       /           25G   9.7G  14G    42%   -
ondemand       /data       216G  204M  205G   1%    don't care
mirrors-qa     /           38G   3.6G  33G    10%   -
demo           /           40G   9.4G  28G    26%   -
demo           /data       1.8T  925G  739G   56%   don't care
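
For anyone wanting to script this table instead of reading `df -h` by hand, here is a minimal Python sketch. It is not part of the routine; the function names `use_percent` and `report` are hypothetical. It assumes `Use%` is computed as used / (used + avail), rounded up, which is what GNU `df` appears to do and which matches most rows above (e.g. bastion: 15G used, 21G available, 42%).

```python
import shutil


def use_percent(used: int, avail: int) -> int:
    """Use% as used / (used + avail), rounded up (assumed df behaviour)."""
    total = used + avail
    if total == 0:
        return 0
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-100 * used // total)


def report(paths=("/", "/data")):
    """Return (path, used_bytes, avail_bytes, use_percent) for each existing path."""
    rows = []
    for path in paths:
        try:
            usage = shutil.disk_usage(path)
        except OSError:
            # The path does not exist on this node (e.g. no /data mount).
            continue
        rows.append((path, usage.used, usage.free, use_percent(usage.used, usage.free)))
    return rows


if __name__ == "__main__":
    for path, used, avail, pct in report():
        print(f"{path}: used={used} avail={avail} use={pct}%")
```

Running it on each node and keeping last week's output would be enough to fill in the "Use change" column.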

k8s

A significant wave of pod restarts occurred on the bastion node (scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd) 4d18h ago, and a smaller one on the stats node (scw-kiwix-prod-foreign-7aaa98d57ede4cf0ba95d14) 5d22h ago.

cert-manager     cert-manager-cainjector-9d956987c-mj7x5                      1/1     Running     36 (4d18h ago)   41d     100.64.2.125      scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>
cert-manager     cert-manager-fdd97855b-m8stj                                 1/1     Running     4 (4d18h ago)    41d     100.64.2.126      scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>
cert-manager     cert-manager-webhook-9f799c7d7-8lkrp                         1/1     Running     1 (4d18h ago)    41d     100.64.2.124      scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>
debug            debug-network-tools-daemonset-7mf9c                          1/1     Running     1 (4d18h ago)    371d    100.64.2.127      scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>
grafana          grafana-k8s-monitoring-alloy-0                               2/2     Running     2 (4d18h ago)    6d18h   100.64.2.130      scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>
grafana          grafana-k8s-monitoring-alloy-events-6b8df4555c-k92wb         2/2     Running     2 (4d18h ago)    39d     100.64.2.128      scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>
grafana          grafana-k8s-monitoring-alloy-logs-vkd55                      2/2     Running     2 (4d18h ago)    39d     100.64.2.123      scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>
grafana          grafana-k8s-monitoring-kube-state-metrics-78c4cb9dd7-g76w5   1/1     Running     39 (5d22h ago)   8d      100.64.7.107      scw-kiwix-prod-foreign-7aaa98d57ede4cf0ba95d14   <none>           <none>
grafana          grafana-k8s-monitoring-prometheus-node-exporter-lf7q5        1/1     Running     1 (4d18h ago)    32d     10.200.106.85     scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>
ingress-nginx    ingress-nginx-controller-s7bv8                               1/1     Running     1 (4d18h ago)    39d     100.64.2.129      scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>
kube-system      coredns-575bf8666d-b7gc4                                     1/1     Running     1 (4d18h ago)    313d    100.64.2.121      scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>
kube-system      kilo-x6mlz                                                   1/1     Running     1 (4d18h ago)    158d    10.200.106.85     scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>
kube-system      konnectivity-agent-5fvrp                                     1/1     Running     1 (4d18h ago)    41d     10.200.106.85     scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>
kube-system      kube-proxy-52kp9                                             1/1     Running     1 (4d18h ago)    41d     10.200.106.85     scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>
kube-system      metrics-server-c84d88667-wbwpq                               1/1     Running     1 (4d18h ago)    6d18h   100.64.2.122      scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>
zimit            cm-acme-http-solver-6nll7                                    1/1     Running     1 (4d18h ago)    107d    100.64.2.120      scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd   <none>           <none>

No explanation found in the logs regarding why all these pods restarted.
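
To make this kind of pattern easier to spot, a listing like the one above can be summed per node. The following is only a sketch, assuming the wide pod listing layout shown here (namespace, name, ready, status, restarts, age, IP, node); `POD_LINE` and `restarts_by_node` are hypothetical names, and the sample lines are copied from the listing with the whitespace collapsed.

```python
import re
from collections import defaultdict

# One line of the wide pod listing; the RESTARTS column may be followed
# by a "(<duration> ago)" suffix giving the time of the last restart.
POD_LINE = re.compile(
    r"^(?P<namespace>\S+)\s+(?P<name>\S+)\s+(?P<ready>\d+/\d+)\s+(?P<status>\S+)\s+"
    r"(?P<restarts>\d+)(?:\s+\((?P<last_restart>\S+) ago\))?\s+"
    r"(?P<age>\S+)\s+(?P<ip>\S+)\s+(?P<node>\S+)"
)


def restarts_by_node(lines):
    """Sum the restart counters per node; lines that do not parse are skipped."""
    totals = defaultdict(int)
    for line in lines:
        match = POD_LINE.match(line.strip())
        if match is None:
            continue  # header line, blank line, or unexpected layout
        totals[match["node"]] += int(match["restarts"])
    return dict(totals)


SAMPLE = [
    "cert-manager cert-manager-cainjector-9d956987c-mj7x5 1/1 Running 36 (4d18h ago) 41d 100.64.2.125 scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd <none> <none>",
    "cert-manager cert-manager-fdd97855b-m8stj 1/1 Running 4 (4d18h ago) 41d 100.64.2.126 scw-kiwix-prod-foreign-fd47028c64314d1d98b4cbd <none> <none>",
    "grafana grafana-k8s-monitoring-kube-state-metrics-78c4cb9dd7-g76w5 1/1 Running 39 (5d22h ago) 8d 100.64.7.107 scw-kiwix-prod-foreign-7aaa98d57ede4cf0ba95d14 <none> <none>",
]

if __name__ == "__main__":
    for node, total in restarts_by_node(SAMPLE).items():
        print(node, total)
```

The counters are cumulative over each pod's lifetime, so the totals only hint at which node was affected; the "ago" value still has to be read to date the incident.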

SSL.com

1035 signatures left (out of 1200). @rgaudin is this normal? It looks like we are close to exhausting the signatures before the end of the year.

rgaudin commented 2 months ago

1035 signatures left (out of 1200). @rgaudin is this normal? It looks like we are close to exhausting the signatures before the end of the year.

It's the number of signatures left, and the year runs from August to August. We've already used just under 200 because of the setup.

benoit74 commented 2 months ago

zimit