[Open] PRNDA opened this issue 1 month ago
Hello @PRNDA,
thank you for reporting your issue.
Could you please upload the inspection report you created at /var/snap/microk8s/4916/inspection-report-20240924_162747.tar.gz? With this information we can better assist you in resolving the issue.
Thank you!
I created this inspection report yesterday, but I found some sensitive information in the logs, so I decided not to upload it here. Is there a way I can send it to you privately?
Hi @PRNDA,
how would you prefer to share it? Would you be able to upload the inspection report somewhere we could pull it from?
Hi @louiseschmidtgen,
I created a private repo here and uploaded the inspection file to it. Could you please accept my repo invitation and then download the file?
Sorry for the inconvenience.
Hello @PRNDA,
I have received your invitation and have access to the logs.
Thank you for sharing the inspection report, I will be having a look shortly.
Linking this issue as possibly related: https://github.com/canonical/microk8s/issues/4293
Hello @PRNDA,
are you able to reproduce this issue on a more recent MicroK8s snap? You are currently running v1.23, which is out of support.
With kind regards, Louise
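For reference, upgrading the MicroK8s snap is normally done by refreshing its channel; the channel below is illustrative only, and the usual guidance is to step through one minor version at a time:

# illustrative: from 1.23, refresh to 1.24 first, then continue one minor version at a time
sudo snap refresh microk8s --channel=1.24/stable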
I'm afraid I cannot; this is a production system, and I'm not allowed to upgrade it.
Have you tried deleting Calico-Node pods?
Will this interrupt the running pods?
Deleting the calico-node pods should not interrupt the execution of other pods: they are managed by a DaemonSet, so Kubernetes will automatically recreate them to maintain network connectivity. However, there might be a temporary disruption in pod networking while the new Calico pods start.
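As a concrete sketch (assuming the MicroK8s default, where calico-node runs as a DaemonSet in the kube-system namespace), the pods can be restarted one at a time so that any networking blip stays confined to a single node:

# list the calico-node pods (default MicroK8s CNI setup assumed)
microk8s kubectl -n kube-system get pods -l k8s-app=calico-node -o wide
# delete one pod; the DaemonSet controller recreates it on the same node
microk8s kubectl -n kube-system delete pod <calico-node-pod-name>
# or perform a controlled rolling restart of the whole DaemonSet
microk8s kubectl -n kube-system rollout restart daemonset/calico-node
microk8s kubectl -n kube-system rollout status daemonset/calico-node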
That temporary disruption in pod networking is exactly what I'm worried about. This cluster is running several online systems, and I don't want them to be affected.
Summary
We have a 4-node MicroK8s HA cluster that has been running for 2 years. Recently we found that the "microk8s.daemon-kubelite" service on all nodes produces a flood of error logs like this:
I tried to restart this service by running
systemctl restart snap.microk8s.daemon-kubelite
but it did not help. I searched for this error message around the web but did not find anything useful. All pods seem to be running fine, and I am still able to update our deployments (though updates are much slower than before).
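For anyone triaging a similar problem, the kubelite errors can be pulled from the systemd journal (standard systemd tooling; the unit name matches the restart command above):

# follow the kubelite logs live
journalctl -u snap.microk8s.daemon-kubelite -f
# or dump a recent window, e.g. the last hour, for attaching to a report
journalctl -u snap.microk8s.daemon-kubelite --since "1 hour ago" --no-pager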
Can someone help me resolve this problem?
Cluster status:
microk8s inspect: