I have a kubeadm k8s cluster with 5 nodes. Running kubectl logs works for Pods on 4 of the 5 nodes, but not on the remaining one.
On that problematic node, sudo wg returns the following:
user@ubuntuserver3:~$ sudo wg
interface: kilo0
public key: <somekey>
private key: (hidden)
listening port: <somePort>
This is strange, because on the other nodes I can see a list of peers. At first I thought the host itself might be at fault, so I wiped the VM and reinstalled Ubuntu 20.04 on it. However, the issue remains.
I took a deeper look into the issue and found the following:
Running sudo wg within 60 s after the kilo Pod is restarted actually returns some peers, but they are all gone after roughly 60 s.
I then tried to debug the source code and found:
On the problematic node, the Ready() function is returning false for the other, healthy hosts.
The function returns false because this check fails: time.Now().Unix()-n.LastSeen < int64(checkInPeriod)*2/int64(time.Second)
My understanding is that this checks whether the node has been seen in the past 60 s (the default value of checkInPeriod * 2).
I added some extra logging on the problematic host and was surprised to find that the LastSeen values for the other nodes are all ~60-80 s behind the current UTC time, so those nodes are treated as not ready and are not added as peers to the wg configuration.
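To make the arithmetic concrete, here is a minimal standalone sketch of that comparison. It is not the actual kilo code: seenRecently is a hypothetical helper name, and checkInPeriod is assumed to be 30 s, which I inferred from the 60 s window described above.

```go
package main

import (
	"fmt"
	"time"
)

// checkInPeriod is an assumption: 30s, inferred from the 60s window (checkInPeriod * 2).
const checkInPeriod = 30 * time.Second

// seenRecently is a hypothetical helper that mirrors the comparison quoted above.
// now and lastSeen are Unix timestamps in seconds.
func seenRecently(now, lastSeen int64) bool {
	return now-lastSeen < int64(checkInPeriod)*2/int64(time.Second)
}

func main() {
	now := time.Now().Unix()
	// Ages in seconds; 60 and 80 match the range I see in my extra logging.
	for _, age := range []int64{10, 59, 60, 80} {
		fmt.Printf("last seen %2ds ago -> passes check: %v\n", age, seenRecently(now, now-age))
	}
}
```

With these assumptions, any LastSeen value that is 60 s or more behind the local clock fails the check, which is consistent with the peers disappearing on the problematic node.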
I tried to find out why they were all last seen 60-80 s ago and found that the cache update related logic is set to 5 minutes; however, I'm not sure whether that is related to my issue.
But I have probably missed something, as this issue is not observed on the other 4 of my 5 nodes.
Could you kindly point out whether, besides the cache update logic I found, there is other logic that also updates the LastSeen value for the nodes?
Thank you very much, and this is indeed a great project!
Regards, MockyJoke
My kilo manifest: