Closed 09cicada closed 2 months ago
@09cicada Hi, could you provide the output from the following commands?

```shell
which kubectl
ls -l $(which kubectl)
ls -l ~/.kube/config
```
Also, please check whether the following command works.

```shell
kubectl --kubeconfig /etc/rancher/k3s/k3s.yaml get pods --all-namespaces
```
Some additional info. The AWX environment still functions although very slowly and crashes intermittently. I am hoping this is something simple.
That is not usual. The OS may be under heavy load, storage may be strained, or there may be other causes. Do the `top`, `free`, and `df` commands show any abnormalities in resource usage?
Does the situation remain the same after restarting the OS?
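If helpful, the three checks above can be captured non-interactively in one pass (standard Linux commands; the first two are guarded in case `procps` is not installed on a minimal system):

```shell
# One-shot, non-interactive snapshot of load, memory, and disk usage
command -v top  >/dev/null && top -b -n 1 | head -n 5 || true   # load average and CPU/memory summary
command -v free >/dev/null && free -h || true                   # memory and swap usage
df -h                                                           # per-filesystem disk usage
```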
Hello, please see my output below and thank you.
```
ansible ~]# ls -l `which kubectl`
-rwxr-xr-x. 1 root root 45015040 Nov 24 2022 /usr/local/bin/kubectl
ansible ~]# ls -l $(which kubectl)
-rwxr-xr-x. 1 root root 45015040 Nov 24 2022 /usr/local/bin/kubectl
ansible ~]# ls -l ~/.kube/config
-rw-------. 1 root root 2980 Nov 24 2022 /root/.kube/config
```
This is promising in comparison
```
ansible ~]# kubectl --kubeconfig /etc/rancher/k3s/k3s.yaml get pods --all-namespaces
NAMESPACE     NAME                                              READY   STATUS                   RESTARTS       AGE
kube-system   helm-install-traefik-crd-6vcw6                    0/1     Completed                0              522d
kube-system   helm-install-traefik-9df9g                        0/1     Completed                1              522d
awx           awx-task-5fbddc54d7-t662w                         0/4     ContainerStatusUnknown   8 (156d ago)   313d
awx           awx-task-5fbddc54d7-rk9qn                         0/4     ContainerStatusUnknown   4              153d
awx           awx-web-f89895997-mxpr6                           0/3     ContainerStatusUnknown   7 (156d ago)   313d
awx           awx-operator-controller-manager-d5c594f54-z5nbz   0/2     ContainerStatusUnknown   6 (151d ago)   313d
kube-system   local-path-provisioner-79f67d76f8-nkqtk           1/1     Running                  2 (156d ago)   522d
kube-system   svclb-traefik-27171d22-jgtlq                      2/2     Running                  0              151d
awx           awx-postgres-13-0                                 1/1     Running                  0              151d
awx           awx-task-5fbddc54d7-7t5c6                         4/4     Running                  0              151d
awx           awx-web-f89895997-7gbv2                           3/3     Running                  0              151d
kube-system   coredns-597584b69b-vflzd                          1/1     Running                  2 (156d ago)   522d
kube-system   traefik-bb69b68cd-8pt7r                           1/1     Running                  2 (156d ago)   522d
awx           awx-operator-controller-manager-d5c594f54-2trtw   2/2     Running                  14 (13d ago)   151d
kube-system   metrics-server-5c8978b444-6fd22                   1/1     Running                  2 (156d ago)   522d
```
The system is not under heavy load, and the "kubectl Unauthorized" issue remains after a reboot. Could it be my .kube/config file?
Thank you
Hello again,
As a test, I backed up the .kube/config file, then took the certificate data from the /etc/rancher/k3s/k3s.yaml file and replaced the .kube/config certificate data with it. Now my kubectl commands work fine again. That seems to have fixed the issue, although I am not sure whether this is an appropriate fix.
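For anyone following along, the workaround described above amounts to roughly the following (paths assume a default K3s install, and the backup step mirrors what was done here):

```shell
# Keep a backup of the old kubeconfig before overwriting it
cp ~/.kube/config ~/.kube/config.bak

# Use the kubeconfig generated by K3s, which carries current certificate data
cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
chmod 600 ~/.kube/config   # the kubeconfig holds credentials; keep it private
```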
Thank you
@09cicada Hi, thanks for updating.
> I then took the certificate data from /etc/rancher/k3s/k3s.yaml file and replaced the .kube/config certificate data with it.
This is almost the correct approach.
Technical background:

`kubectl` in K3s is just a symbolic link. According to your logs, your `kubectl` is not a symbolic link, so I guess your `kubectl` came from outside of K3s. Perhaps you (or another user) installed `kubectl` separately from K3s.

```
$ ls -l $(which kubectl)
lrwxrwxrwx. 1 root root 3 Apr 25 00:09 /usr/local/bin/kubectl -> k3s
```

`kubectl` in K3s is implemented to use /etc/rancher/k3s/k3s.yaml as its kubeconfig file by default, but a separately installed `kubectl` uses ~/.kube/config by default. This is why your `kubectl` does not work by default but works with the --kubeconfig /etc/rancher/k3s/k3s.yaml option.

To fix this, you can either:

- replace /usr/local/bin/kubectl with a symbolic link to the k3s binary, or
- copy /etc/rancher/k3s/k3s.yaml as ~/.kube/config
This is the officially provided method by K3s: https://docs.k3s.io/cluster-access#accessing-the-cluster-from-outside-with-kubectl

@kurokobo, Excellent, thank you much.
In the past kubectl worked so I must have updated or installed something along the way to cause the issue.
I think I will opt for the symbolic link method. I was also able to delete the rogue awx pods after you helped me to fix kubectl.
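A minimal sketch of the symbolic-link method, assuming the k3s binary sits in /usr/local/bin as in a default install:

```shell
# Replace the standalone kubectl with a link to the k3s multi-call binary;
# when invoked as "kubectl", k3s reads /etc/rancher/k3s/k3s.yaml by default
sudo rm /usr/local/bin/kubectl
sudo ln -s k3s /usr/local/bin/kubectl
```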
One last question, considering that we do not have a full Kubernetes environment: in your opinion, is it possible to run a full production AWX instance supporting 1000 or so hosts on K3s, assuming we throw enough CPU/RAM at the underlying host?
Thank you
@09cicada
Also, appending `export KUBECONFIG=/etc/rancher/k3s/k3s.yaml` to your .bashrc may be a possible solution for you. Refer to the details about kubeconfig files in the official Kubernetes docs: https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/
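For example (the path below is the default K3s kubeconfig location; adjust it if your install differs):

```shell
# Point kubectl at the K3s kubeconfig for all future shells
echo 'export KUBECONFIG=/etc/rancher/k3s/k3s.yaml' >> ~/.bashrc
. ~/.bashrc   # apply it to the current shell as well
```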
> One last question, considering that we do not have a full Kubernetes environment. In your opinion, is running a full production AWX instance supporting 1000 or so hosts on K3s possible assuming we throw enough CPU/RAM at the underlying host?
I have no concerns about choosing K3s, but for AWX, it depends. The frequency of job execution is a more demanding factor than the total number of hosts in the inventory. If jobs are not executed very frequently, a single-node K3s may well work given sufficient compute resources; if they are executed frequently, it would be more stable to increase the number of replicas of the task pod in a multi-node K3s cluster.
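If you do scale out, the task replica count lives on the AWX custom resource; a hypothetical one-liner (the `task_replicas` field exists in recent awx-operator releases, so verify it against your operator version's documentation, and the resource/namespace names here assume the common `awx`/`awx` setup):

```shell
# Patch the AWX resource named "awx" in the "awx" namespace to run two task pods
kubectl -n awx patch awx awx --type merge -p '{"spec": {"task_replicas": 2}}'
```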
Environment
K3S version: v1.25.4+k3s1 (0dc63334)
Description
Hello Mr. Kurokobo. I have an odd issue in that I can no longer manage the AWX namespace. When I run kubectl, I get this error:

```
error: You must be logged in to the server (Unauthorized)
```
Step to Reproduce
Any attempt to run `kubectl -n awx get all` results in the error, and any attempt to pull pod logs with kubectl results in the same error. I have read a few links that point to certificate expiry, but I am not sure how to troubleshoot.
Some additional info. The AWX environment still functions although very slowly and crashes intermittently. I am hoping this is something simple.
Thank you
Logs
Files