Mawwlle closed this issue 3 years ago.
The namespace of the controllers has changed since then; sorry that the docs weren't updated yet, we are working on that right now. The right namespace is capm3-system. See the example output:
$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
capi-kubeadm-bootstrap-system capi-kubeadm-bootstrap-controller-manager-6b6579d56d-7l4pl 2/2 Running 0 3d1h
capi-kubeadm-control-plane-system capi-kubeadm-control-plane-controller-manager-6d878bb599-r77cc 2/2 Running 0 3d1h
capi-system capi-controller-manager-7ff4999d6c-jmjk9 2/2 Running 0 3d1h
capi-webhook-system capi-controller-manager-6c48f8f9bb-n84dz 2/2 Running 0 3d1h
capi-webhook-system capi-kubeadm-bootstrap-controller-manager-56f98bc7f9-6xjgf 2/2 Running 0 3d1h
capi-webhook-system capi-kubeadm-control-plane-controller-manager-85bcfd7fcd-k4j59 2/2 Running 0 3d1h
capi-webhook-system capm3-controller-manager-7695bc4f6d-lr667 2/2 Running 0 3d1h
capi-webhook-system capm3-ipam-controller-manager-6b6f6d44d7-gxks4 2/2 Running 0 3d1h
capm3-system capm3-baremetal-operator-controller-manager-844bc955dc-k6f8w 2/2 Running 0 3d1h
capm3-system capm3-controller-manager-75c7b8fcc8-ngc2f 2/2 Running 0 3d1h
capm3-system capm3-ipam-controller-manager-77d89bfc98-bnsgg 2/2 Running 0 3d1h
cert-manager cert-manager-cainjector-fc6c787db-bwt9w 1/1 Running 0 3d1h
cert-manager cert-manager-d994d94d7-gfn2d 1/1 Running 0 3d1h
cert-manager cert-manager-webhook-845d9df8bf-q2dqw 1/1 Running 0 3d1h
kube-system coredns-f9fd979d6-fm9ss 1/1 Running 0 3d1h
kube-system coredns-f9fd979d6-wnxj2 1/1 Running 0 3d1h
kube-system etcd-kind-control-plane 1/1 Running 0 3d1h
kube-system kindnet-qgwbc 1/1 Running 0 3d1h
kube-system kube-apiserver-kind-control-plane 1/1 Running 0 3d1h
kube-system kube-controller-manager-kind-control-plane 1/1 Running 0 3d1h
kube-system kube-proxy-9c2nh 1/1 Running 0 3d1h
kube-system kube-scheduler-kind-control-plane 1/1 Running 0 3d1h
local-path-storage local-path-provisioner-78776bfc44-wk8vn 1/1 Running 0 3d1h
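For completeness, a quick way to check just the CAPM3 controllers in the new namespace (a sketch; pod names and hashes will differ on every setup):
$ kubectl get pods -n capm3-system
$ kubectl get deployments -n capm3-system   # the same controllers, viewed as deployments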
Can you please paste the output of kubectl get machine -A?
kubectl get machine -A
However, after running the "kubectl get baremetalhosts -n metal3" command, this is what was expected:
But this is what we get:
P.S. While I was writing this, the worker node started working.
But now I have two questions:
1. Is it okay that I get this message when I try to ssh: "metal3@192.168.111.249: Permission denied (publickey, gssapi-keyex, gssapi-with-mic)"?
2. Could these problems be due to Docker's new policy? If so, do you know how this can be configured?
I think you can't ssh into the node until it is provisioned. Can you please check whether there are any issues in the console output?
sudo virsh console node_0
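A short sketch around that command, in case it helps (sudo virsh is standard libvirt tooling; node_0 and node_1 are the default metal3-dev-env domain names):
$ sudo virsh list --all       # confirm the libvirt domain names and their state
$ sudo virsh console node_0   # attach to the serial console; detach again with Ctrl+]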
Log output after sudo virsh console node_0: log.txt
test1-7fcn2 login:
What should I enter at this prompt? (It appeared while the command was running; I entered a random name and everything froze.)
After that my nodes were suspended. How can I start them again?
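For reference, paused or shut-off libvirt domains can usually be brought back with virsh; a minimal sketch, assuming the default metal3-dev-env domain names:
$ sudo virsh list --all      # check whether the domains show as paused or shut off
$ sudo virsh resume node_0   # if a domain is paused
$ sudo virsh start node_0    # if a domain is shut off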
btw, thanks for the logs, but I didn't see anything in there that would indicate the reason for your issue.
By default, we don't set a username & password for the target nodes in the Metal3-dev-env scripts. ssh should work, because your host's ssh key is injected into the target nodes. It seems there are some other issues.
What is the current PROVISIONING_STATUS of your BareMetalHosts? If it is not Provisioned:
- can you check the Baremetal Operator (pod) logs?
- can you check the Ironic node status? For that you first need to export CONTAINER_RUNTIME=podman if you are running on CentOS, then run baremetal node list (both checks are sketched below).
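A minimal sketch of those two checks, assuming the Baremetal Operator deployment name from the pod listing earlier in this thread and that its main container is called manager (verify both with kubectl get pods -n capm3-system first):
$ kubectl -n capm3-system logs deploy/capm3-baremetal-operator-controller-manager -c manager --tail=100
$ export CONTAINER_RUNTIME=podman   # only needed on CentOS, as noted above
$ baremetal node list               # shows Power State and Provisioning State for each node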
I am attaching as many different logs as possible so that you have more information; thanks for your responsiveness.
log(pod).txt
bm_node_list.txt (the Provisioning State "wait call-back" lasts a very long time)
some_commands_inf.txt
Also, a common mistake is lack of free space on the device. We have 200 gigabytes of storage; as I understand it, we are running out of inode space. How can this be fixed?
Thanks for the logs! Pasting some outputs here for visibility.
$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
capi-kubeadm-bootstrap-system capi-kubeadm-bootstrap-controller-manager-7ffb7c9d77-rc8sf 2/2 Running 0 31m
capi-kubeadm-control-plane-system capi-kubeadm-control-plane-controller-manager-5b8cf46bb6-8dmpq 2/2 Running 0 31m
capi-system capi-controller-manager-559db48f6-zhhkn 2/2 Running 0 31m
capi-webhook-system capi-controller-manager-76d9b5889c-c5kpm 2/2 Running 0 31m
capi-webhook-system capi-kubeadm-bootstrap-controller-manager-787cb85f58-lqfxd 2/2 Running 0 31m
capi-webhook-system capi-kubeadm-control-plane-controller-manager-86c44777c5-blwnm 2/2 Running 0 31m
capi-webhook-system capm3-controller-manager-7695bc4f6d-s2mgp 2/2 Running 0 31m
capi-webhook-system capm3-ipam-controller-manager-6b6f6d44d7-bstlr 2/2 Running 0 31m
capm3-system capm3-baremetal-operator-controller-manager-844bc955dc-mt4h6 2/2 Running 0 31m
capm3-system capm3-controller-manager-75c7b8fcc8-s6zpg 2/2 Running 0 31m
capm3-system capm3-ipam-controller-manager-77d89bfc98-p7fss 2/2 Running 0 31m
cert-manager cert-manager-cainjector-fc6c787db-qh9rw 1/1 Running 0 31m
cert-manager cert-manager-d994d94d7-b8p6k 1/1 Running 0 31m
cert-manager cert-manager-webhook-845d9df8bf-rrmgm 1/1 Running 0 31m
kube-system coredns-f9fd979d6-qj6mn 1/1 Running 0 33m
kube-system etcd-minikube 1/1 Running 1 30m
kube-system kube-apiserver-minikube 1/1 Running 1 30m
kube-system kube-controller-manager-minikube 1/1 Running 1 30m
kube-system kube-proxy-mhbhf 1/1 Running 0 33m
kube-system kube-scheduler-minikube 1/1 Running 1 31m
kube-system storage-provisioner 1/1 Running 1 33m
metal3 metal3-ironic-6fbb965956-sgtcx 9/9 Running 0 30m
$ baremetal node list
+--------------------------------------+--------+--------------------------------------+-------------+--------------------+-------------+
| UUID | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------+--------------------------------------+-------------+--------------------+-------------+
| 3c757597-07ae-48ea-af03-fdbf4a07a09d | node-0 | 20ff305c-d488-46ff-a835-8455d375da80 | power on | active | False |
| 46e553ca-5cb0-44af-b3cd-1bb5aab8ad3c | node-1 | af3f80d1-74ec-4458-9059-1e3fed4ba7fe | power on | wait call-back | False |
+--------------------------------------+--------+--------------------------------------+-------------+--------------------+-------------+
$ kubectl get bmh -n metal3
NAME PROVISIONING_STATUS CONSUMER ONLINE ERROR
node-0 provisioned test1-controlplane-hl2kz true
node-1 provisioning test1-workers-jgzl7 true
$ kubectl get machine -A
NAMESPACE NAME PROVIDERID PHASE VERSION
metal3 test1-759cfc77c5-nbnpj Provisioning v1.18.8
metal3 test1-d54lk metal3://20ff305c-d488-46ff-a835-8455d375da80 Running v1.18.8
$ sudo virsh net-dhcp-leases baremetal
Expiry Time MAC address Protocol IP address Hostname Client ID or DUID
-----------------------------------------------------------------------------------------------------------
2020-12-09 13:15:11 00:be:62:08:82:10 ipv4 192.168.111.20/24 node-0 01:00:be:62:08:82:10
2020-12-09 13:12:39 00:be:62:08:82:14 ipv4 192.168.111.21/24 node-1 01:00:be:62:08:82:14
2020-12-09 13:13:05 52:54:00:1e:22:b4 ipv4 192.168.111.59/24 minikube 01:52:54:00:1e:22:b4
I see some error output in your logs related to disk space: go: creating work dir: mkdir /tmp/go-build164395946: no space left on device. I assume this is coming from the host where you are running Metal3-dev-env. Can you please check whether disk space usage is close to 100%? In our CI, we create VMs with 100 GB of disk space.
Also, I see that Ironic node-1 is in wait call-back, which indicates that the Ironic conductor is waiting for the ramdisk to boot. See the Ironic state machine: https://docs.openstack.org/ironic/rocky/contributor/states.html
My assumption is that your host is running out of space, but that would be clearer if you could check the host's disk usage.
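A minimal sketch of these checks (plain coreutils plus standard Ironic CLI fields; the node name is the one shown by baremetal node list above):
$ df -h /    # block usage; Use% close to 100% means the disk is full
$ df -i /    # inode usage; IUse% close to 100% means inodes are exhausted
$ baremetal node show node-1 -f value -c provision_state -c power_state -c last_error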
Thanks, it helped!
System: 6 CPU cores, 20 GB RAM, CentOS 8 (default Metal3 architecture).
We need to deploy a standard Metal3 environment.
However, when creating a cluster, the worker node is not created, and if you run this command you get "capm3 do not exist in namespaces".
What could be the reason?
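For anyone hitting the same message, a quick way to confirm which Metal3-related namespaces actually exist on the cluster (a sketch; as shown at the top of this thread, the CAPM3 controllers now live in capm3-system):
$ kubectl get namespaces | grep -i -e capm3 -e metal3
$ kubectl get pods -n capm3-system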