ChameleonCloud / chi-in-a-box

Packaging the systems and operations of the Chameleon testbed
Apache License 2.0

How to add a Zun compute node #236

Open samiemostafavi opened 2 years ago

samiemostafavi commented 2 years ago

Hi,

I am running chi-in-a-box on Ubuntu 20.04, with zun_compute_k8s running against a k3s service. How can I add a worker node (a Zun compute node) to run the containers? The worker nodes are Dell servers located inside the testbed network, connected to both the public network and the internal network.

Is this document the way to go? https://docs.openstack.org/zun/latest/install/compute-install.html

I also found some documents on running the k3s service on the worker side: https://rancher.com/docs/k3s/latest/en/quick-start/

Should I run both of them?

Apart from that, an entry must be created in Blazar so that the node is reservable by users. Are there instructions for that as well?

Best, Samie

msherman64 commented 2 years ago

For chi@edge v2, the compute nodes are Kubernetes nodes, so you only need to follow the k3s instructions to add a worker node: https://rancher.com/docs/k3s/latest/en/quick-start/
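As a rough sketch of what the quick-start's join step looks like, the helper below only prints the agent install command rather than running it. The function name k3s_agent_join_cmd and the values myserver and mynodetoken are placeholders for illustration, not part of chi-in-a-box; per the k3s quick-start, the token is read from /var/lib/rancher/k3s/server/node-token on the server.

```shell
# Sketch only: prints the k3s agent install command instead of running it.
# k3s_agent_join_cmd is a hypothetical helper name, not part of chi-in-a-box.
# $1: server URL (placeholder), e.g. https://myserver:6443
# $2: node token (placeholder), stored on the k3s server at
#     /var/lib/rancher/k3s/server/node-token
k3s_agent_join_cmd() {
    server_url="$1"
    node_token="$2"
    printf 'curl -sfL https://get.k3s.io | K3S_URL=%s K3S_TOKEN=%s sh -\n' \
        "$server_url" "$node_token"
}

# Example with placeholder values; review the printed line before running
# it on the worker node.
k3s_agent_join_cmd "https://myserver:6443" "mynodetoken"
```

Printing the command first makes it easy to check the server URL and token before anything is installed on the worker.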

As this is still an early access version, the docs for blazar aren't quite there yet. You can allow launching containers without a reservation by doing the following:

samiemostafavi commented 2 years ago

Thanks for your answer. I added allow_without_reservation = True and confirmed that it is set in /etc/zun/zun.conf inside the zun_compute_k8s container.
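For anyone repeating this check, here is a minimal sketch of how the setting could be verified from a copy of the file. The helper name conf_has_setting is hypothetical, and the docker exec line in the comment assumes the service containers are managed by Docker, which may not match every deployment.

```shell
# Sketch: succeed (exit 0) if KEY is set to VALUE in an INI-style file.
# conf_has_setting is a hypothetical helper, not part of chi-in-a-box.
conf_has_setting() {
    file="$1"
    key="$2"
    value="$3"
    # Match "key = value" with optional surrounding whitespace.
    grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*${value}[[:space:]]*$" "$file"
}

# Possible usage (assumes Docker manages the zun_compute_k8s container):
#   docker exec zun_compute_k8s cat /etc/zun/zun.conf > /tmp/zun.conf
#   conf_has_setting /tmp/zun.conf allow_without_reservation True && echo "set"
```

The helper does not check which section of the file the option appears in.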

I can also see that the worker node has registered with the cluster:

kubectl get nodes -o wide
NAME      STATUS   ROLES                  AGE   VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
edge-mv   Ready    control-plane,master   46h   v1.22.5+k3s1   192.168.9.3   10.0.87.20    Ubuntu 20.04.5 LTS   5.4.0-126-generic   containerd://1.5.8-k3s1
client    Ready    <none>                 65m   v1.24.6+k3s1   192.168.2.2   <none>        Ubuntu 20.04.5 LTS   5.4.0-126-generic   containerd://1.6.8-k3s1

However, when I create a container in the web portal, it fails with the following:

Status: Error
Status Reason: There are not enough hosts available.

Any idea or solution?

P.S. I am suspicious of the Kubernetes IPs. How should INTERNAL-IP and EXTERNAL-IP be set with respect to OpenStack's public and internal networks? Does it matter?

Best, Samie