tinkerbell / playground

Example deployments of the Tinkerbell Stack for use as playground environments
Apache License 2.0

STATE_PENDING with kubernetes (helm) sandbox #178

Closed howard-yeh closed 8 months ago

howard-yeh commented 8 months ago

I tried to provision one x86 server with a Tinkerbell host. I tried several tink image tag versions, but the workflow always stays stuck in the "STATE_PENDING" state. On the Tinkerbell host, the workflow shows:

NAME                 TEMPLATE                        STATE
levono-x86-workflow  ubuntu-focal-sda-x86-testing   STATE_PENDING

Please, I need some help solving this problem. 🙏🏼

Environment

TINKERBELL HOST OS: Ubuntu 20.04

The Tinkerbell host and the x86 server are in the same network segment (192.168.x.x/16).

I followed the steps in the Helm chart (Kubernetes): https://github.com/tinkerbell/charts

Current Behaviour

1. Deploy Tinkerbell components

kubectl get pod -n tink-system

NAME                               READY       STATUS     RESTARTS      AGE
boots-6857b44df6-f52st             1/1         Running      0           48m
hegel-858b7b4d8d-6tfft             1/1         Running      0           48m
rufio-768b8c6584-jzvpg             1/1         Running      0           48m
tink-controller-6cb5c85458-wh7fb   1/1         Running      0           48m
tink-server-56c45857bf-qc8rg       1/1         Running      0           48m
tink-stack-7fcd6546f8-mqw67        1/1         Running      0           48m
tink-stack-relay-65cfdf9bbb-n4989  1/1         Running      0           48m

2. Set the x86 server to boot in iPXE mode and reboot it.

3. Check the workflow:

NAME                 TEMPLATE                        STATE
levono-x86-workflow  ubuntu-focal-sda-x86-testing   STATE_PENDING
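
To pick out stuck workflows from a longer listing, the table above can be filtered on its STATE column. This is only a minimal sketch: `pending_workflows` is a hypothetical helper name, and it assumes the three-column layout (NAME, TEMPLATE, STATE) shown above.

```shell
#!/bin/sh
# pending_workflows: hypothetical helper, not part of Tinkerbell.
# Reads a workflow table (header line, then NAME TEMPLATE STATE rows)
# on stdin and prints the NAME of every row whose STATE is STATE_PENDING.
pending_workflows() {
  # NR > 1 skips the header row; $3 is the STATE column.
  awk 'NR > 1 && $3 == "STATE_PENDING" { print $1 }'
}
```

Assuming the workflow lives in the `tink-system` namespace used for the pods above, it could be fed with `kubectl get workflow -n tink-system | pending_workflows`.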

4. Hook was fetched successfully (boots):

ipxe: net0: 192.168.154.65/255.255.0.0 gw 192.168.0.1
ipxe: Filename: http://192.168.154.63/auto.ipxe
ipxe: auto.ipxe : 611 bytes [script]
ipxe: http://192.168.154.63/auto.ipxe... ok
ipxe: Loading the Tinkerbell Hook iPXE script...
ipxe: http://192.168.154.71:8000/vmlinuz-x86_64...... ok
ipxe: http://192.168.154.71:8000/initramfs-x86_64... ok

5. The x86 server entered LinuxKit:

Welcome to LinuxKit!
NOTE: This system is namespaced.
The namespace you are currently in may not be the root.
System services are namespaced: to access, use `ctr -n services.linuxkit ...`
(ns: getty) machine1:~# [   37.799935] IPVS: ftp: loaded support on port[0] = 21
[   37.817233] Initializing XFRM netlink socket
[    38.875169] ICMPv6: process `dhcpd' is using deprecated sysctl (syscall) net.ipv6.neigh.eth4.retrans_time - use net.ipv6.neigh.eth4.retrans_time_ms instead
**(Stuck here)**

Possible Solution

Echok3 commented 8 months ago

Have you ever tried docker-compose from sandbox?

howard-yeh commented 8 months ago

> Have you ever tried docker-compose from sandbox?

Nope. I only tried the Kubernetes approach.

howard-yeh commented 8 months ago

After tracing the code and logs, I discovered that my tinkerbell-client was unable to pull the tink-worker image via docker. The cause was the air-gapped environment, which does not allow connections to the public network. As a workaround, I manually modified the /etc/hosts file using 'docker-shell'.

Question: Is it possible to add a command to the docker-hook that runs before the image is pulled, such as one that modifies the /etc/hosts file?

Reference: https://github.com/tinkerbell/hook
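
The manual workaround described here amounts to appending a hosts entry before the image pull. A minimal sketch of doing that idempotently is below. It is not a Hook feature: `add_host_entry` is a hypothetical helper, and the IP address and registry hostname in the usage note are placeholders for whatever the air-gapped network actually uses.

```shell
#!/bin/sh
# add_host_entry FILE IP HOSTNAME
# Hypothetical helper: appends "IP HOSTNAME" to FILE unless HOSTNAME is
# already mapped on a non-comment line, so repeated runs add no duplicates.
add_host_entry() {
  hosts_file=$1
  ip=$2
  name=$3
  # Scan fields 2..NF of every non-comment line for an exact hostname match.
  if awk -v n="$name" '
       $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
       END { exit !found }
     ' "$hosts_file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}
```

A call such as `add_host_entry /etc/hosts 192.0.2.10 registry.example.internal` (both values are placeholders) would then be safe to run on every boot.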

jacobweinstock commented 8 months ago

Hey @howard-yeh. I don't know if you've already done this, but in the Helm chart you can change the image used for the tink worker via the smee.tinkWorkerImage value (ref.).
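
In a Helm values file, the dotted path smee.tinkWorkerImage corresponds to a `tinkWorkerImage` key nested under `smee`. The sketch below writes such an override file. `write_worker_override` is a hypothetical helper, and the image reference passed to it is an assumption standing in for a mirror reachable from the air-gapped network.

```shell
#!/bin/sh
# write_worker_override FILE IMAGE
# Hypothetical helper: writes a Helm values override that sets
# smee.tinkWorkerImage to IMAGE (for example, an internal registry mirror).
write_worker_override() {
  cat > "$1" <<EOF
smee:
  tinkWorkerImage: $2
EOF
}
```

After `write_worker_override worker-override.yaml registry.example.internal/tink-worker:TAG` (placeholder image), the file could be passed to Helm with `-f`, for example `helm upgrade <release> <chart> -n tink-system --reuse-values -f worker-override.yaml`, where the release name and chart reference are whatever was used at install time.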

chrisdoherty4 commented 8 months ago

No action needed.