tinkerbell / playground

Example deployments of the Tinkerbell Stack for use as playground environments
Apache License 2.0

Failed with docker-compose sandbox #167

Closed: gernotstarke closed this 8 months ago

gernotstarke commented 1 year ago

I want to provision a few bare-metal HP desktop machines with tinkerbell.

Expected Behaviour

I follow the steps in the docker-compose sandbox, and will afterwards have a machine with a provisioned OS (Ubuntu) on it.

https://github.com/tinkerbell/sandbox/blob/main/docs/quickstarts/COMPOSE.md

Current Behaviour

All containers are started, exactly as in the COMPOSE.md expected-output-step-4.

After that, nothing happens.

Possible Solution

In the hardware.yaml, certain values remain unexplained:

  1. $TINKERBELL_CLIENT_GW: is this the gateway to the "outer" network?
  2. How am I supposed to know the TINKERBELL_CLIENT_IP? The (Tink) DHCP server is supposed to assign one to that machine...
  3. metadata, facility?
  4. Can I modify the hardware.yaml while the Tink-containers are running? How do I add a second machine?

All in all, the documentation left too many questions open for me. Sorry to say that.
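
The `$...` placeholders in hardware.yaml look like shell-style variables. The following is a minimal sketch, assuming they are filled in from user-supplied values before the manifest is applied; how the sandbox itself performs this step is not stated in this thread. The helper name `render_manifest`, the shortened manifest fragment, and all example values (including `/dev/sda`) are hypothetical. It renders the placeholders and reports which ones are still unset:

```python
import re
from string import Template

# Matches $NAME or ${NAME} placeholders that are still present in the text.
PLACEHOLDER = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?")


def render_manifest(text, values):
    """Fill $VAR placeholders and return (rendered_text, names_left_unset)."""
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    rendered = Template(text).safe_substitute(values)
    unset = sorted(set(PLACEHOLDER.findall(rendered)))
    return rendered, unset


# Shortened, illustrative fragment; not the full hardware.yaml from the sandbox.
manifest = """\
spec:
  disks:
    - device: $DISK_DEVICE
  interfaces:
    - dhcp:
        ip:
          address: $TINKERBELL_CLIENT_IP
          gateway: $TINKERBELL_CLIENT_GW
        mac: "$TINKERBELL_CLIENT_MAC"
"""

# Example values only; replace them with the ones for your own network.
values = {
    "DISK_DEVICE": "/dev/sda",
    "TINKERBELL_CLIENT_IP": "192.168.178.230",
    "TINKERBELL_CLIENT_MAC": "ec:8e:b5:77:9e:2d",
}

rendered, unset = render_manifest(manifest, values)
print(rendered)
print("still unset:", unset)
```

Because `TINKERBELL_CLIENT_GW` is missing from `values`, it is reported as unset and left verbatim in the rendered text, which makes a forgotten variable easy to spot before applying the manifest.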

Context

I wanted to try out the most basic step, install Ubuntu on a bare metal machine. Failed.

Your Environment

docker-compose running on MacOS (the provisioner).

5-port switch connected over USB to the provisioner. Provisioner connected over a second ethernet to DSL router/internet.

HP EliteDesk mini PCs connected to the switch. Boot order (in BIOS) set to 1. Netboot, 2. USB, 3. SSD

The DHCP scan on the HP starts, but Tinkerbell does not provide an IP address to the client.

apiVersion: "tinkerbell.org/v1alpha1"
kind: Hardware
metadata:
  name: machine1
spec:
  disks:
    - device: $DISK_DEVICE
  metadata:
    facility:
      facility_code: sandbox
    instance:
      hostname: "machine1"
      id: "$TINKERBELL_CLIENT_MAC"
      operating_system:
        distro: "ubuntu"
        os_slug: "ubuntu_20_04"
        version: "20.04"
  interfaces:
    - dhcp:
        arch: x86_64
        hostname: machine1
        ip:
          address: 192.168.178.230
          gateway: $TINKERBELL_CLIENT_GW
          netmask: 255.255.255.0
        lease_time: 86400
#        mac: "FC:3F:DB:05:70:CD"
        mac: "EC:8E:B5:77:9E:2D"
        name_servers: []
        uefi: false
      netboot:
        allowPXE: true
        allowWorkflow: true

jacobweinstock commented 1 year ago

Hey @gernotstarke, thank you for trying out Tinkerbell and posting your experience here! On the surface the config all looks like it should work properly. Docker on MacOS is actually not supported though. The way Docker works on MacOS means that we won't be able to see broadcast traffic on the network. You will need a machine that has a network interface on the same layer 2 as the machine(s) to provision.
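
To check whether broadcast DHCP traffic from the client reaches a given host at all, one option is to look for DHCPDISCOVER packets on UDP port 67. The following is a minimal sketch, not part of Tinkerbell: the function names `parse_dhcp` and `listen` and the sample packet are hypothetical, and the field offsets follow the standard BOOTP/DHCP packet layout.

```python
import socket

DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"
MESSAGE_TYPES = {1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 5: "ACK"}


def parse_dhcp(packet):
    """Return (client_mac, message_type), or None if the packet is not DHCP."""
    # The fixed BOOTP header is 236 bytes, followed by the 4-byte magic cookie.
    if len(packet) < 240 or packet[236:240] != DHCP_MAGIC_COOKIE:
        return None
    hlen = min(packet[2], 16)  # hardware address length (6 for Ethernet)
    mac = ":".join(f"{b:02x}" for b in packet[28:28 + hlen])
    msg_type = None
    i = 240
    while i < len(packet):
        code = packet[i]
        if code == 255:  # end option
            break
        if code == 0:  # pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(packet):
            break
        length = packet[i + 1]
        if code == 53 and length == 1 and i + 2 < len(packet):
            # Option 53 carries the DHCP message type.
            msg_type = MESSAGE_TYPES.get(packet[i + 2], str(packet[i + 2]))
        i += 2 + length
    return mac, msg_type


def listen(port=67):
    """Print every DHCP packet seen on the UDP port (port 67 usually needs root)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    while True:
        data, addr = sock.recvfrom(2048)
        parsed = parse_dhcp(data)
        if parsed:
            print(addr, *parsed)


# Hand-built DHCPDISCOVER used only to exercise the parser.
sample = (
    bytes([1, 1, 6, 0])                           # op, htype, hlen, hops
    + bytes(24)                                   # xid, secs, flags, four IP fields
    + bytes.fromhex("ec8eb5779e2d") + bytes(10)   # chaddr, padded to 16 bytes
    + bytes(192)                                  # sname (64) and file (128)
    + DHCP_MAGIC_COOKIE
    + bytes([53, 1, 1, 255])                      # option 53 = DISCOVER, then end
)
print(parse_dhcp(sample))
```

Calling `listen()` on the provisioner while the client attempts to netboot should print a DISCOVER line for the client's MAC if the two share a layer 2 segment. If nothing appears, the broadcasts are probably not reaching that host. The port may already be in use if the Tinkerbell DHCP service is bound to it.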

I can think of a few possible options here. You could create a VM on your macOS machine with a network interface bridged to your layer 2, then set up the Tinkerbell stack on it. Alternatively, if you have another machine that is directly connected to the same layer 2, you could use that.

Again, thanks for trying Tinkerbell out! Apologies for the missing and outdated documentation. We definitely have some very large gaps that need to be filled. We are trying our best to get our documentation updated. Thanks for your patience!

gernotstarke commented 1 year ago

Thanks, @jacobweinstock, for your answer.

pedroalvesbatista commented 1 year ago

Same situation here. I tried the compose stack, and when running KUBECONFIG=./state/kube/kubeconfig.yaml kubectl get -n tink-system workflow sandbox-workflow --watch, the tink-system namespace doesn't exist, and the machine can't boot at all.

I'm using Debian 11 with the latest Docker installed, as well as kubectl and minikube. I will try the Vagrant version to see if I have any luck.

Edit: Even when reproducing the Vagrant example, I got sh: 1: tink: not found when running the Postgres background command to see the workflow outputs. Tink does not seem to be installed either.

billbongo commented 1 year ago

Hello, same behaviour as @pedroalvesbatista for me. I provisioned an Ubuntu 22.04 VM (vCenter) with git, docker, docker-compose, and kubectl installed. Network flow is OK for DHCP/PXE when trying to bootstrap a bare-metal HP server. The client IP has been allocated, as I'm able to ping the new "host" after 10 to 15 minutes, but there is no further activity. I tried rebooting the host, but the provisioning phases just restart in a loop.

Only the 3 mandatory vars were provided: TINKERBELL_CLIENT_IP, TINKERBELL_CLIENT_MAC, and TINKERBELL_HOST_IP.

I also tried filling in hardware.yaml with the previous vars plus $TINKERBELL_CLIENT_GW and the resolvers used on my side.

I do not have the tink-system namespace, so I cannot troubleshoot further.

sandbox/deploy/stack/compose# KUBECONFIG=./state/kube/kubeconfig.yaml kubectl get ns
NAME              STATUS   AGE
default           Active   2d
kube-system       Active   2d
kube-public       Active   2d
kube-node-lease   Active   2d
sandbox/deploy/stack/compose# KUBECONFIG=./state/kube/kubeconfig.yaml kubectl get pods -A 
NAMESPACE     NAME                      READY   STATUS        RESTARTS   AGE
kube-system   coredns-b96499967-jmppg   1/1     Terminating   0          2d
kube-system   coredns-b96499967-cgcqk   1/1     Running       0          2d
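
The check for the missing namespace can be scripted. A minimal sketch, assuming kubectl is on the PATH and the kubeconfig sits at the path used in the commands above; the helper names `namespaces_from_output` and `has_namespace` are hypothetical:

```python
import subprocess


def namespaces_from_output(output):
    """Extract namespace names from the text printed by `kubectl get ns`."""
    rows = [line.split() for line in output.strip().splitlines()]
    # Skip the header row (NAME STATUS AGE) and any blank lines.
    return [cols[0] for cols in rows[1:] if cols]


def has_namespace(name, kubeconfig="./state/kube/kubeconfig.yaml"):
    """Run kubectl against the given kubeconfig and report whether `name` exists."""
    result = subprocess.run(
        ["kubectl", "--kubeconfig", kubeconfig, "get", "ns"],
        capture_output=True, text=True, check=True,
    )
    return name in namespaces_from_output(result.stdout)


# The namespace listing shown above, used here instead of a live cluster.
sample_output = """\
NAME              STATUS   AGE
default           Active   2d
kube-system       Active   2d
kube-public       Active   2d
kube-node-lease   Active   2d
"""
found = namespaces_from_output(sample_output)
print("tink-system" in found)  # False for the output above
```

Against a running sandbox, `has_namespace("tink-system")` performs the same check live.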

As a suggestion, it may be useful to advise people to set up kubectl (with a version supported by the deployed Kubernetes).

Regarding @gernotstarke's report: the output of step 4 of https://github.com/tinkerbell/sandbox/blob/main/docs/quickstarts/COMPOSE.md is different on my side:

Creating network "compose_default" with the default driver
Creating compose_k3s_1                          ... done
Creating compose_fetch-osie_1                   ... done
Creating compose_fetch-and-convert-ubuntu-img_1 ... done
Creating compose_manifest-update_1              ... done
Creating compose_web-assets-server_1            ... done
Creating compose_rufio-crds-apply_1             ... done
Creating compose_tink-crds-apply_1              ... done
Creating compose_rufio_1                        ... done
Creating compose_tink-server_1                  ... done
Creating compose_tink-controller_1              ... done
Creating compose_hegel_1                        ... done
Creating compose_manifest-apply_1               ... done
Creating compose_boots_1                        ... done

Feel free to ask for more details if needed!

Thanks for your work anyway!