OpenNebula / one-apps

Toolchain to build OpenNebula appliances
Apache License 2.0

OneKE - Storage nodes misconfigured if using custom or cloned Image #131

Open alpeon opened 6 days ago

alpeon commented 6 days ago

Bug Info

Description:

If the VM Template of a storage node is altered to use a custom Image (or a clone of the original Image) as the second mounted drive, the VM never finishes its configuration and does not join the k8s cluster.

 2/3 Configuration step is in progress...

 * * * * * * * *
 * PLEASE WAIT *
 * * * * * * * *

Some of the symptoms:

Affected OneKE versions are both 1.29 and 1.27

Important note

Please note that the behaviour differs depending on whether the second (private) network is isolated or not. The result is the same in both cases: the storage node is misconfigured and does not join the cluster!

If the Private Network is isolated:

If the Private Network is routable (easiest way: attach both networks to the same VNet):

Steps to reproduce:

  1. Install miniONE.
  2. Import the OneKE 1.29 service.
  3. Clone the default Image that is used as the second disk for storage role nodes (typically: Service OneKE 1.29-storage-2-*-1, 10G in size).
  4. Change the Service OneKE 1.29-storage-2 VM Template to use the cloned Image as the second disk instead of the default one.
  5. Instantiate the service using your preferred method and make sure that the k8s environment is running as expected (enable traefik, longhorn, dns, route, NAT).
  6. Scale the storage role by changing its cardinality to 1.
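
For anyone reproducing this from the command line, steps 3 and 6 can be sketched as below. This is only a rough helper that builds and prints the `oneimage clone` and `oneflow scale` invocations without running them. The numeric IDs, the clone name and the role name `storage` are placeholders assumed for illustration and have to be replaced with the values from the actual deployment. Step 4 (editing the VM Template) is left out because it depends on the template contents.

```python
import shlex
from typing import List


def clone_image_cmd(image_id: int, clone_name: str) -> List[str]:
    """Build the CLI call for step 3: clone the storage Image."""
    return ["oneimage", "clone", str(image_id), clone_name]


def scale_role_cmd(service_id: int, role: str, cardinality: int) -> List[str]:
    """Build the CLI call for step 6: change the cardinality of a role."""
    return ["oneflow", "scale", str(service_id), role, str(cardinality)]


def render(cmd: List[str]) -> str:
    """Return a shell-safe, copy-pasteable command line."""
    return shlex.join(cmd)


if __name__ == "__main__":
    # All values below are hypothetical placeholders, not taken from the report.
    print(render(clone_image_cmd(42, "oneke-storage-clone")))
    print(render(scale_role_cmd(7, "storage", 1)))
```

The commands are printed rather than executed so they can be reviewed first and then pasted on the frontend as a user with access to the service.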

Result:

The storage VM can't finish its configuration and is therefore never added to the cluster. The service remains stuck in the Scaling state.

Workaround:

You can resize the disk after the VM is up and set the desired value.

rsmontero commented 3 days ago

This seems to be related to incorrect handling of some variables in OneFlow. @alpeon, can you attach the associated OneFlow document?
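
One way to collect the requested document is `oneflow show <service_id> --json`. Before attaching it, a short script like the one below can summarize the roles so the storage role's cardinality and VM Template are easy to spot. It is a sketch that assumes the JSON nests the roles under `DOCUMENT.TEMPLATE.BODY.roles`; if your output is laid out differently, adjust the keys. The sample document in the code is made up for illustration.

```python
import json
from typing import Any, Dict, List


def summarize_roles(document_json: str) -> List[Dict[str, Any]]:
    """Extract name, cardinality and VM Template ID for each role."""
    doc = json.loads(document_json)
    # Assumed layout: DOCUMENT -> TEMPLATE -> BODY -> roles (list of dicts).
    body = doc.get("DOCUMENT", {}).get("TEMPLATE", {}).get("BODY", {})
    summary = []
    for role in body.get("roles", []):
        summary.append({
            "name": role.get("name"),
            "cardinality": role.get("cardinality"),
            "vm_template": role.get("vm_template"),
        })
    return summary


# Made-up sample, NOT the document from this report.
SAMPLE_DOCUMENT = json.dumps({
    "DOCUMENT": {
        "ID": "0",
        "TEMPLATE": {"BODY": {"name": "example-service", "roles": [
            {"name": "worker", "cardinality": 1, "vm_template": 10},
            {"name": "storage", "cardinality": 1, "vm_template": 11},
        ]}},
    }
})

if __name__ == "__main__":
    for role in summarize_roles(SAMPLE_DOCUMENT):
        print(role)
```

In practice the string passed to `summarize_roles` would be the saved output of the `oneflow show` command rather than the sample.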