gardener / gardener-extension-provider-gcp

Gardener extension controller for the GCP cloud provider (https://cloud.google.com).
https://gardener.cloud
Apache License 2.0

Using data disk for workers does not work for some OSes #323

Closed timuthy closed 3 years ago

timuthy commented 3 years ago

How to categorize this issue?

/area os /kind bug /priority 3 /platform gcp

What happened: Worker nodes with data disks cannot join the cluster on GCP with Gardenlinux.

What you expected to happen: The worker to join the cluster successfully.

How to reproduce it (as minimally and precisely as possible): Create a GCP shoot with Gardenlinux 318.8 and the following disk configuration (shoot.yaml):

        volume:
          type: pd-standard
          size: 20Gi
        dataVolumes:
          - name: kubelet-dir
            type: pd-standard
            size: 50Gi
        kubeletDataVolumeName: kubelet-dir

Anything else we need to know?: The issue is caused by the fact that the GCP extension creates a MachineClass which configures the data disk from the same image as the boot disk:

  disks:
  - autoDelete: true
    boot: true
    image: projects/name/global/images/gardenlinux-gcp-cloud-gardener--prod-318-8
    labels:
      name: seed-gcp66
    sizeGb: 20
    type: pd-standard
  - autoDelete: true
    boot: false
    image: projects/name/global/images/gardenlinux-gcp-cloud-gardener--prod-318-8
    labels:
      name: seed-gcp66
    sizeGb: 50
    type: pd-standard
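
A minimal sketch of how this misconfiguration could be detected, written in Python for illustration only. It is not code from the extension; the function name `data_disks_sharing_boot_image` and the dict layout (which mirrors the `disks` entries above, with labels left out) are assumptions.

```python
# Hypothetical helper: flag non-boot disks that are created from the same
# image as the boot disk. The dict keys mirror the MachineClass fields above.
IMAGE = "projects/name/global/images/gardenlinux-gcp-cloud-gardener--prod-318-8"


def data_disks_sharing_boot_image(disks):
    """Return the non-boot disks whose image equals a boot disk's image."""
    boot_images = {d["image"] for d in disks if d.get("boot") and d.get("image")}
    return [d for d in disks if not d.get("boot") and d.get("image") in boot_images]


disks = [
    {"autoDelete": True, "boot": True, "image": IMAGE, "sizeGb": 20, "type": "pd-standard"},
    {"autoDelete": True, "boot": False, "image": IMAGE, "sizeGb": 50, "type": "pd-standard"},
]

flagged = data_disks_sharing_boot_image(disks)
print([d["sizeGb"] for d in flagged])
```

For the MachineClass shown above, the check flags the 50 GB data disk. A data disk entry without an `image` field would not be flagged.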

As a result, the system runs into two independent problems:

  1. The data disk cannot be formatted by format-data-device() because the disk already contains partitions and filesystems. We don't set mkfs.ext4 -F, so mke2fs stops at an interactive prompt:
Sep 09 06:24:52 shoot--d066080--seed-gcp66-cpu-worker-z1-78f99-88fkz download-cloud-config.sh[1506]: mke2fs 1.45.7 (28-Jan-2021)
Sep 09 06:24:52 shoot--d066080--seed-gcp66-cpu-worker-z1-78f99-88fkz download-cloud-config.sh[1506]: Found a gpt partition table in /dev/sdb
Sep 09 06:24:52 shoot--d066080--seed-gcp66-cpu-worker-z1-78f99-88fkz download-cloud-config.sh[1506]: Proceed anyway? (y,N)
  2. Gardenlinux works with filesystem LABELs only, and creating another disk from the very same image causes the LABELs to be duplicated:
$ lsblk -o name,mountpoint,label,size
NAME   MOUNTPOINT LABEL  SIZE
sda                       20G
├─sda1                     1M
├─sda2            EFI     16M
├─sda3            USR      1G
└─sda4            ROOT  1006M
sdb                       50G
├─sdb1                     1M
├─sdb2 /boot/efi  EFI     16M
├─sdb3 /usr       USR      1G
└─sdb4 /          ROOT    49G

As a consequence, the data device is sporadically used as a boot device.
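
The duplicated LABELs can also be checked programmatically. The following is a rough sketch, assuming the output of `lsblk -P -o NAME,LABEL` (key="value" pairs) rather than the tree output shown above. The function name `duplicate_labels` is made up for this example, and the sample input is the lsblk listing from this issue rewritten by hand in pairs format.

```python
import shlex
from collections import defaultdict


def duplicate_labels(lsblk_pairs_output):
    """Map each filesystem LABEL that occurs more than once to its device names."""
    by_label = defaultdict(list)
    for line in lsblk_pairs_output.splitlines():
        # Each line looks like: NAME="sda2" LABEL="EFI"
        fields = dict(token.split("=", 1) for token in shlex.split(line))
        if fields.get("LABEL"):
            by_label[fields["LABEL"]].append(fields["NAME"])
    return {label: names for label, names in by_label.items() if len(names) > 1}


# Hand-written sample based on the listing above (assumed pairs format).
sample = """\
NAME="sda" LABEL=""
NAME="sda1" LABEL=""
NAME="sda2" LABEL="EFI"
NAME="sda3" LABEL="USR"
NAME="sda4" LABEL="ROOT"
NAME="sdb" LABEL=""
NAME="sdb1" LABEL=""
NAME="sdb2" LABEL="EFI"
NAME="sdb3" LABEL="USR"
NAME="sdb4" LABEL="ROOT"
"""

print(duplicate_labels(sample))
```

On this sample, every label (EFI, USR, ROOT) is reported with one partition on sda and one on sdb, which is why mounting by LABEL cannot tell the boot disk and the data disk apart.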


Environment:

timuthy commented 3 years ago

/cc @guydaichs