coreos / ignition

First boot installer and configuration tool
https://coreos.github.io/ignition/
Apache License 2.0

partition and raid udev race #1302

Open skinowski opened 2 years ago

skinowski commented 2 years ago

Bug

The udev race fix in https://github.com/coreos/ignition/pull/446 does not cover the case where RAID devices depend on newly created partitions. Perhaps partx needs to be run after partition creation.

Operating System Version

oracle linux 8

Ignition Version

2.9.0

Environment

bare metal hardware / oracle cloud infrastructure

Expected Behavior

Successful completion of the raid1 setup using the new partitions.

Actual Behavior

The RAID setup phase times out while waiting for the /dev/ links of the new partitions.

Reproduction Steps

Ignition storage config that creates 4 new partitions on each of 2 disks, plus raid1 mirrors built from these new partitions:

  "storage": {
    "disks": [
      {
        "device": "/dev/sda",
        "wipeTable": false,
        "partitions": [
          {
            "label": "HOME_a",
            "number": 0,
            "sizeMiB": 1024,
            "typeGuid": "..."
          },
          {
            "label": "LOG_a",
            "number": 0,
            "sizeMiB": 1024,
            "typeGuid": "..."
          },
          {
            "label": "FOO_a",
            "number": 0,
            "sizeMiB": 1024,
            "typeGuid": "..."
          },
          {
            "label": "VAR_a",
            "number": 0,
            "sizeMiB": 0,
            "typeGuid": "..."
          }
        ]
      },
      {
        "device": "/dev/sdb",
        "wipeTable": false,
        "partitions": [
          {
            "label": "HOME_b",
            "number": 0,
            "sizeMiB": 1024,
            "typeGuid": "..."
          },
          {
            "label": "LOG_b",
            "number": 0,
            "sizeMiB": 1024,
            "typeGuid": "..."
          },
          {
            "label": "FOO_b",
            "number": 0,
            "sizeMiB": 1024,
            "typeGuid": "..."
          },
          {
            "label": "VAR_b",
            "number": 0,
            "sizeMiB": 0,
            "typeGuid": "..."
          }
        ]
      }
    ],
    "raid": [
      {
        "name": "/dev/md/homefs",
        "level": "raid1",
        "devices": [
          "/dev/disk/by-partlabel/HOME_a",
          "/dev/disk/by-partlabel/HOME_b"
        ],
        "spares": 0
      },
      {
        "name": "/dev/md/varfs",
        "level": "raid1",
        "devices": [
          "/dev/disk/by-partlabel/VAR_a",
          "/dev/disk/by-partlabel/VAR_b"
        ],
        "spares": 0
      },
      {
        "name": "/dev/md/logfs",
        "level": "raid1",
        "devices": [
          "/dev/disk/by-partlabel/LOG_a",
          "/dev/disk/by-partlabel/LOG_b"
        ],
        "spares": 0
      },
      {
        "name": "/dev/md/foofs",
        "level": "raid1",
        "devices": [
          "/dev/disk/by-partlabel/FOO_a",
          "/dev/disk/by-partlabel/FOO_b"
        ],
        "spares": 0
      }
    ],
    "filesystems": [
      {
        "device": "/dev/md/varfs",
        "path": "/var",
        "format": "ext4",
        "wipeFilesystem": true,
        "label": "var"
      },
      {
        "device": "/dev/md/homefs",
        "path": "/var/home",
        "format": "ext4",
        "wipeFilesystem": true,
        "label": "home"
      },
      {
        "device": "/dev/md/logfs",
        "path": "/var/log",
        "format": "ext4",
        "wipeFilesystem": true,
        "label": "log"
      },
      {
        "device": "/dev/md/foofs",
        "path": "/var/log/foo",
        "format": "ext4",
        "wipeFilesystem": true,
        "label": "foo"
      }
    ],
   ....

Other Information

Ignition fetches the config and creates the partitions (using sgdisk) without issue, but later, when creating the RAID arrays, it fails because the new partitions do not appear in the kernel partition table:

Jan 11 17:14:07 ignition[]: Ignition 2.9.0
Jan 11 17:14:07 ignition[]: Stage: disks
Jan 11 17:14:07 ignition[]: no config dir at "/usr/lib/ignition/base.d"
Jan 11 17:14:07 ignition[]: no config dir at "/usr/lib/ignition/base.platform.d/metal"
Jan 11 17:14:07 ignition[]: disks: createPartitions: op(1): [started]  waiting for devices [/dev/sda /dev/sdb]
Jan 11 17:14:07 ignition[]: disks: createPartitions: op(1): [finished] waiting for devices [/dev/sda /dev/sdb]
Jan 11 17:14:07 ignition[]: disks: createPartitions: created device alias for "/dev/sda": "/run/ignition/dev_aliases/dev/sda" -> "/dev/sda"
Jan 11 17:14:07 ignition[]: disks: createPartitions: created device alias for "/dev/sdb": "/run/ignition/dev_aliases/dev/sdb" -> "/dev/sdb"
Jan 11 17:14:07 ignition[]: disks: createPartitions: op(2): [started]  partitioning "/run/ignition/dev_aliases/dev/sda"
Jan 11 17:14:07 ignition[]: disks: createPartitions: op(2): op(3): [started]  reading partition table of "/run/ignition/dev_aliases/dev/sda"
Jan 11 17:14:07 ignition[]: disks: createPartitions: op(2): op(3): [finished] reading partition table of "/run/ignition/dev_aliases/dev/sda"
Jan 11 17:14:07 ignition[]: disks: createPartitions: op(2): running sgdisk with options: [--pretend --new=0:0:+2097152 --typecode=... --new=0:0:+209>
Jan 11 17:14:07 ignition[]: disks: createPartitions: op(2): running sgdisk with options: [--new=0:0:+2097152 --change-name=0:HOME_a --typecode=... ->
Jan 11 17:14:07 ignition[]: disks: createPartitions: op(2): op(4): [started]  deleting 0 partitions and creating 4 partitions on "/run/ignition/dev_aliases/dev/sda"
Jan 11 17:14:07 ignition[]: disks: createPartitions: op(2): op(4): executing: "sgdisk" "--new=0:0:+2097152" "--change-name=0:HOME_a" "--typecode=...>
Jan 11 17:14:08 ignition[]: disks: createPartitions: op(2): op(4): [finished] deleting 0 partitions and creating 4 partitions on "/run/ignition/dev_aliases/dev/sda"
Jan 11 17:14:08 ignition[]: disks: createPartitions: op(2): [finished] partitioning "/run/ignition/dev_aliases/dev/sda"
Jan 11 17:14:08 ignition[]: disks: createPartitions: op(5): [started]  partitioning "/run/ignition/dev_aliases/dev/sdb"
Jan 11 17:14:08 ignition[]: disks: createPartitions: op(5): op(6): [started]  reading partition table of "/run/ignition/dev_aliases/dev/sdb"
Jan 11 17:14:08 ignition[]: disks: createPartitions: op(5): op(6): [finished] reading partition table of "/run/ignition/dev_aliases/dev/sdb"
Jan 11 17:14:08 ignition[]: disks: createPartitions: op(5): running sgdisk with options: [--pretend --new=0:0:+2097152 --typecode=... --new=0:0:+209>
Jan 11 17:14:08 ignition[]: disks: createPartitions: op(5): running sgdisk with options: [--new=0:0:+2097152 --change-name=0:HOME_b --typecode=... ->
Jan 11 17:14:08 ignition[]: disks: createPartitions: op(5): op(7): [started]  deleting 0 partitions and creating 4 partitions on "/run/ignition/dev_aliases/dev/sdb"
Jan 11 17:14:08 ignition[]: disks: createPartitions: op(5): op(7): executing: "sgdisk" "--new=0:0:+2097152" "--change-name=0:HOME_b" "--typecode=...>
Jan 11 17:14:09 ignition[]: disks: createPartitions: op(5): op(7): [finished] deleting 0 partitions and creating 4 partitions on "/run/ignition/dev_aliases/dev/sdb"
Jan 11 17:14:09 ignition[]: disks: createPartitions: op(5): [finished] partitioning "/run/ignition/dev_aliases/dev/sdb"
Jan 11 17:14:09 ignition[]: disks: createRaids: op(8): [started]  waiting for devices [/dev/disk/by-partlabel/HOME_a /dev/disk/by-partlabel/HOME_b /dev/disk/by-partlabel/VAR_a /dev/di>
Jan 11 17:15:40 ignition[]: disks: createRaids: op(8): [failed]   waiting for devices [/dev/disk/by-partlabel/HOME_a /dev/disk/by-partlabel/HOME_b /dev/disk/by-partlabel/VAR_a /dev/di>

sgdisk output at the time of failure:

# sgdisk -p /dev/sda
...

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048          262143   127.0 MiB   EF00  EFI-SYSTEM
   2          262144         1048575   384.0 MiB   EA00  BOOT
   3         1048576        17825791   8.0 GiB     8304  ROOT
   4        17825792        19922943   1024.0 MiB  FD00  HOME_a
   5        19922944        22020095   1024.0 MiB  FD00  LOG_a
   6        22020096        24117247   1024.0 MiB  FD00  FOO_a
   7        24117248       468862094   212.1 GiB   FD00  VAR_a

# sgdisk -p /dev/sdb
...

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048          262143   127.0 MiB   EF00  EFI-SYSTEM
   2          262144         1048575   384.0 MiB   EA00  BOOT
   3         1048576        17825791   8.0 GiB     8304  ROOT
   4        17825792        19922943   1024.0 MiB  FD00  HOME_b
   5        19922944        22020095   1024.0 MiB  FD00  LOG_b
   6        22020096        24117247   1024.0 MiB  FD00  FOO_b
   7        24117248       468862094   212.1 GiB   FD00  VAR_b
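
As a cross-check of the numbers above, here is a small sketch. It assumes 512-byte logical sectors, which is an assumption, though one consistent with the sizes sgdisk reports. Under it, the `+2097152` in the logged sgdisk options is the 1024 MiB `sizeMiB` expressed in sectors, and the start/end columns of the new partitions span the same count. The helper names are illustrative only.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors
MIB = 1024 * 1024


def mib_to_sectors(size_mib, sector_bytes=SECTOR_BYTES):
    """Convert a size in MiB to a sector count."""
    return size_mib * MIB // sector_bytes


def span_sectors(start, end):
    """sgdisk prints inclusive start and end sectors."""
    return end - start + 1


def sectors_to_gib(sectors, sector_bytes=SECTOR_BYTES):
    """Convert a sector count to GiB."""
    return sectors * sector_bytes / 1024**3


print(mib_to_sectors(1024))              # 2097152
print(span_sectors(17825792, 19922943))  # 2097152 (HOME_a / HOME_b)
print(round(sectors_to_gib(span_sectors(24117248, 468862094)), 1))  # 212.1 (VAR_a / VAR_b)
```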

Device nodes at the time of failure:

# ls /dev/sd*
/dev/sda  /dev/sda1  /dev/sda2  /dev/sda3  /dev/sdb  /dev/sdb1  /dev/sdb2  /dev/sdb3

/proc/partitions

# cat /proc/partitions
major minor  #blocks  name

   8        0  234431064 sda
   8        1     130048 sda1
   8        2     393216 sda2
   8        3    8388608 sda3
   8       16  234431064 sdb
   8       17     130048 sdb1
   8       18     393216 sdb2
   8       19    8388608 sdb3

partx is able to update the partition table:

# partx -u /dev/sda
# cat /proc/partitions
major minor  #blocks  name

   8        0  234431064 sda
   8        1     130048 sda1
   8        2     393216 sda2
   8        3    8388608 sda3
   8        4    1048576 sda4
   8        5    1048576 sda5
   8        6    1048576 sda6
   8        7  222372423 sda7

bgilbert commented 2 years ago

Thanks for all the context and logs. Since the problem doesn't correct itself until a manual partx -u, the problem likely isn't a race in Ignition (where we're simply not waiting long enough), but instead, something in the sgdisk -> kernel -> udev chain is not properly handling the reread.
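
One way to see which layer is stale is to compare the partitions present on disk with those the kernel has registered. Below is a minimal sketch, assuming the `sgdisk -p` and `/proc/partitions` text formats shown earlier in this issue. The function names are hypothetical, and the sample text is abridged from the output above.

```python
import re


def on_disk_partitions(sgdisk_print_text):
    """Partition numbers listed in the table printed by `sgdisk -p`."""
    numbers = set()
    in_table = False
    for line in sgdisk_print_text.splitlines():
        fields = line.split()
        if fields[:1] == ["Number"]:  # header row of the partition table
            in_table = True
            continue
        if in_table and fields and fields[0].isdigit():
            numbers.add(int(fields[0]))
    return numbers


def kernel_partitions(proc_partitions_text, disk):
    """Partition numbers the kernel currently knows for `disk` (e.g. "sda")."""
    pattern = re.compile(re.escape(disk) + r"p?(\d+)")
    numbers = set()
    for line in proc_partitions_text.splitlines():
        fields = line.split()
        if len(fields) != 4 or not fields[0].isdigit():
            continue  # skip the header and blank lines
        match = pattern.fullmatch(fields[3])
        if match:
            numbers.add(int(match.group(1)))
    return numbers


def missing_in_kernel(sgdisk_print_text, proc_partitions_text, disk):
    """Partitions that exist on disk but have no kernel entry."""
    on_disk = on_disk_partitions(sgdisk_print_text)
    return sorted(on_disk - kernel_partitions(proc_partitions_text, disk))


SGDISK_SDA = """\
Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048          262143   127.0 MiB   EF00  EFI-SYSTEM
   2          262144         1048575   384.0 MiB   EA00  BOOT
   3         1048576        17825791   8.0 GiB     8304  ROOT
   4        17825792        19922943   1024.0 MiB  FD00  HOME_a
   5        19922944        22020095   1024.0 MiB  FD00  LOG_a
   6        22020096        24117247   1024.0 MiB  FD00  FOO_a
   7        24117248       468862094   212.1 GiB   FD00  VAR_a
"""

PROC_PARTITIONS = """\
major minor  #blocks  name

   8        0  234431064 sda
   8        1     130048 sda1
   8        2     393216 sda2
   8        3    8388608 sda3
   8       16  234431064 sdb
   8       17     130048 sdb1
"""

print(missing_in_kernel(SGDISK_SDA, PROC_PARTITIONS, "sda"))  # [4, 5, 6, 7]
```

A non-empty result after sgdisk has finished would point at the kernel reread step rather than at udev, since udev can only create links for partitions the kernel has announced.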

Things you could try:

bgilbert commented 2 years ago

<jakinney> bgilbert: quick update on the issue you helped me look at the
other day.  This looks like it's related to boot/root being hosted on md raids
prior to Ignition beginning.
<jakinney> In this situation, the kernel won't pick up newly created
partitions without help from e.g. partx.  Even though sgdisk calls ioctl()
with BLKRRPART.
<jakinney> Md must be interfering in some way.
<bgilbert> jakinney: yeah, that makes sense.  the Ignition disks stage
assumes it has exclusive control of any disk being modified.
<bgilbert> jakinney: is the RAID left over from a previous provisioning run,
or are you shipping images that have RAIDs already created?
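
For context on the chat above: BLKRRPART asks the kernel to drop and rescan every partition on a disk, and to my understanding the kernel refuses with EBUSY while any partition on that disk is in use (for example, held open by md). `partx -u`, by contrast, is understood to update entries one partition at a time, which would explain why it succeeds here. The following is a minimal sketch of issuing the ioctl directly to observe the error; the function name is hypothetical and the constant is the Linux value from `<linux/fs.h>`.

```python
import errno
import fcntl
import os

BLKRRPART = 0x125F  # _IO(0x12, 95) in <linux/fs.h>; Linux-specific


def reread_partition_table(device):
    """Ask the kernel to re-read the partition table of `device`.

    Returns None on success, or the errno name (e.g. "EBUSY") on failure.
    """
    try:
        fd = os.open(device, os.O_RDONLY)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    try:
        fcntl.ioctl(fd, BLKRRPART)
        return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
```

Calling `reread_partition_table("/dev/sda")` as root on an affected machine would be expected to return `"EBUSY"` if the md hypothesis holds, though that has not been verified in this thread.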

pothos commented 12 months ago

Even without a RAID setup, something seems to be missing or broken in sgdisk. When creating a new partition on the boot device, the partition isn't recognized until partprobe /dev/vda is run. Maybe we should add this command (or partx -u) to the Commit() function in internal/sgdisk/sgdisk.go.
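
The suggested ordering can be sketched as follows. This is not Ignition's Commit() implementation (Ignition is written in Go); it is a language-neutral illustration in Python of running sgdisk and then explicitly asking the kernel to pick up the result. The function names and the `reread_tool` parameter are hypothetical.

```python
import subprocess


def commit_commands(device, sgdisk_args, reread_tool="partx"):
    """Build the command sequence: write the table, then refresh the kernel's view."""
    if reread_tool == "partx":
        reread = ["partx", "-u", device]
    elif reread_tool == "partprobe":
        reread = ["partprobe", device]
    else:
        raise ValueError(f"unsupported reread tool: {reread_tool!r}")
    return [["sgdisk", *sgdisk_args, device], reread]


def commit(device, sgdisk_args, reread_tool="partx"):
    """Run the sequence, stopping at the first command that fails."""
    for command in commit_commands(device, sgdisk_args, reread_tool):
        subprocess.run(command, check=True)
```

For example, `commit_commands("/dev/vda", ["--new=0:0:+2097152", "--change-name=0:HOME_a"])` yields the sgdisk invocation followed by `partx -u /dev/vda`. Whether a failed reread should be fatal or only logged is a design choice left open here.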