siderolabs / omni-feedback

Omni feature requests, bug reports
https://www.siderolabs.com/platform/saas-for-kubernetes/
MIT License

[feature] Display persistent block device names for machines #54

Open sjdrc opened 1 year ago

sjdrc commented 1 year ago

Problem Description

Omni always displays disks with identifiers like /dev/sda and /dev/nvme0n1. I've noticed that this also makes it into the generated node configuration.

Solution

Switch to displaying unique disk identifiers instead (i.e. /dev/disk/by-id/*)? This will reduce confusion and the possibility of installing onto the incorrect disk.

Alternative Solutions

No response

Notes

No response

mj-sakellaropoulos commented 1 year ago

We also just ran into this and lost quite some time, since we were installing from a USB drive. When the USB drive was removed, sdb became sda and the node could no longer boot because of a config validation error at startup.

As a workaround, we wanted to try installing with /dev/disk/by-id paths, but the talosctl disks command does not report these IDs at all:

$ ./talosctl-windows-amd64.exe disks --nodes <node> --cluster <cluster>
NODE           DEV        MODEL              SERIAL   TYPE   UUID   WWID   MODALIAS      NAME   SIZE     BUS_PATH                                                               SUBSYSTEM          SYSTEM_DISK
10.214.96.81   /dev/sda   DataTraveler 3.0   -        HDD    -      -      scsi:t-0x00   -      16 GB    /pci0000:00/0000:00:14.0/usb3/3-6/3-6:1.0/host0/target0:0:0/0:0:0:0/   /sys/class/block
10.214.96.81   /dev/sdb   WDC WD15EADS-00P   -        HDD    -      -      scsi:t-0x00   -      1.5 TB   /pci0000:00/0000:00:01.0/0000:01:00.0/host1/target1:0:0/1:0:0:0/       /sys/class/block   *

We ended up having to use this command to get a mapping between the IDs and the sd* names:

$ ./talosctl-windows-amd64.exe ls -t L -l dev/disk/by-id --nodes talos-qfn-c8s --cluster cedille-metal-02
NODE           MODE         UID   GID   SIZE(B)   LASTMOD           NAME
[...]
10.214.96.81   Lrwxrwxrwx   0     0     9         Sep 10 18:01:29   scsi-350014ee001c5ae7b -> ../../sdb
[...]
10.214.96.81   Lrwxrwxrwx   0     0     9         Sep 10 18:01:29   wwn-0x50014ee001c5ae7b -> ../../sdb
[...]
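
For anyone who needs to do this across several nodes, the listing above can be turned into a device-to-ID mapping with a few lines of script. This is only a rough sketch: it assumes the output keeps the `name -> ../../sdX` layout shown above, and the function name `map_ids_to_devices` and the `SAMPLE` text are made up for illustration, not part of talosctl or Omni.

```python
import re
from collections import defaultdict

# Matches the symlink part of a row, e.g. "wwn-0x50014ee001c5ae7b -> ../../sdb".
# Assumption: the link target is a relative path ending in the kernel device name.
LINK_RE = re.compile(r"(\S+)\s+->\s+(?:\.\./)*(\S+)\s*$")


def map_ids_to_devices(ls_output: str) -> dict[str, list[str]]:
    """Group /dev/disk/by-id names by the kernel device they point to."""
    mapping = defaultdict(list)
    for line in ls_output.splitlines():
        match = LINK_RE.search(line)
        if match is None:
            continue  # header row, "[...]" elision, or a non-symlink entry
        link_name, target = match.groups()
        mapping[f"/dev/{target}"].append(f"/dev/disk/by-id/{link_name}")
    return dict(mapping)


# Hypothetical sample, copied from the two symlink rows shown in the listing.
SAMPLE = """\
NODE           MODE         UID   GID   SIZE(B)   LASTMOD           NAME
10.214.96.81   Lrwxrwxrwx   0     0     9         Sep 10 18:01:29   scsi-350014ee001c5ae7b -> ../../sdb
10.214.96.81   Lrwxrwxrwx   0     0     9         Sep 10 18:01:29   wwn-0x50014ee001c5ae7b -> ../../sdb
"""

if __name__ == "__main__":
    for device, ids in map_ids_to_devices(SAMPLE).items():
        print(device, ids)
```

The mapping is keyed by the current `/dev/sdX` name, which is only valid for the current boot. The by-id values are the part that should stay stable.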

We are going to try using these IDs in the MachineConfig and retry the installation tomorrow.
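
Roughly, the patch we have in mind looks like the output of the sketch below. It sets `machine.install.disk` to one of the by-id paths from the listing. Whether Talos and Omni accept a by-id symlink in that field is an assumption we have not verified yet, and the helper name `install_disk_patch` is ours, for illustration only.

```python
import json


def install_disk_patch(by_id_path: str) -> str:
    """Build a config patch that sets machine.install.disk to a by-id path.

    Assumption: the install disk field accepts a /dev/disk/by-id symlink;
    this is untested here.
    """
    if not by_id_path.startswith("/dev/disk/by-id/"):
        raise ValueError(f"expected a /dev/disk/by-id/ path, got {by_id_path!r}")
    patch = {"machine": {"install": {"disk": by_id_path}}}
    # Emitted as JSON, which YAML parsers generally accept as well.
    return json.dumps(patch, indent=2)


if __name__ == "__main__":
    print(install_disk_patch("/dev/disk/by-id/wwn-0x50014ee001c5ae7b"))
```

The prefix check is only there to catch a `/dev/sdX` name being pasted in by accident, which is the mistake this issue is about.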