Open sarg3nt opened 7 months ago
I should add that these machines are being created from AuroraBoot, not sure that matters.
Can you do an lsblk on /dev/sda on the second (failed) node? Also, since you have debug: true in the config, can you attach the installation logs of both machines?
Hi @jimmykarily
I think I might have figured this out somewhat.
After building a few more clusters I discovered some more oddities. My labels SEL_disk1 and SEL_disk2 were sometimes getting applied to the opposite devices they were supposed to. It turns out that wasn't really the problem.
The real problem is that /dev/sda, /dev/sdb, etc. are no longer guaranteed to be assigned to the first, second, etc. devices in the SCSI chain. They were never actually guaranteed, but they were fairly consistent until recent kernels switched to asynchronous device scanning. Now device names can be assigned to the underlying hardware in a nondeterministic order, so /dev/sda is not always going to be the first "disk" in the system.
See these posts for more: https://access.redhat.com/solutions/3962551, which references https://www.spinics.net/lists/linux-scsi/msg166873.html. You'll have to click through the replies to get the full picture.
Maybe you already knew all of this, but it's news to me. :)
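For anyone else hitting this, here's a rough way to see the stable names udev assigns to each disk (just a sketch; the exact symlinks you get depend on your hardware):

```bash
# List the persistent device names udev created for the attached disks.
ls -l /dev/disk/by-path/ /dev/disk/by-id/

# Or, for one device, show every symlink udev knows for it.
udevadm info --query=symlink --name=/dev/sda
```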
So, I changed my mount_disk.sh code to this:
#!/bin/bash
# Copyright (c) 2024 Schweitzer Engineering Laboratories, Inc.
# SEL Confidential
set -euo pipefail
IFS=$'\n\t'
# cSpell:ignore
# This script is run from the cloud_init.yaml and not in the Dockerfile, so it must remain in the target container image.
make_directory() {
local directory="${1-}"
if [[ -d "$directory" ]]; then
log " The $directory directory already exists, skipping creation" "cyan"
else
log " Creating the $directory directory" "green"
mkdir -p "$directory"
fi
}
# Format /dev/$disk if not already formatted.
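# NOTE: the pci-0000:03:00.0 prefix in the by-path names below is specific to
# this VM's SCSI controller and would need to change for different hardware.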
format_disk() {
local disk="${1-}"
log " Status before mount" "cyan"
ls -l "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:${disk}:0"
lsblk -f "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:${disk}:0"
if [[ ! $(lsblk -f "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:${disk}:0") = *"ext4"* ]]; then
log " Formatting disk ${disk}" "green"
mkfs.ext4 -L "SEL_${disk}" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:${disk}:0"
log " Status after mount" "cyan"
ls -l "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:${disk}:0"
lsblk -f "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:${disk}:0"
else
log " Disk $disk is already formatted" "cyan"
fi
}
# Mount /dev/$disk to the given directory and create any extra subdirectories.
mount_disk() {
local disk="${1-}"
local directory="${2-}"
local owner="${3-}"
local extra_directories="${4-}"
log " Status before mount" "cyan"
ls -l "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:${disk}:0"
lsblk -f "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:${disk}:0"
log " Mounting disk $disk to $directory" "green"
mount -o rw --source "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:${disk}:0" "$directory"
log " Status after mount" "cyan"
ls -l "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:${disk}:0"
lsblk -f "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:${disk}:0"
if [[ -n "$extra_directories" ]]; then
IFS=","
for new_directory in $extra_directories; do
if [[ -d "$new_directory" ]]; then
log " The $new_directory directory already exists, skipping creation" "cyan"
else
log " Creating the $new_directory directory" "green"
mkdir -p "$new_directory"
fi
done
IFS=$'\n\t'
fi
if [[ -n "$owner" ]]; then
log " Setting ${owner} as the owner of ${directory} recursively" "green"
chown -R "${owner}:${owner}" "${directory}"
fi
}
main() {
source "/usr/bin/lib/sh/log.sh"
local option="${1-}"
local disk="${2-}"
local directory="${3-}"
local owner="${4-}"
local extra_directories="${5-}"
log "Running mount_disk.sh with option $option for disk $disk in directory $directory" "blue"
case "$option" in
"make_directory")
make_directory "$directory"
;;
"format_disk")
format_disk "$disk"
;;
"mount_disk")
mount_disk "$disk" "$directory" "$owner" "$extra_directories"
;;
esac
}
# Run main
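# The 'return' below only succeeds when the script is sourced, so main runs only on direct execution.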
if ! (return 0 2> /dev/null); then
(main "$@")
fi
As you can see, I'm now referencing the /dev/disk/by-path/pci-0000:03:00.0-scsi..... hardware path, which seems to be stable for our vSphere setup.
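For context, the script gets called from a cloud-config stage roughly like this. These are hypothetical invocations matching main()'s argument order; the script path, owner, and extra directory are just placeholders, not our exact config:

```bash
# Arguments: option, disk index (SCSI target), mount directory, owner, extra directories (comma separated)
/usr/bin/mount_disk.sh make_directory "" /var/lib/rancher/rke2
/usr/bin/mount_disk.sh format_disk 1
/usr/bin/mount_disk.sh mount_disk 1 /var/lib/rancher/rke2 root
/usr/bin/mount_disk.sh format_disk 2
/usr/bin/mount_disk.sh mount_disk 2 /var/lib/rancher/longhorn root "/var/lib/rancher/longhorn/data"
```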
So now I should be able to control disks 1 and 2 with confidence, but I don't know if that will 100% solve the issue of Kairos not using disk 0.
I think you said that if the devices have labels it should work fine, but I'm not sure how that is supposed to work, as the format process happens later in the install cycle. Or am I wrong there?
I.e. I'm formatting the disks, and thus assigning the labels, in the after-install-chroot stage; does that happen before Kairos grabs a disk to install to? If not, how do we make sure it gets the first physical disk?
For reference, the disk paths are:
/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0 # First physical disk, Kairos should use this.
/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0 # Second physical disk, I mount this to /var/lib/rancher/rke2
/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0 # Third physical disk, I mount this to /var/lib/rancher/longhorn
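To double-check which kernel name each of those stable paths currently points at, something like this works:

```bash
# Resolve each stable by-path name to whatever /dev/sdX the kernel assigned this boot.
for p in /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:{0,1,2}:0; do
  echo "$p -> $(readlink -f "$p")"
done
```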
Ohhh, I just realized that
install:
  device: "/dev/sda"
should accept
install:
  device: "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0"
Yes/no?
I ran out of time tonight so I'll work on testing this tomorrow and let you know what I find.
To answer the question as to whether AuroraBoot will allow
install:
  device: "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0"
the answer is no. It fails validation.
Kairos Version: 9-core-amd64-generic-v2.4.3
2024-02-14 17:51:45 Target OSs /etc/systemd/system/cloud_init.yaml does not pass validation. Quitting.
2024-02-14 17:51:45 jsonschema: '/install/device' does not validate with file:///schema.json#/properties/install/$ref/properties/device/pattern: does not match pattern '^(auto|/|(/[a-zA-Z0-9_-]+)+)$'
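Presumably it fails because the by-path name contains ':' and '.', which the pattern's [a-zA-Z0-9_-] segment class doesn't allow; a quick local check reproduces that:

```bash
# The schema pattern from the error message above.
pattern='^(auto|/|(/[a-zA-Z0-9_-]+)+)$'
echo "/dev/sda" | grep -qE "$pattern" && echo valid || echo invalid                                        # valid
echo "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0" | grep -qE "$pattern" && echo valid || echo invalid # invalid
```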
I set strict: false and it still runs validation and won't run. Is this fixable?
I've built two more test clusters and so far all the right physical disks ended up attached to the right directories: Kairos on disk 0, RKE2 on disk 1, and Longhorn on disk 2. I'll keep testing and let you know if it fails.
sda, sdb, and sdc are moving around, so we can't trust those device names to land on the correct disks any more, but once you know that I think it's OK.
I think it would still be nice to get
install:
  device: "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0"
working too, though.
Telling Kairos to install to /dev/sda is virtually useless now.
My recommendations for a "fix" are the following:
- Support device: "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0" in the cloud_init.yaml file, not just /dev/sdx.
- Document how and where to find the path to the physical device and how to use it.
@jimmykarily any thoughts on the above?
Choosing disks by label/id/path/etc. is not yet supported (it has been discussed before). What you are describing is a valid use case, and I think the only workaround for now would be to use no-format (docs):
# no-format: true skips any disk partitioning and formatting
# If set to true installation procedure will error out if expected
# partitions are not already present within the disk.
no-format: true
and do the partitioning completely manually using some script in a cloud config. @kairos-io/maintainers what would be the right stage to do the partitioning?
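For what it's worth, a very rough sketch of what that workaround could look like. The kairos-install.pre stage name and the partitioning commands here are my assumptions (which stage is correct is exactly the open question), and the actual partition layout Kairos expects is described in the no-format docs:

```yaml
#cloud-config
install:
  # Skip Kairos' own partitioning/formatting; it will expect the partitions to already exist.
  no-format: true

stages:
  # Assumed stage: run before the installer looks for the target partitions.
  kairos-install.pre:
    - name: "Partition the intended install disk manually"
      commands:
        - |
          # Hypothetical: partition, format, and label the disk you actually want Kairos on,
          # starting from its stable by-path name.
          DISK="$(readlink -f /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0)"
          parted --script "$DISK" mklabel gpt
          # ...create and label the partitions Kairos expects (see the no-format docs)...
```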
@jimmykarily I have an update on this and it is really, really weird. I'll try to keep it short.
boot at 80 gig, rke2 at 40 gig, longhorn at 80 gig. I'm flummoxed. Is this a requirement of some kind or a bug?
@jimmykarily I've stripped down our config as much as possible, including not mounting disks 2 and 3, with as little config in the cloud_init.yaml files as possible, and I still get the results above. If the second or third disk is bigger than the Kairos install disk, Kairos will install, reboot, then hang at startup. This is when building the boot disk manually. If I don't do that and let it auto-assign and build itself, then it gets into an infinite install loop.
Since this is a separate issue from this request I'll open a new ticket as a defect.
I created the new issue after doing some more investigation. https://github.com/kairos-io/kairos/issues/2281
Kairos version:
/kairos/rockylinux:9-core-amd64-generic-v2.4.3
CPU architecture, OS, and Version:
Describe the bug
Hello Kairos team. I'm running into an old issue again. I thought we got this solved by adding volume labels to my other disks, but it looks like not.
I have three disks in my VM: sda, sdb, and sdc.
The cloud_init.yaml is:
The mount_disk.sh file is:
The log output of mount_disk.sh on the working nodes is:
The log output of mount_disk.sh on the broken node is:
This feels like a race condition.
What is the point of setting the install device if it's going to ignore it? Any help?
To Reproduce
See above config.
Expected behavior
All nodes should use the volume specified in the cloud_init.yaml file.