Closed sarg3nt closed 6 months ago
@sarg3nt thanks for all your debugging efforts.
I've been using a cloud-config similar to the stripped one you sent all the time with no problems. The only difference is that I almost never have a second disk attached. If that's the problem it should be easy to reproduce in qemu with 2 disks.
I'd suggest you take the minimum config that reproduces the problem (the last one you sent) and remove the `reboot: true` so that you can grab the installation logs for inspection. We should be able to see if any errors were logged and which device the installation was performed onto. Running `kairos-agent manual-install --device auto config.yaml` would be even better in terms of log collection.
Also, the system boots from an ISO, right? And there is a `reboot: true` there, which will make the system reboot after installation. Do you have the boot order correctly set so that the system doesn't boot from the CD-ROM again?
@jimmykarily I'm using AuroraBoot, so I'm not booting from a CD-ROM.
I think I've somewhat figured some of this out.
The root problem ties back to https://github.com/kairos-io/kairos/issues/2243
Even though I'm using `no-format: true` and creating the Kairos disks manually, `device: auto` or `device: /dev/sda` is taking priority and breaking it.
- With `device: /dev/sda` it will work if `sda` points to the first SCSI device, which happens most, but not all, of the time.
- With `device: auto` it almost always defaults to the larger disk, which is not the bootable disk, and thus fails.
- Without a `device` statement I think it defaults to auto, so same result.
Example: This is with `device: auto`
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
loop0 7:0 0 528.5M 1 loop /run/rootfsbase
sda 8:0 0 80G 0 disk
sdb 8:16 0 120G 0 disk
├─sdb1 8:17 0 1M 0 part
├─sdb2 8:18 0 64M 0 part
├─sdb3 8:19 0 3.1G 0 part
├─sdb4 8:20 0 5.3G 0 part
└─sdb5 8:21 0 111.6G 0 part
Where `sdb` is my larger second disk.
Here's my YAML file where I manually format the partitions.
strict: true
# enable debug logging
debug: true
install:
  no-format: true
  auto: true
  poweroff: false
  reboot: false
  grub_options:
    extra_cmdline: "rd.immucore.debug"
users:
  - name: "kairos"
    passwd: "kairos"
stages:
  kairos-install.pre.before:
    - if: '[ -e "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0" ]'
      name: "Create partitions"
      commands:
        - |
          parted --script --machine -- "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0" mklabel msdos
      layout:
        device:
          path: "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0"
        expand_partition:
          size: 0 # All available space
        add_partitions:
          # all sizes below are in MB
          - fsLabel: COS_OEM
            size: 64
            pLabel: oem
          - fsLabel: COS_RECOVERY
            size: 8500
            pLabel: recovery
          - fsLabel: COS_STATE
            size: 18000
            pLabel: state
          - fsLabel: COS_PERSISTENT
            pLabel: persistent
            size: 25000
            filesystem: "ext4"
  boot:
    - systemd_firstboot:
        keymap: us
So even though I'm specifically setting `path: "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0"`, Kairos appears to be ignoring it and installing on whatever disk it wants.
Here's some troubleshooting output so you get a lay of the land.
[root@lpul-vault-k8s-server-0 kairos]# blkid
/dev/loop0: TYPE="squashfs"
/dev/sdb4: LABEL="COS_STATE" UUID="b78e5bd5-98a3-45aa-a18e-5ece1438520c" TYPE="ext4" PARTLABEL="state" PARTUUID="1b57e20b-aa44-4e92-99f1-72f8b2340b27"
/dev/sdb2: LABEL="COS_OEM" UUID="228275ab-46cf-4553-ab84-082530123da1" TYPE="ext4" PARTLABEL="oem" PARTUUID="fae50aaf-d3c1-4e3e-a015-2a2da703d5b8"
/dev/sdb5: LABEL="COS_PERSISTENT" UUID="8ada0826-86b6-44b4-a665-bb6fd22bcfd9" TYPE="ext4" PARTLABEL="persistent" PARTUUID="75c0a4cf-2cdc-4642-9507-a36a837044ed"
/dev/sdb3: LABEL="COS_RECOVERY" UUID="3f1d509d-ab4c-4f13-961e-33f19bdc5d46" TYPE="ext4" PARTLABEL="recovery" PARTUUID="2ff71b29-d20a-452f-995c-a89612f49592"
/dev/sdb1: PARTLABEL="bios" PARTUUID="ce2cf7b1-ecc1-45da-9aa1-c01203ee332d"
/dev/sda: PTUUID="739d1a81" PTTYPE="dos"
[root@lpul-vault-k8s-server-0 kairos]#
[root@lpul-vault-k8s-server-0 kairos]#
[root@lpul-vault-k8s-server-0 kairos]# blkid /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0
/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0: PTUUID="739d1a81" PTTYPE="dos"
[root@lpul-vault-k8s-server-0 kairos]# blkid /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0
/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0: PTUUID="77275a43-53c5-4fdf-a62d-47f90365a745" PTTYPE="gpt"
[root@lpul-vault-k8s-server-0 kairos]# ll
bash: ll: command not found
[root@lpul-vault-k8s-server-0 kairos]# ls -alh /dev/disk/by-path
total 0
drwxr-xr-x 2 root root 180 Feb 22 19:29 .
drwxr-xr-x 8 root root 160 Feb 22 19:29 ..
lrwxrwxrwx 1 root root 9 Feb 22 19:29 pci-0000:03:00.0-scsi-0:0:0:0 -> ../../sda
lrwxrwxrwx 1 root root 9 Feb 22 19:29 pci-0000:03:00.0-scsi-0:0:1:0 -> ../../sdb
lrwxrwxrwx 1 root root 10 Feb 22 19:29 pci-0000:03:00.0-scsi-0:0:1:0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Feb 22 19:29 pci-0000:03:00.0-scsi-0:0:1:0-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Feb 22 19:29 pci-0000:03:00.0-scsi-0:0:1:0-part3 -> ../../sdb3
lrwxrwxrwx 1 root root 10 Feb 22 19:29 pci-0000:03:00.0-scsi-0:0:1:0-part4 -> ../../sdb4
lrwxrwxrwx 1 root root 10 Feb 22 19:29 pci-0000:03:00.0-scsi-0:0:1:0-part5 -> ../../sdb5
[root@lpul-vault-k8s-server-0 kairos]#
As you can see, `pci-0000:03:00.0-scsi-0:0:0:0` is the first disk, is `sda`, is the disk I told Kairos to partition, and is ignored even though I have `no-format: true` set. It installed on `pci-0000:03:00.0-scsi-0:0:1:0`, which is disk 2.
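For anyone reproducing this, the mapping between a stable by-path link and the current `sdX` name can be checked by following the symlink, which is what the `ls -alh` listing above shows by hand. The snippet below is only a minimal sketch: `kernel_name_for` is a made-up helper name, not part of any Kairos tool, and it assumes the by-path link exists on the machine where it runs.

```python
import os


def kernel_name_for(stable_path):
    """Return the device node a /dev/disk/by-* symlink currently resolves to.

    `kernel_name_for` is a hypothetical helper name used for illustration only.
    """
    # realpath follows the symlink chain, e.g. ../../sda -> /dev/sda
    return os.path.realpath(stable_path)
```

Calling `kernel_name_for("/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0")` on each boot shows whether the first SCSI disk is still `sda` or has moved.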
I'm not sure what logs you want me to give you.
Here are the ones I know about.
journalctl -u kairos
Feb 22 19:29:03 lpul-vault-k8s-server-0.vault.ad.selinc.com systemd[1]: Starting kairos installer...
/run/immucore/immucore.log
2024-02-22T19:28:53Z INF Immucore commit=none compiled with=go1.20.2 version=v0.1.6
2024-02-22T19:28:53Z INF Stanza rd.cos.disable/rd.immucore.disable on the cmdline or booting from CDROM/Netboot/Squash recovery. Disabling immucore.
2024-02-22T19:28:53Z INF 1.
<init> (background: false) (weak: false) (run: false)
2.
<create-sentinel> (background: false) (weak: false) (run: false)
<wait-for-sysroot> (background: false) (weak: false) (run: false)
3.
<mount-oem> (background: false) (weak: false) (run: false)
4.
<rootfs-hook> (background: false) (weak: false) (run: false)
5.
<initramfs-hook> (background: false) (weak: false) (run: false)
2024-02-22T19:28:53Z INF Setting sentinel file to=live_mode
2024-02-22T19:28:59Z INF Running rootfs stage
2024-02-22T19:29:01Z INF Running initramfs stage
2024-02-22T19:29:02Z INF 1.
<init> (background: false) (weak: false) (run: false)
2.
<create-sentinel> (background: false) (weak: false) (run: true)
<wait-for-sysroot> (background: false) (weak: false) (run: true)
3.
<mount-oem> (background: false) (weak: false) (run: false)
4.
<rootfs-hook> (background: false) (weak: false) (run: true)
5.
<initramfs-hook> (background: false) (weak: false) (run: true)
/run/immucore/initramfs_stage.log
2024-02-22T19:29:01Z INF Running stage: initramfs.before
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ ! -f /oem/userdata ]: exit status 1)' stage name: Pull data from provider
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ -e /sbin/openrc ]: exit status 1)' stage name: Blacklist bpfilter on Alpine ( bug: https://github.com/kairos-io/kairos/issues/277 )
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run ! [[ -f /etc/hosts ]] || ! [[ $(grep '127.0.0.1' /etc/hosts) ]]
: exit status 1)' stage name: Make sure hosts file is present and includes a record for 127.0.0.1
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ ! -f /oem/userdata ]: exit status 1)' stage name:
2024-02-22T19:29:01Z INF Done executing stage 'initramfs.before'
2024-02-22T19:29:01Z INF Running stage: initramfs
2024-02-22T19:29:01Z INF Processing stage step 'Enable systemd-network config files for DHCP'. ( commands: 1, files: 2, ... )
2024-02-22T19:29:01Z INF Processing stage step 'Create journalctl /var/log/journal dir'. ( commands: 0, files: 0, ... )
2024-02-22T19:29:01Z INF Processing stage step 'systemd-sysext initramfs settings'. ( commands: 0, files: 0, ... )
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run grep -q "kairos.remote_recovery_mode" /proc/cmdline && \
( [ -e "/sbin/systemctl" ] || [ -e "/usr/bin/systemctl" ] || [ -e "/usr/sbin/systemctl" ] || [ -e "/usr/bin/systemctl" ] )
: exit status 1)' stage name: Starts kairos-recovery and generate a temporary pass
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ -f "/sbin/openrc" ]
: exit status 1)' stage name: Create OpenRC services
2024-02-22T19:29:01Z INF Processing stage step ''. ( commands: 1, files: 0, ... )
2024-02-22T19:29:01Z ERR Failed to connect system bus: No such file or directory
: failed to run networkctl reload: exit status 1
2024-02-22T19:29:01Z ERR 1 error occurred:
* failed to run networkctl reload: exit status 1
2024-02-22T19:29:01Z INF Command output: Created symlink /etc/systemd/system/multi-user.target.wants/kairos-agent.service → /etc/systemd/system/kairos-agent.service.
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run grep -q "kairos.remote_recovery_mode" /proc/cmdline && [ -f "/sbin/openrc" ]: exit status 1)' stage name: Starts kairos-recovery for openRC based systems
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ ! -f "/run/cos/recovery_mode" ] && [ ! -f "/run/cos/live_mode" ]: exit status 1)' stage name:
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ ! -f "/run/cos/recovery_mode" ] && [ -s /usr/local/etc/machine-id ]: exit status 1)' stage name: Restore /etc/machine-id for systemd systems
2024-02-22T19:29:01Z INF Processing stage step 'Disable NetworkManager and wicked'. ( commands: 0, files: 0, ... )
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ -f "/sbin/openrc" ]
: exit status 1)' stage name: Enable OpenRC services
2024-02-22T19:29:01Z INF Processing stage step ''. ( commands: 0, files: 2, ... )
2024-02-22T19:29:01Z ERR 2 errors occurred:
* failed to run systemctl disable NetworkManager: exit status 1
* failed to run systemctl disable wicked: exit status 1
2024-02-22T19:29:01Z INF Processing stage step 'Enable systemd-network and systemd-resolved'. ( commands: 0, files: 0, ... )
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ ! -f "/run/cos/recovery_mode" ] && [ -f "/sbin/openrc" ]: exit status 1)' stage name: Restore /etc/machine-id for openrc systems
2024-02-22T19:29:01Z INF Processing stage step 'Default systemd config'. ( commands: 1, files: 0, ... )
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run grep -q "kairos.reset" /proc/cmdline && [ ! -f "/sbin/openrc" ]: exit status 1)' stage name: Starts kairos-reset for systemd based systems
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run grep -qv "interactive-install" /proc/cmdline ] && \
[ -f /run/cos/live_mode ] && \
[ -f "/sbin/openrc" ]
: exit status 1)' stage name: Autologin on livecd for OpenRC
2024-02-22T19:29:01Z INF Command output: Created symlink /etc/systemd/system/default.target → /usr/lib/systemd/system/multi-user.target.
2024-02-22T19:29:01Z ERR 6 errors occurred:
* failed to run systemctl enable iscsid: exit status 1
* failed to run systemctl enable systemd-timesyncd: exit status 1
* failed to run systemctl enable nohang: exit status 1
* failed to run systemctl enable nohang-desktop: exit status 1
* failed to run systemctl enable fail2ban: exit status 1
* failed to run systemctl enable logrotate.timer: exit status 1
2024-02-22T19:29:01Z INF Processing stage step 'Generate host keys'. ( commands: 1, files: 0, ... )
2024-02-22T19:29:01Z INF Processing stage step 'Link /etc/resolv.conf to systemd resolv.conf'. ( commands: 2, files: 0, ... )
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run grep -q "kairos.reset" /proc/cmdline && [ -f "/sbin/openrc" ]: exit status 1)' stage name: Starts kairos-reset for openRC-based systems
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run cat /proc/cmdline | grep "selinux=1"
: exit status 1)' stage name: Relabelling
2024-02-22T19:29:01Z INF Command output:
2024-02-22T19:29:01Z INF Command output:
2024-02-22T19:29:02Z INF Command output: ssh-keygen: generating new host keys: RSA DSA ECDSA ED25519
2024-02-22T19:29:02Z INF Processing stage step 'Create systemd services'. ( commands: 0, files: 5, ... )
2024-02-22T19:29:02Z INF Processing stage step ''. ( commands: 5, files: 0, ... )
2024-02-22T19:29:02Z INF Command output: Removed "/etc/systemd/system/getty.target.wants/getty@tty1.service".
2024-02-22T19:29:02Z INF Command output: Running in chroot, ignoring command 'stop'
2024-02-22T19:29:02Z INF Command output: Created symlink /etc/systemd/system/getty@tty1.service → /dev/null.
2024-02-22T19:29:02Z INF Command output: Created symlink /etc/systemd/system/multi-user.target.wants/kairos.service → /etc/systemd/system/kairos.service.
2024-02-22T19:29:02Z INF Command output: Created symlink /etc/systemd/system/multi-user.target.wants/kairos-webui.service → /etc/systemd/system/kairos-webui.service.
2024-02-22T19:29:02Z INF Processing stage step 'Enable systemd services'. ( commands: 4, files: 0, ... )
2024-02-22T19:29:02Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run grep -q "nodepair.enable" /proc/cmdline && [ -f "/sbin/openrc" ]: exit status 1)' stage name:
2024-02-22T19:29:02Z INF Command output:
2024-02-22T19:29:02Z INF Command output:
2024-02-22T19:29:02Z INF Command output:
2024-02-22T19:29:02Z INF Command output:
2024-02-22T19:29:02Z INF Processing stage step 'Setup groups'. ( commands: 0, files: 0, ... )
2024-02-22T19:29:02Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run grep -q "interactive-install" /proc/cmdline && \
( [ -e "/sbin/systemctl" ] || [ -e "/usr/bin/systemctl" ] || [ -e "/usr/sbin/systemctl" ] || [ -e "/usr/bin/systemctl" ] )
: exit status 1)' stage name:
2024-02-22T19:29:02Z INF Processing stage step 'Setup users'. ( commands: 0, files: 0, ... )
2024-02-22T19:29:02Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run grep -q "interactive-install" /proc/cmdline && [ -f "/sbin/openrc" ]: exit status 1)' stage name:
2024-02-22T19:29:02Z INF Processing stage step 'Set user password if running in live or uki'. ( commands: 0, files: 0, ... )
2024-02-22T19:29:02Z INF Processing stage step 'Setup sudo'. ( commands: 1, files: 1, ... )
2024-02-22T19:29:02Z INF Command output: Locking password for user root.
passwd: Success
2024-02-22T19:29:02Z INF Processing stage step 'Ensure runtime permission'. ( commands: 2, files: 0, ... )
2024-02-22T19:29:02Z INF Command output:
2024-02-22T19:29:02Z INF Command output:
2024-02-22T19:29:02Z INF Processing stage step ''. ( commands: 0, files: 0, ... )
2024-02-22T19:29:02Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ -e "/usr/local/cloud-config" ]: exit status 1)' stage name: Ensure runtime permission
2024-02-22T19:29:02Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ -f "/sys/firmware/devicetree/base/model" ] && grep -i jetson "/sys/firmware/devicetree/base/model"
: exit status 1)' stage name: Create files
2024-02-22T19:29:02Z INF Processing stage step ''. ( commands: 0, files: 0, ... )
2024-02-22T19:29:02Z INF Processing stage step 'Set hostname'. ( commands: 0, files: 0, ... )
2024-02-22T19:29:02Z INF Done executing stage 'initramfs'
2024-02-22T19:29:02Z INF Running stage: initramfs.after
2024-02-22T19:29:02Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ -e /sbin/openrc ]: exit status 1)' stage name: Enable serial login for alpine
2024-02-22T19:29:02Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [[ $(kairos-agent state get kairos.flavor) =~ ^ubuntu ]]: exit status 1)' stage name: setupcon initramfs.after ubuntu
2024-02-22T19:29:02Z INF Done executing stage 'initramfs.after'
2024-02-22T19:29:02Z INF Running stage: initramfs.before
2024-02-22T19:29:02Z INF Done executing stage 'initramfs.before'
2024-02-22T19:29:02Z INF Running stage: initramfs
2024-02-22T19:29:02Z INF Done executing stage 'initramfs'
2024-02-22T19:29:02Z INF Running stage: initramfs.after
2024-02-22T19:29:02Z INF Done executing stage 'initramfs.after'
/run/immucore/rootfs_stage.log
2024-02-22T19:28:59Z INF Running stage: rootfs.before
2024-02-22T19:28:59Z INF Processing stage step 'Enable systemd-network config files for DHCP'. ( commands: 1, files: 2, ... )
2024-02-22T19:28:59Z INF Processing stage step 'Pull data from provider'. ( commands: 0, files: 0, ... )
2024-02-22T19:28:59Z ERR mkdir /etc/systemd/network/: file exists
2024-02-22T19:28:59Z ERR 1 error occurred:
* mkdir /etc/systemd/network/: file exists
2024-02-22T19:28:59Z ERR Failed to connect system bus: No such file or directory
: failed to run networkctl reload: exit status 1
2024-02-22T19:28:59Z ERR 1 error occurred:
* failed to run networkctl reload: exit status 1
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ ! -f /oem/userdata ]: exit status 1)' stage name: Sentinel file for userdata
2024-02-22T19:29:01Z INF Done executing stage 'rootfs.before'
2024-02-22T19:29:01Z INF Running stage: rootfs
2024-02-22T19:29:01Z INF Processing stage step 'Layout configuration for active/passive mode'. ( commands: 0, files: 0, ... )
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ -f "/run/cos/recovery_mode" ]: exit status 1)' stage name: Layout configuration for recovery mode
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run grep -q "kairos.boot_live_mode" /proc/cmdline: exit status 1)' stage name: Layout configuration for booting local node from livecd
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ -e "/run/cos/uki_boot_mode" ]: exit status 1)' stage name: Layout configuration for UKI boot
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ -e "/run/cos/uki_install_mode" ]: exit status 1)' stage name: Layout configuration for UKI installer
2024-02-22T19:29:01Z INF Done executing stage 'rootfs'
2024-02-22T19:29:01Z INF Running stage: rootfs.after
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ ! -f /run/cos/recovery_mode ] && [ ! -f /run/cos/live_mode ] && [ -f "/sys/firmware/devicetree/base/model" ] && grep -i "Raspberry Pi 4" "/sys/firmware/devicetree/base/model": exit status 1)' stage name: Grow persistent
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ -r /run/cos/custom-layout.env ] && [ ! -f "/run/cos/recovery_mode" ] && [ ! -f /run/cos/live_mode ]: exit status 1)' stage name: add custom bind and ephemeral mounts to /run/cos/cos-layout.env
2024-02-22T19:29:01Z WRN (conditional) Skip 'Skipping stage (if statement error: failed to run [ ! -f /run/cos/recovery_mode ] && [ ! -f /run/cos/live_mode ]: exit status 1)' stage name: Grow persistent
2024-02-22T19:29:01Z INF Done executing stage 'rootfs.after'
2024-02-22T19:29:01Z INF Running stage: rootfs.before
2024-02-22T19:29:01Z INF Done executing stage 'rootfs.before'
2024-02-22T19:29:01Z INF Running stage: rootfs
2024-02-22T19:29:01Z INF Done executing stage 'rootfs'
2024-02-22T19:29:01Z INF Running stage: rootfs.after
2024-02-22T19:29:01Z INF Done executing stage 'rootfs.after'
Please help. This isn't stable or usable with multiple disks, which is a requirement for us. Thank you.
First of all, just to get it out of the equation: in your cloud config above, the `#cloud-config` header is missing, but I assume it's just a copy-paste mistake, otherwise you wouldn't even see the disk being partitioned.
That said, I can verify that no-format doesn't work as expected. I created a VM in qemu with 2 disks:
- `/dev/vda` -> 80G
- `/dev/vdb` -> 40G
Kairos would automatically pick `/dev/vda`, either because it's the "first" disk or because it's the bigger disk. In any case, my goal is to point the installation to a manually partitioned `/dev/vdb`.
I used this config:
#cloud-config
strict: true
debug: true
install:
  no-format: true
  auto: true
  poweroff: false
  reboot: false
  grub_options:
    extra_cmdline: "rd.immucore.debug"
users:
  - name: "kairos"
    passwd: "kairos"
stages:
  kairos-install.pre.before:
    - if: '[ -e "/dev/vdb" ]'
      name: "Create partitions"
      commands:
        - |
          parted --script --machine -- "/dev/vdb" mklabel msdos
          sgdisk -g /dev/vdb
      layout:
        device:
          path: "/dev/vdb"
        expand_partition:
          size: 0 # All available space
        add_partitions:
          # all sizes below are in MB
          - pLabel: bios
            size: 1
            pType: gpt
          - fsLabel: COS_OEM
            size: 64
            pLabel: oem
          - fsLabel: COS_RECOVERY
            size: 8500
            pLabel: recovery
          - fsLabel: COS_STATE
            size: 18000
            pLabel: state
          - fsLabel: COS_PERSISTENT
            pLabel: persistent
            size: 0
            filesystem: "ext4"
  boot:
    - systemd_firstboot:
        keymap: us
which is almost identical to @sarg3nt's config but pointing to `/dev/vdb` (plus a `sgdisk -g /dev/vdb` command to fix an error about the disk being MBR and not GPT).
I compiled a kairos-agent with additional output and it seems that this line overwrites my `NoFormat` option: https://github.com/kairos-io/kairos-agent/blob/2e9c85e63acf926ab9e0a00b3dabff4927c70c4b/internal/agent/install.go#L270 :
installSpec.NoFormat = true
c.Install.NoFormat = false
I tried commenting it out to see what happens, and indeed `/dev/vda` is not formatted or partitioned, but it's still selected as the target:
i.spec.Target = /dev/vda
(printed at this point: https://github.com/kairos-io/kairos-agent/blob/2e9c85e63acf926ab9e0a00b3dabff4927c70c4b/pkg/action/install.go#L164)
and it later fails with:
INFO[2024-02-23T08:30:34Z] Installing GRUB..
DEBU[2024-02-23T08:30:34Z] Running grub with the following args: [--root-directory=/run/cos/active --boot-directory=/run/cos/state --target=i386-pc /dev/vda]
DEBU[2024-02-23T08:30:34Z] Running cmd: '/usr/sbin/grub2-install --root-directory=/run/cos/active --boot-directory=/run/cos/state --target=i386-pc /dev/vda'
ERRO[2024-02-23T08:30:38Z] Installing for i386-pc platform.
/usr/sbin/grub2-install: error: unable to identify a filesystem in hostdisk//dev/vda; safety check can't be performed.
where it obviously tries to install grub on `/dev/vda`, which is not even partitioned.
The selection of the target device doesn't take "NoFormat" into account: https://github.com/kairos-io/kairos-agent/blob/2e9c85e63acf926ab9e0a00b3dabff4927c70c4b/internal/agent/install.go#L216-L218
I think when `NoFormat` is set to true, the target device should be discovered using labels (ideally with a sanity check that all needed partitions are there).
I found the offending parts of the code here: https://github.com/kairos-io/kairos-agent/pull/235
Needs a proper fix.
@jimmykarily Thank you for looking into this and finding the problem. Any idea when this will get on the docket for a proper fix?
I can't make predictions, sorry. With the focus being on v3.0.0 and the UKI work, this only made it below the waterline this sprint. If things go well, we may be able to start on it :shrug:
Peg PR to allow creating more than one disk on a test VM: https://github.com/spectrocloud/peg/pull/23 (will be used to implement a test for this ticket)
@jimmykarily Now that Kairos v3 is out and looks fairly stabilized, do you have an estimate as to when this is going to be fixed? Thanks!
Waiting for this to be merged: https://github.com/kairos-io/kairos/pull/2291 . We also need to make sure this is properly documented.
@jimmykarily
I've been trying to get this to work with the master build `kairos/rockylinux:9-core-amd64-generic-master` most of the day and am not having any luck.
Here are all the different things I tried and how they failed.
Starting with the `cloud_init.yaml` I initially posted above I get this.
As per one of your posts I then tried adding the sgdisk command.
commands:
  - |
    parted --script --machine -- "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0" mklabel msdos
    sgdisk -g "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0"
When I do that I get the following: But then fails with
I saw somewhere else that added
add_partitions:
  # all sizes below are in MB
  - fsLabel: COS_GRUB
    size: 64
    pLabel: bios # or efi, tried both
    filesystem: "fat"
  - fsLabel: COS_OEM
    size: 64
  <snip>
But that didn't help or change anything.
What am I missing? I'm not a grub master so kind of clutching at straws here.
I was struggling to find the right combination too. I ended up doing this: https://github.com/kairos-io/kairos/pull/2291/files#diff-1ff1699e612ac7f8c508e5f9f6a784b37441b01b8cfdebd8da3b068280385247R115 for legacy BIOS mode (see how the `COS_GRUB` partition is commented out some lines below).
For EFI, what worked for me was to comment out the `sgdisk` command and uncomment the `COS_GRUB` part.
To avoid trying things blindly, I let kairos-agent install automatically on the default disk. Then I saved the partition scheme and tried to replicate it manually but pointing to the other disk. This way you'll know what partitions kairos-agent expects.
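One way to record the partition scheme after such a reference install is to dump `lsblk` as JSON and keep only the fields that matter for the layout. The sketch below assumes `lsblk` from util-linux with `--json` support. The column list and the helper names `partition_scheme` and `read_lsblk` are my own choices for illustration, not anything kairos-agent provides.

```python
import json
import subprocess

# Assumption: these columns are enough to replicate a layout by hand.
FIELDS = ("name", "size", "partlabel", "label", "fstype")


def partition_scheme(lsblk_json, disk):
    """List the partitions of `disk` (e.g. "vda") from `lsblk --json` output."""
    for device in json.loads(lsblk_json)["blockdevices"]:
        if device["name"] == disk:
            return [
                {key: part.get(key) for key in FIELDS}
                for part in device.get("children", [])
            ]
    raise KeyError(f"disk {disk!r} not found in lsblk output")


def read_lsblk():
    """Run lsblk and return its JSON output as text."""
    return subprocess.run(
        ["lsblk", "--json", "-o", "NAME,SIZE,PARTLABEL,LABEL,FSTYPE"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
```

Running `partition_scheme(read_lsblk(), "vda")` on the reference machine gives one entry per partition, which can then be compared against the `add_partitions` list in the cloud config.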
@jimmykarily
I got this working, but with an unexpected necessity that is kind of a worry.
I'm having to specify `device: /dev/sda` or it install loops.
That doesn't fill me with confidence, since `/dev/sda` can flip around from boot to boot.
I didn't try `device: auto`; I had just left it out as you showed in your example. I was assuming that was `auto`?
I got a cluster built and all the nodes worked with the install disk being the smallest, so that is progress. I'm just still worried about the `device` statement.
We want to go to production with the first cluster soon and I need to assure my team this is going to be stable. What do you think?
Here's my config.
strict: true
debug: true
install:
  no-format: true
  device: /dev/sda
  auto: true
  poweroff: false
  reboot: true
  grub_options:
    extra_cmdline: "rd.immucore.debug"
  bind_mounts:
    - /run/k3s
stages:
  kairos-install.pre.before:
    - if: '[ -e "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0" ]'
      name: "Create partitions"
      commands:
        - |
          parted --script --machine -- "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0" mklabel gpt
          # Legacy bios
          sgdisk --new=1:2048:+1M --change-name=1:'bios' --typecode=1:EF02 "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0"
      layout:
        device:
          path: "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0"
        expand_partition:
          size: 0 # All available space
        add_partitions:
          # all sizes below are in MB
          - fsLabel: COS_OEM
            size: 64
            pLabel: oem
          - fsLabel: COS_RECOVERY
            size: 8500
            pLabel: recovery
          - fsLabel: COS_STATE
            size: 18000
            pLabel: state
          - fsLabel: COS_PERSISTENT
            pLabel: persistent
            size: 0
            filesystem: "ext4"
@sarg3nt do you have the `kairos-agent` installation logs (with debug enabled) from the case when `device` is not set? Looking at the code, this should let `kairos-agent` auto-detect the target device and it should print this text: https://github.com/kairos-io/kairos-agent/blob/979c4ad32b7a9eceadde33f728bd1f7c427daae0/pkg/action/install.go#L162
@jimmykarily I don't know if I'm doing this right but I'm giving it my best shot.
Here's what I've tried and "figured out" so far.
When I have `device: /dev/sda` set, it works. Even when the node reboots and `sda` is pointing at the wrong disk, it still works.
See the output of `lsblk` below for an example.
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
loop0 7:0 0 1G 1 loop /
sda 8:0 0 150G 0 disk /run/k3s
sdb 8:16 0 80G 0 disk
├─sdb1 8:17 0 1M 0 part
├─sdb2 8:18 0 64M 0 part /oem
├─sdb3 8:19 0 2.2G 0 part
├─sdb4 8:20 0 4G 0 part /run/initramfs/cos-state
└─sdb5 8:21 0 73.6G 0 part /etc/pki/tls/certs
/var/lib/wicked
/var/lib/snapd
/var/lib/rancher
/var/lib/longhorn
/var/lib/kubelet
/var/lib/extensions
/var/lib/dbus
/var/lib/containerd
/var/lib/cni
/var/lib/ca-certificates
/etc/zfs
/etc/systemd
/etc/sysconfig
/etc/ssh
/var/snap
/etc/runlevels
/etc/rancher
/etc/modprobe.d
/var/log
/usr/libexec
/etc/kubernetes
/run/k3s
/etc/iscsi
/etc/init.d
/etc/cni
/root
/opt
/home
/usr/local
sdc 8:32 0 100G 0 disk /usr/local/.state/var-lib-rancher.bind/rke2
/var/lib/rancher/rke2
To be clear, in the above `sdb` is disk `/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0`, i.e. disk 0, `sdc` is disk 1, and `sda` is disk 2, so everything installed correctly but the `sdX` devices are just wrong, as per the original issue statement. Regardless, this works. It boots and stuff is on the correct disk.
When I leave out `device` or do `device: auto` it "install loops". If I turn off auto shutdown and ssh into the node while the installer is still up and run these troubleshooting commands:
echo "Disk 0, should be Kairos stuff"
lsblk -f "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0"
blkid "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0"
echo ""
echo "Disk 1, should be /var/lib/rancher/rke2"
lsblk -f "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0"
blkid "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0"
echo ""
echo "Disk 2, should be /run/k3s"
lsblk -f "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0"
blkid "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0"
I get this output:
Disk 0, should be Kairos stuff
NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
sda
└─sda1
/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0: PTUUID="196e314b-07a0-45ee-b82b-419582391e6e" PTTYPE="gpt"
Disk 1, should be /var/lib/rancher/rke2
NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
sdb ext4 1.0 RKE2 2ee2eadc-7cb7-4231-a2c8-e79ca4ab61a7
/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0: LABEL="RKE2" UUID="2ee2eadc-7cb7-4231-a2c8-e79ca4ab61a7" TYPE="ext4"
Disk 2, should be /run/k3s
NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
sdc
├─sdc1
├─sdc2 ext4 1.0 COS_OEM 43ca157c-cecb-4c6c-9340-ab2d1d61b765
├─sdc3 ext4 1.0 COS_RECOVERY 6f983e16-8658-4a44-a754-1cfa7883b3f0
├─sdc4 ext4 1.0 COS_STATE 78ce8c4d-92d5-4210-be27-13e20b3ec07f
└─sdc5 ext4 1.0 COS_PERSISTENT d32ad1ff-9738-4b34-ac3d-6f62110e6800
/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0: PTUUID="a95f1345-adcc-4656-ae7d-17bfb3e08f5b" PTTYPE="gpt"
It seems to have installed to Disk 2? Or am I reading / interpreting this wrong?
Again, when using `device: /dev/sda` it works and the output of the above commands is correct.
So it makes sense that it's install looping: it is installing to the wrong disk, ignoring the commands in the `cloud_init.yaml`?
I'm confused. Why would setting `device: /dev/sda` do that?
I tried getting the logs you asked for. I can get something when I don't have it shut down and I ssh in to the node:
[root@lpul-vault-k8s-server-0 kairos]# journalctl -u kairos-agent
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com systemd[1]: Started kairos agent.
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1388]: warning: skipping /oem/userdata (extension).
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1388]: 2024-04-23T18:34:30Z INF Kairos Agent version=v2.8.11
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1388]: 2024-04-23T18:34:30Z INF Kairos System version=v3.0.6
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1388]: 2024-04-23T18:34:30Z INF creating a runtime
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1388]: 2024-04-23T18:34:30Z INF detecting boot state
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1388]: 2024-04-23T18:34:30Z INF Boot Mode boot_mode=livecd_boot
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1388]: 2024-04-23T18:34:30Z INF Boot in uki mode result=false
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1388]: warning: skipping /oem/userdata (extension).
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1388]: 2024-04-23T18:34:30Z INF Kairos Agent version=v2.8.11
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1388]: 2024-04-23T18:34:30Z INF Kairos System version=v3.0.6
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1388]: 2024-04-23T18:34:30Z INF creating a runtime
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1388]: 2024-04-23T18:34:30Z INF detecting boot state
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1388]: 2024-04-23T18:34:30Z INF Boot Mode boot_mode=livecd_boot
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1388]: 2024-04-23T18:34:30Z INF Boot in uki mode result=false
Apr 23 18:34:30 lpul-vault-k8s-server-0.vault.ad.selinc.com systemd[1]: kairos-agent.service: Deactivated successfully.
Apr 23 18:34:31 lpul-vault-k8s-server-0.vault.ad.selinc.com systemd[1]: Started kairos agent.
Apr 23 18:34:31 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1458]: warning: skipping /oem/userdata (extension).
Apr 23 18:34:31 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1458]: 2024-04-23T18:34:31Z INF Kairos Agent version=v2.8.11
Apr 23 18:34:31 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1458]: 2024-04-23T18:34:31Z INF Kairos System version=v3.0.6
Apr 23 18:34:31 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1458]: 2024-04-23T18:34:31Z INF creating a runtime
Apr 23 18:34:31 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1458]: 2024-04-23T18:34:31Z INF detecting boot state
Apr 23 18:34:31 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1458]: 2024-04-23T18:34:31Z INF Boot Mode boot_mode=livecd_boot
Apr 23 18:34:31 lpul-vault-k8s-server-0.vault.ad.selinc.com kairos-agent[1458]: 2024-04-23T18:34:31Z INF Boot in uki mode result=false
Apr 23 18:34:31 lpul-vault-k8s-server-0.vault.ad.selinc.com systemd[1]: kairos-agent.service: Deactivated successfully.
But that doesn't look that useful to me.
I had tried shutting down the node and adding the disk to another running VM then mounting it but disk 0 wouldn't mount.
Once I realized it wasn't installing to disk 0 I then did the same but with disk 2 and that worked.
I mounted /dev/sdb5
which had the /var/log
directory and looked at .state/var-log.bind/journal
using journalctl -D journal
and found there were no kairos-agent
logs there . . . ? Lots of other logs.
I'll post those logs in the next comment to keep this one more readable.
A few notes to make sure you are aware of the whole setup:
cloud_init.yaml
and restart the container, then stop the cluster node, replace the disks if needed and restart the node. This is just to speed up testing and I get the same results if automation is used.
Another piece of info that may or may not be useful: this is the AuroraBoot run statement. It is part of a shell file that is run as a service on the node and I haven't touched it in a while.
docker run --rm --net host \
-v "/usr/local/auroraboot-build:/tmp/auroraboot" \
-v "/etc/systemd/system/cloud_init.yaml:/cloud_init.yaml" \
-v /var/run/docker.sock:/var/run/docker.sock \
"quay.artifactory.metro.ad.selinc.com/kairos/auroraboot:${AURORABOOT_VERSION}" \
--set "container_image=$container_image" \
--cloud-config /cloud_init.yaml \
--set "state_dir=/tmp/auroraboot" \
--set netboot.cmdline="rd.neednet=1 ip=dhcp rd.cos.disable netboot nodepair.enable console=tty0 selinux=0" \
--debug \
I'm curious about the --set netboot.cmdline
arg. Is that still necessary? I'm not seeing it recommended in the docs now and I'm not sure if it's still needed or even what it does. I tried removing it but things didn't seem to change.
Hope this helps.
See attached log file. kairos_logs.txt
When you are setting device: /dev/sda
you are essentially skipping the target detection. Given /dev/sdX
disks can change, I suspect the only reason it works for you is because it so happens that /dev/sda
is the right disk at installation time (and obviously changes after reboot).
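To check which kernel name a stable by-path link currently resolves to, something like the following could be run in the live environment. This is only a sketch: `resolve_disk` is a hypothetical helper name, and the by-path value in the usage comment is the one from the config shared in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the device node that a stable path
# (for example a /dev/disk/by-path/... symlink) currently points to.
resolve_disk() {
  local stable_path="$1"
  if [ ! -e "$stable_path" ]; then
    echo "missing: $stable_path" >&2
    return 1
  fi
  # readlink -f follows the symlink chain and prints the final target.
  readlink -f -- "$stable_path"
}

# Usage (path taken from the config in this thread):
#   resolve_disk "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0"
```

Running it before and after a reboot would show whether the same by-path link maps to a different /dev/sdX name.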
In the config you shared above, with device: /dev/sda
set and manual partitioning happening in the kairos-install.pre.before
, either the disk is partitioned twice or one of the two is skipped. I'm not sure; the installation logs would help here.
It's possible that when you don't set the device explicitly, for some reason the detection doesn't work and the target is left empty. But that would need installSpec.NoFormat
to not be set correctly too, otherwise nothing would set the Target and the installation would fail.
@sarg3nt the logs you shared are not the installation logs. I'm not sure if those are available after rebooting to the system. You can get the installation logs by:
/tmp/config.yaml
kairos-agent manual-install /tmp/config.yaml 2>&1 | tee out.log
Maybe there are other ways to get the installation logs in the auroraboot flow but I can't think of one right now. Maybe if you set reboot: false
in the config you get an opportunity to ssh to the box while still in livecd mode. The installation logs should be still around in that case.
One of the 2 options should allow you to collect installation logs and that will reveal more on what actually happens.
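Once the output has been saved to a file, the error lines can be pulled out with a small filter. This is a sketch, assuming the log lines keep the `<timestamp> LEVEL message` shape seen in the kairos-agent output in this thread; `install_errors` is a hypothetical helper name and `out.log` is the file name used in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only the ERR-level lines from a saved
# installation log. Defaults to out.log in the current directory.
install_errors() {
  local log_file="${1:-out.log}"
  # Match lines that start with a timestamp followed by the ERR level.
  grep -E '^[0-9T:Z-]+ ERR ' -- "$log_file"
}

# Usage:
#   install_errors out.log
```

Multi-line error output (lines without a timestamp) is not captured by this filter, so the full log is still worth attaching.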
Thanks for your patience in fixing this Dave, let's hope we get it sorted out soon!
The logs you attached show immucore v0.1.6
:
Apr 23 17:32:49 localhost immucore[589]: 2024-04-23T17:32:49Z INF Immucore commit=none compiled with=go1.20.2 version=v0.1.6
the image you use should be v0.1.25
:
$ docker run quay.io/kairos/rockylinux:9-core-amd64-generic-master immucore version
2024-04-24T07:53:48Z INF Immucore commit=none compiled with=go1.21.7 version=v0.1.25
Something is off...
thanks @Itxaka for spotting this
@jimmykarily
I don't know where it got version=v0.1.6
from. I got those logs in a very roundabout way and was a little sus when they were localhost
In any case, my built image was reporting version=v0.1.25
.
I rebuilt the client OS image from the latest master and our AuroraBoot image even though we were already running AuroraBoot v0.2.7
and I can now leave the device:
line out and everything works. Not sure what changed though.
Your info about booting into live-cd mode and running kairos-agent manual-install
was a great tip.
I tried it with the node net booting from the AuroraBoot image but with the cloud-config
the AuroraBoot node was serving set to auto: false
, reboot: false
and poweroff: false
and tried to use that as the starting point. I know that file is being served to the downstream node but don't know where it ends up so I followed your instructions and saved it to /tmp/config.yaml
and ran the manual-install
as you showed. It looked like some things were running twice, so I think it's using the file I gave it and the one it was served from AuroraBoot maybe?
Q: Is there a way to run kairos-agent install
and have it use the one it was served from net-boot?
The bigger surprise is that cloud_init.yaml
being served from the target node's VSphere guestinfo.userdata
is being run even when auto: false
is set. I assumed it would not, but it is.
To explain in more detail. The target node gets our custom config from two sources. The first is the basic config that is the same for all cluster nodes and is served from AuroraBoot. This config has the install:
section in it. The second is the config that is different for each node, which is injected from Terraform into VSphere via guestinfo.userdata
. The VSphere one is run but the first is not.
Q: Is this expected?
I'll include redacted copies of each below, to help clarify what is happening.
From AuroraBoot:
#cloud-config
strict: true
debug: true
install:
no-format: true
auto: false
poweroff: false
reboot: false
grub_options:
extra_cmdline: "rd.immucore.debug"
bind_mounts:
- /run/k3s
users:
- name: "kairos-auroraboot"
passwd: "<redacted>"
ssh_authorized_keys:
- <redacted>
write_files:
- encoding: b64
content: <redacted>
path: <redacted>
permissions: "0444"
runcmd:
- Some run commands here.
stages:
kairos-install.pre.before:
- if: '[ -e "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0" ]'
name: "Create partitions"
commands:
- |
parted --script --machine -- "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0" mklabel gpt
# Legacy bios
sgdisk --new=1:2048:+1M --change-name=1:'bios' --typecode=1:EF02 "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0"
layout:
device:
path: "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0"
expand_partition:
size: 0 # All available space
add_partitions:
- fsLabel: COS_OEM
size: 64
pLabel: oem
- fsLabel: COS_RECOVERY
size: 8500
pLabel: recovery
- fsLabel: COS_STATE
size: 18000
pLabel: state
- fsLabel: COS_PERSISTENT
pLabel: persistent
size: 0
filesystem: "ext4"
boot:
- systemd_firstboot:
keymap: us
- name: "Environment Variables"
environment:
HTTP_PROXY: "<redacted>"
<snip>
- name: "Setup services"
systemctl:
disable:
- dnf-makecache
- name: "Setup NTP"
systemctl:
enable:
- systemd-timesyncd
timesyncd:
NTP: "<redacted>"
FallbackNTP: ""
after-install-chroot:
- name: "Create data directories"
commands:
- make_disk.sh "make_directory" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0" "/var/lib/rancher/rke2" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
- make_disk.sh "make_directory" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0" "/run/k3s" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
- make_disk.sh "make_directory" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0" "/var/lib/rancher/longhorn" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
- name: "Format disks"
commands:
- make_disk.sh "format_disk" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0" "/var/lib/rancher/rke2" "RKE2" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
- make_disk.sh "format_disk" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0" "/run/k3s" "K3S" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
- make_disk.sh "format_disk" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0" "/var/lib/rancher/longhorn" "LONGHORN" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
after-reset-chroot:
- name: "Create data directories"
commands:
- make_disk.sh "make_directory" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0" "/var/lib/rancher/rke2" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
- make_disk.sh "make_directory" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0" "/run/k3s" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
- make_disk.sh "make_directory" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0" "/var/lib/rancher/longhorn" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
after-upgrade-chroot:
- name: "Create data directories"
commands:
- make_disk.sh "make_directory" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0" "/var/lib/rancher/rke2" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
- make_disk.sh "make_directory" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0" "/run/k3s" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
- make_disk.sh "make_directory" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0" "/var/lib/rancher/longhorn" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
initramfs:
- name: "Mount disks"
commands:
# Making the /run/k3s directory here as well as it fixes the directory going missing bug
- make_disk.sh "make_directory" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0" "/run/k3s" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
- make_disk.sh "mount_disk" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0" "/var/lib/rancher/rke2" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
- make_disk.sh "mount_disk" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0" "/run/k3s" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
- make_disk.sh "mount_disk" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0" "/var/lib/rancher/longhorn" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
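A side note on the exit-status checks in the commands above, assuming they run under bash: `$PIPESTATUS[0]` without braces expands to the first array element followed by a literal `[0]`, so the test does not compare a plain integer. A minimal sketch of the braced form, wrapped in a hypothetical `run_logged` helper (not part of make_disk.sh or Kairos):

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command, append its combined output to a
# log file through tee, and return the command's own exit status
# instead of tee's.
run_logged() {
  local log_file="$1"
  shift
  "$@" 2>&1 | tee -a "$log_file"
  # Braces are required to index the array; element 0 is the status
  # of the first command in the pipeline just above.
  return "${PIPESTATUS[0]}"
}

# Usage (arguments taken from the config in this thread):
#   run_logged /var/log/sel/make_disk.log \
#     make_disk.sh "mount_disk" "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0" "/var/lib/rancher/rke2" || exit 1
```

Whether this matters for the install problem discussed here is unclear; it only affects whether a failing make_disk.sh call stops the stage.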
The file that is injected via VSphere and is running on startup: This ends up in /oem/userdata.yaml
and /oem/userdata
#cloud-config
users:
- name: "kairos"
passwd: "<redacted>"
ssh_authorized_keys:
- ssh-rsa <redacted>
write_files:
# These files exist after startup.
- encoding: b64
content: '<redacted>'
path: /etc/rancher/rke2/config.yaml
permissions: "0644"
owner: "root"
- encoding: b64
content: '<redacted>'
path: /var/lib/rancher/rke2/server/manifests/rke2-ingress-nginx-config.yaml
permissions: "0644"
owner: "root"
stages:
initramfs:
- name: "Set hostname"
hostname: "lpul-vault-k8s-server-0.vault.ad.selinc.com"
- name: "Run commands"
commands:
- bash /usr/bin/initramfs_scripts.sh 2>&1 | tee -a /var/log/sel/initramfs_scripts.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi
boot:
- name: "Setup services"
systemctl:
enable:
- rke2-server.timer
- vmtoolsd.timer
- qualys-cloud-agent.timer
- falcon-sensor.timer
start:
- rke2-server.timer
- vmtoolsd.timer
- qualys-cloud-agent.timer
- falcon-sensor.timer
And here's the log after running kairos-agent manual-install /tmp/config.yaml 2>&1 | tee out.log
with the first file above.
[root@lpul-vault-k8s-server-0 tmp]# kairos-agent manual-install /tmp/config.yaml 2>&1 | tee out.log
2024-04-24T20:21:09Z INF Kairos Agent version=v2.9.1
2024-04-24T20:21:09Z DBG Kairos Agent version={"git_commit":"none","go_version":"go1.21.7","version":"v2.9.1"}
2024-04-24T20:21:09Z INF Kairos System version=v3.0.4-43-g595a9d5
2024-04-24T20:21:09Z INF creating a runtime
2024-04-24T20:21:09Z INF detecting boot state
2024-04-24T20:21:09Z INF Boot Mode boot_mode=livecd_boot
2024-04-24T20:21:09Z INF Boot in uki mode result=false
2024-04-24T20:21:09Z DBG Loaded config: &config.Config{
Install: &config.Install{
Auto: false,
Reboot: false,
NoFormat: true,
Device: "",
Poweroff: false,
GrubOptions: map[string]string{
"extra_cmdline": "rd.immucore.debug",
},
Bundles: nil,
Encrypt: nil,
SkipEncryptCopyPlugins: false,
Env: nil,
Source: "",
EphemeralMounts: nil,
BindMounts: []string{
"/run/k3s",
},
},
Config: collector.Config{
"config_url": "http://10.105.148.76:8090/_/file?name=other-1",
"debug": true,
"install": collector.Config{
"auto": false,
"bind_mounts": []interface {}{
"/run/k3s",
},
"grub_options": collector.Config{
"extra_cmdline": "rd.immucore.debug",
},
"no-format": true,
"poweroff": false,
"reboot": false,
},
"runcmd": []interface {}{
"ln -s /opt/qualys/ /usr/local/qualys",
"/opt/qualys/cloud-agent/bin/qualys-cloud-agent.sh ActivationId=1751b9b6-ccde-462d-aafa-cfd03d71acd3 CustomerId=ef6a4b08-1375-70aa-81bb-7bfa031eec64",
"/opt/CrowdStrike/falconctl -s -f --aph=wall.ad.selinc.com --app=8080 --apd=false --cid=FC92F4C7EADF4A30B3AE88AD6FD371B7-74",
},
"stages": collector.Config{
"after-install-chroot": []interface {}{
collector.Config{
"commands": []interface {}{
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0\" \"/var/lib/rancher/rke2\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0\" \"/run/k3s\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0\" \"/var/lib/rancher/longhorn\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
},
"name": "Create data directories",
},
collector.Config{
"commands": []interface {}{
"make_disk.sh \"format_disk\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0\" \"/var/lib/rancher/rke2\" \"RKE2\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"format_disk\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0\" \"/run/k3s\" \"K3S\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"format_disk\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0\" \"/var/lib/rancher/longhorn\" \"LONGHORN\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
},
"name": "Format disks",
},
collector.Config{
"commands": []interface {}{
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0\" \"/var/lib/rancher/rke2\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0\" \"/run/k3s\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0\" \"/var/lib/rancher/longhorn\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
},
"name": "Create data directories",
},
collector.Config{
"commands": []interface {}{
"make_disk.sh \"format_disk\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0\" \"/var/lib/rancher/rke2\" \"RKE2\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"format_disk\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0\" \"/run/k3s\" \"K3S\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"format_disk\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0\" \"/var/lib/rancher/longhorn\" \"LONGHORN\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
},
"name": "Format disks",
},
},
"after-reset-chroot": []interface {}{
collector.Config{
"commands": []interface {}{
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0\" \"/var/lib/rancher/rke2\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0\" \"/run/k3s\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0\" \"/var/lib/rancher/longhorn\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
},
"name": "Create data directories",
},
collector.Config{
"commands": []interface {}{
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0\" \"/var/lib/rancher/rke2\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0\" \"/run/k3s\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0\" \"/var/lib/rancher/longhorn\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
},
"name": "Create data directories",
},
},
"after-upgrade-chroot": []interface {}{
collector.Config{
"commands": []interface {}{
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0\" \"/var/lib/rancher/rke2\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0\" \"/run/k3s\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0\" \"/var/lib/rancher/longhorn\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
},
"name": "Create data directories",
},
collector.Config{
"commands": []interface {}{
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0\" \"/var/lib/rancher/rke2\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0\" \"/run/k3s\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0\" \"/var/lib/rancher/longhorn\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
},
"name": "Create data directories",
},
},
"boot": []interface {}{
collector.Config{
"keymap": "us",
"systemd_firstboot": nil,
},
collector.Config{
"environment": collector.Config{
"HTTPS_PROXY": "http://wall.ad.selinc.com:8080",
"HTTP_PROXY": "http://wall.ad.selinc.com:8080",
"NO_PROXY": "localhost,localaddress,svc.cluster.local,host.docker.internal,kubernetes.docker.internal,.svc.cluster.local,cluster.local,.cluster.local,default.svc,docker.sel.inc,sel.inc,.sel.inc,ad.selinc.com,.ad.selinc.com,metro.ad.selinc.com,.metro.ad.selinc.com,bitbucket.metro.ad.selinc.com,artifactory.metro.ad.selinc.com,*.ad.selinc.com,10.43.0.1,127.0.0.1,127.0.0.0,0.0.0.0,127.0.0.0/8,10.0.0.0/8,10.*.*.*,10.*,172.16.0.0/12,192.168.0.0/16,169.254.169.254",
"http_proxy": "http://wall.ad.selinc.com:8080",
"https_proxy": "http://wall.ad.selinc.com:8080",
"no_proxy": "localhost,localaddress,svc.cluster.local,host.docker.internal,kubernetes.docker.internal,.svc.cluster.local,cluster.local,.cluster.local,default.svc,docker.sel.inc,sel.inc,.sel.inc,ad.selinc.com,.ad.selinc.com,metro.ad.selinc.com,.metro.ad.selinc.com,bitbucket.metro.ad.selinc.com,artifactory.metro.ad.selinc.com,*.ad.selinc.com,10.43.0.1,127.0.0.1,127.0.0.0,0.0.0.0,127.0.0.0/8,10.0.0.0/8,10.*.*.*,10.*,172.16.0.0/12,192.168.0.0/16,169.254.169.254",
},
"name": "Environment Variables",
},
collector.Config{
"name": "Setup services",
"systemctl": collector.Config{
"disable": []interface {}{
"dnf-makecache",
},
},
},
collector.Config{
"name": "Setup NTP",
"systemctl": collector.Config{
"enable": []interface {}{
"systemd-timesyncd",
},
},
"timesyncd": collector.Config{
"FallbackNTP": "",
"NTP": "ntp.ad.selinc.com ntp2.ad.selinc.com ntp3.ad.selinc.com",
},
},
collector.Config{
"keymap": "us",
"systemd_firstboot": nil,
},
collector.Config{
"environment": collector.Config{
"HTTPS_PROXY": "http://wall.ad.selinc.com:8080",
"HTTP_PROXY": "http://wall.ad.selinc.com:8080",
"NO_PROXY": "localhost,localaddress,svc.cluster.local,host.docker.internal,kubernetes.docker.internal,.svc.cluster.local,cluster.local,.cluster.local,default.svc,docker.sel.inc,sel.inc,.sel.inc,ad.selinc.com,.ad.selinc.com,metro.ad.selinc.com,.metro.ad.selinc.com,bitbucket.metro.ad.selinc.com,artifactory.metro.ad.selinc.com,*.ad.selinc.com,10.43.0.1,127.0.0.1,127.0.0.0,0.0.0.0,127.0.0.0/8,10.0.0.0/8,10.*.*.*,10.*,172.16.0.0/12,192.168.0.0/16,169.254.169.254",
"http_proxy": "http://wall.ad.selinc.com:8080",
"https_proxy": "http://wall.ad.selinc.com:8080",
"no_proxy": "localhost,localaddress,svc.cluster.local,host.docker.internal,kubernetes.docker.internal,.svc.cluster.local,cluster.local,.cluster.local,default.svc,docker.sel.inc,sel.inc,.sel.inc,ad.selinc.com,.ad.selinc.com,metro.ad.selinc.com,.metro.ad.selinc.com,bitbucket.metro.ad.selinc.com,artifactory.metro.ad.selinc.com,*.ad.selinc.com,10.43.0.1,127.0.0.1,127.0.0.0,0.0.0.0,127.0.0.0/8,10.0.0.0/8,10.*.*.*,10.*,172.16.0.0/12,192.168.0.0/16,169.254.169.254",
},
"name": "Environment Variables",
},
collector.Config{
"name": "Setup services",
"systemctl": collector.Config{
"disable": []interface {}{
"dnf-makecache",
},
},
},
collector.Config{
"name": "Setup NTP",
"systemctl": collector.Config{
"enable": []interface {}{
"systemd-timesyncd",
},
},
"timesyncd": collector.Config{
"FallbackNTP": "",
"NTP": "ntp.ad.selinc.com ntp2.ad.selinc.com ntp3.ad.selinc.com",
},
},
},
"initramfs": []interface {}{
collector.Config{
"commands": []interface {}{
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0\" \"/run/k3s\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"mount_disk\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0\" \"/var/lib/rancher/rke2\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"mount_disk\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0\" \"/run/k3s\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"mount_disk\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0\" \"/var/lib/rancher/longhorn\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
},
"name": "Mount disks",
},
collector.Config{
"commands": []interface {}{
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0\" \"/run/k3s\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"mount_disk\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0\" \"/var/lib/rancher/rke2\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"mount_disk\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0\" \"/run/k3s\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"mount_disk\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0\" \"/var/lib/rancher/longhorn\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
},
"name": "Mount disks",
},
},
"kairos-install.pre.before": []interface {}{
collector.Config{
"commands": []interface {}{
"parted --script --machine -- \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0\" mklabel gpt\n# Legacy bios\nsgdisk --new=1:2048:+1M --change-name=1:'bios' --typecode=1:EF02 \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0\"\n",
},
"if": "[ -e \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0\" ]",
"layout": collector.Config{
"add_partitions": []interface {}{
collector.Config{
"fsLabel": "COS_OEM",
"pLabel": "oem",
"size": 64,
},
collector.Config{
"fsLabel": "COS_RECOVERY",
"pLabel": "recovery",
"size": 8500,
},
collector.Config{
"fsLabel": "COS_STATE",
"pLabel": "state",
"size": 18000,
},
collector.Config{
"filesystem": "ext4",
"fsLabel": "COS_PERSISTENT",
"pLabel": "persistent",
"size": 0,
},
},
"device": collector.Config{
"path": "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0",
},
"expand_partition": collector.Config{
"size": 0,
},
},
"name": "Create partitions",
},
collector.Config{
"commands": []interface {}{
"parted --script --machine -- \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0\" mklabel gpt\n# Legacy bios\nsgdisk --new=1:2048:+1M --change-name=1:'bios' --typecode=1:EF02 \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0\"\n",
},
"if": "[ -e \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0\" ]",
"layout": collector.Config{
"add_partitions": []interface {}{
collector.Config{
"fsLabel": "COS_OEM",
"pLabel": "oem",
"size": 64,
},
collector.Config{
"fsLabel": "COS_RECOVERY",
"pLabel": "recovery",
"size": 8500,
},
collector.Config{
"fsLabel": "COS_STATE",
"pLabel": "state",
"size": 18000,
},
collector.Config{
"filesystem": "ext4",
"fsLabel": "COS_PERSISTENT",
"pLabel": "persistent",
"size": 0,
},
},
"device": collector.Config{
"path": "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0",
},
"expand_partition": collector.Config{
"size": 0,
},
},
"name": "Create partitions",
},
},
},
"strict": true,
"users": []interface {}{
collector.Config{
"name": "kairos-auroraboot",
"passwd": "kairos",
"ssh_authorized_keys": []interface {}{
"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCw/XgWQOq5Nx46cl2phALYdoJRoINuqD+cT9arVc6XMx4gl0KO7c98Po/Y/rPcTtnrqxaSOCaOSVB2slnEovKAEnXwchH1Ndub937MtSxDyhc5eiwoEj2nYgJ0QrTfQdFBim0ysvWxpJpLGYyR32idhI67vtcq3LDjqW1lFoIcx/X1/L7qn5/b81N+tg6vwE2Li0+fxFlMbTxuFwSdBLzGI51wqDCnWBb6N2IXfHzSv8o4l52fZ0UtwC0TT1ACmh7T+bP/cZ/Dxno4iOdLX9WbqEZC3lKeXqvjzKDyrAwu2/m7e5Lhd+OHUgIjw2rLypHErSFADazcycxM0FvORVtprcaTvgBpK9bZqn8a40JrHYb9Z/0swn1HC0KhtYSBpl4/nRZkvb9iAFCA0QYdmVwRrQ8sb8TTQHYmGf+svdfvyCs+GHWG3h0blFMH66AucLMnUR5hulNGkd+6Y2dNsH9OpQspNYfH/9mV3PJFSICxPKFybC9vwV3MuKSRMdQ77dc= davesarg-sa@sargesavm",
},
},
collector.Config{
"name": "kairos-auroraboot",
"passwd": "IXOxMlpGW4qmXaJSd0e4zs1xdqU91wPRVdFtmvVM0eNszYliaGr5hXlpCtis7oFf",
"ssh_authorized_keys": []interface {}{
"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCw/XgWQOq5Nx46cl2phALYdoJRoINuqD+cT9arVc6XMx4gl0KO7c98Po/Y/rPcTtnrqxaSOCaOSVB2slnEovKAEnXwchH1Ndub937MtSxDyhc5eiwoEj2nYgJ0QrTfQdFBim0ysvWxpJpLGYyR32idhI67vtcq3LDjqW1lFoIcx/X1/L7qn5/b81N+tg6vwE2Li0+fxFlMbTxuFwSdBLzGI51wqDCnWBb6N2IXfHzSv8o4l52fZ0UtwC0TT1ACmh7T+bP/cZ/Dxno4iOdLX9WbqEZC3lKeXqvjzKDyrAwu2/m7e5Lhd+OHUgIjw2rLypHErSFADazcycxM0FvORVtprcaTvgBpK9bZqn8a40JrHYb9Z/0swn1HC0KhtYSBpl4/nRZkvb9iAFCA0QYdmVwRrQ8sb8TTQHYmGf+svdfvyCs+GHWG3h0blFMH66AucLMnUR5hulNGkd+6Y2dNsH9OpQspNYfH/9mV3PJFSICxPKFybC9vwV3MuKSRMdQ77dc= davesarg-sa@sargesavm}",
},
},
},
"write_files": []interface {}{
collector.Config{
"content": "cXVhbHlzX2h0dHBzX3Byb3h5PWh0dHA6Ly93YWxsLmFkLnNlbGluYy5jb206ODA4MAo=",
"encoding": "b64",
"path": "/etc/sysconfig/qualys-cloud-agent",
"permissions": "0444",
},
collector.Config{
"content": "cXVhbHlzX2h0dHBzX3Byb3h5PWh0dHA6Ly93YWxsLmFkLnNlbGluYy5jb206ODA4MAo=",
"encoding": "b64",
"path": "/etc/sysconfig/qualys-cloud-agent",
"permissions": "0444",
},
},
},
ConfigURL: "http://10.105.148.76:8090/_/file?name=other-1",
Options: map[string]string(nil), // p0
FailOnBundleErrors: false,
Bundles: nil,
GrubOptions: p0,
Env: nil,
Debug: true,
Strict: true,
CloudInitPaths: nil,
EjectCD: false,
Logger: types.KairosLogger{
Logger: zerolog.Logger{},
},
Fs: &vfs.osfs{}, // p1
Mounter: &mount.Mounter{},
Runner: &v1.RealRunner{ // p2
Logger: &types.KairosLogger{
Logger: zerolog.Logger{},
},
},
Syscall: &v1.RealSyscall{},
CloudInitRunner: &cloudinit.YipCloudInitRunner{},
ImageExtractor: v1.OCIImageExtractor{},
Client: &http.Client{},
Platform: &v1.Platform{
OS: "linux",
Arch: "x86_64",
GolangArch: "amd64",
},
Cosign: false,
Verify: false,
CosignPubKey: "",
Arch: "x86_64",
SquashFsCompressionConfig: []string{},
SquashFsNoCompression: true,
UkiMaxEntries: 3,
}
2024-04-24T20:21:10Z INF Setting image size to 1063Mb
2024-04-24T20:21:10Z INF Setting OEM partition size to 64Mb
2024-04-24T20:21:10Z INF Setting recovery partition size to 2326Mb
2024-04-24T20:21:10Z INF Setting state partition size to 4189Mb
2024-04-24T20:21:10Z INF Setting persistent partition size to 0Mb
2024-04-24T20:21:10Z DBG Loaded install spec: &v1.InstallSpec{
Target: "",
Firmware: "bios",
PartTable: "gpt",
Partitions: v1.ElementalPartitions{
BIOS: &v1.Partition{
Name: "bios",
FilesystemLabel: "",
Size: 1,
FS: "",
Flags: []string{
"bios_grub",
},
MountPoint: "",
Path: "",
Disk: "",
},
EFI: nil,
OEM: &v1.Partition{
Name: "oem",
FilesystemLabel: "COS_OEM",
Size: 64,
FS: "ext4",
Flags: []string{}, // p0
MountPoint: "/run/cos/oem",
Path: "",
Disk: "",
},
Recovery: &v1.Partition{
Name: "recovery",
FilesystemLabel: "COS_RECOVERY",
Size: 2326,
FS: "ext4",
Flags: p0,
MountPoint: "/run/cos/recovery",
Path: "",
Disk: "",
},
State: &v1.Partition{
Name: "state",
FilesystemLabel: "COS_STATE",
Size: 4189,
FS: "ext4",
Flags: p0,
MountPoint: "/run/cos/state",
Path: "",
Disk: "",
},
Persistent: &v1.Partition{
Name: "persistent",
FilesystemLabel: "COS_PERSISTENT",
Size: 0,
FS: "ext4",
Flags: p0,
MountPoint: "/run/cos/persistent",
Path: "",
Disk: "",
},
},
ExtraPartitions: nil,
NoFormat: true,
Force: false,
CloudInit: nil,
Iso: "",
GrubDefEntry: "",
Tty: "tty1",
Reboot: false,
PowerOff: false,
ExtraDirsRootfs: nil,
Active: v1.Image{
File: "/run/cos/state/cOS/active.img",
Label: "COS_ACTIVE",
Size: 1063,
FS: "ext2",
Source: &v1.ImageSource{},
MountPoint: "/run/cos/active",
LoopDevice: "",
},
Recovery: v1.Image{
File: "/run/cos/recovery/cOS/recovery.img",
Label: "COS_SYSTEM",
Size: 1063,
FS: "ext2",
Source: &v1.ImageSource{},
MountPoint: "",
LoopDevice: "",
},
Passive: v1.Image{
File: "/run/cos/state/cOS/passive.img",
Label: "COS_PASSIVE",
Size: 1063,
FS: "ext2",
Source: &v1.ImageSource{},
MountPoint: "",
LoopDevice: "",
},
GrubConf: "/etc/cos/grub.cfg",
}
2024-04-24T20:21:10Z DBG Cloud-init paths set to [/system/oem /oem/ /usr/local/cloud-config/ /tmp/kairos-install-config-xxx.yaml223218110]
2024-04-24T20:21:10Z DBG Failed creating cloud-init config path: /tmp/kairos-install-config-xxx.yaml223218110 mkdir /tmp/kairos-install-config-xxx.yaml223218110: not a directory
2024-04-24T20:21:10Z INF Running stage: kairos-install.pre.before
2024-04-24T20:21:10Z INF Processing stage step 'Create partitions'. ( commands: 1, files: 0, ... )
2024-04-24T20:21:11Z INF Command output: Setting name!
partNum is 0
The operation has completed successfully.
2024-04-24T20:21:11Z INF Creating COS_OEM partition
2024-04-24T20:21:12Z INF Creating COS_RECOVERY partition
2024-04-24T20:21:14Z INF Creating COS_STATE partition
2024-04-24T20:21:15Z INF Creating COS_PERSISTENT partition
2024-04-24T20:21:16Z INF Extending last partition to max space
2024-04-24T20:21:17Z ERR Failed growing partition: NOCHANGE: partition 5 is size 113364959. it cannot be grown
failed to run growpart /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0 5: exit status 1
2024-04-24T20:21:17Z ERR NOCHANGE: partition 5 is size 113364959. it cannot be grown
2024-04-24T20:21:17Z ERR failed to run growpart /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0 5: exit status 1
2024-04-24T20:21:17Z INF Processing stage step 'Create partitions'. ( commands: 1, files: 0, ... )
2024-04-24T20:21:18Z INF Command output: Setting name!
partNum is 0
The operation has completed successfully.
2024-04-24T20:21:18Z INF Creating COS_OEM partition
2024-04-24T20:21:19Z INF Creating COS_RECOVERY partition
2024-04-24T20:21:21Z INF Creating COS_STATE partition
2024-04-24T20:21:22Z INF Creating COS_PERSISTENT partition
2024-04-24T20:21:24Z INF Extending last partition to max space
2024-04-24T20:21:24Z ERR Failed growing partition: NOCHANGE: partition 5 is size 113364959. it cannot be grown
failed to run growpart /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0 5: exit status 1
2024-04-24T20:21:24Z ERR NOCHANGE: partition 5 is size 113364959. it cannot be grown
2024-04-24T20:21:24Z ERR failed to run growpart /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0 5: exit status 1
2024-04-24T20:21:24Z INF Done executing stage 'kairos-install.pre.before'
2024-04-24T20:21:24Z INF Running stage: kairos-install.pre
2024-04-24T20:21:24Z INF Done executing stage 'kairos-install.pre'
2024-04-24T20:21:24Z INF Running stage: kairos-install.pre.after
2024-04-24T20:21:24Z INF Done executing stage 'kairos-install.pre.after'
2024-04-24T20:21:24Z INF Running stage: kairos-install.pre.before
2024-04-24T20:21:24Z INF Done executing stage 'kairos-install.pre.before'
2024-04-24T20:21:24Z INF Running stage: kairos-install.pre
2024-04-24T20:21:24Z INF Done executing stage 'kairos-install.pre'
2024-04-24T20:21:24Z INF Running stage: kairos-install.pre.after
2024-04-24T20:21:24Z INF Done executing stage 'kairos-install.pre.after'
2024-04-24T20:21:24Z INF NoFormat is true, skipping format and partitioning
2024-04-24T20:21:24Z INF Checking for active deployment
2024-04-24T20:21:24Z DBG Running cmd: 'udevadm settle'
2024-04-24T20:21:25Z DBG Running cmd: 'udevadm settle'
2024-04-24T20:21:26Z INF No target device specified, using pre-configured device: /dev/sda
2024-04-24T20:21:26Z INF Mounting disk partitions
2024-04-24T20:21:26Z DBG Mounting partition COS_OEM
2024-04-24T20:21:26Z DBG Running cmd: 'udevadm settle'
2024-04-24T20:21:26Z DBG Mounting partition COS_PERSISTENT
2024-04-24T20:21:26Z DBG Running cmd: 'udevadm settle'
2024-04-24T20:21:26Z DBG Mounting partition COS_RECOVERY
2024-04-24T20:21:26Z DBG Running cmd: 'udevadm settle'
2024-04-24T20:21:26Z DBG Mounting partition COS_STATE
2024-04-24T20:21:26Z DBG Running cmd: 'udevadm settle'
2024-04-24T20:21:26Z INF Running before-install hook
2024-04-24T20:21:26Z DBG Cloud-init paths set to [/system/oem /oem/ /usr/local/cloud-config/ /tmp/kairos-install-config-xxx.yaml223218110]
2024-04-24T20:21:26Z DBG Failed creating cloud-init config path: /tmp/kairos-install-config-xxx.yaml223218110 mkdir /tmp/kairos-install-config-xxx.yaml223218110: not a directory
2024-04-24T20:21:26Z INF Running stage: before-install.before
2024-04-24T20:21:26Z INF Done executing stage 'before-install.before'
2024-04-24T20:21:26Z INF Running stage: before-install
2024-04-24T20:21:26Z INF Done executing stage 'before-install'
2024-04-24T20:21:26Z INF Running stage: before-install.after
2024-04-24T20:21:26Z INF Done executing stage 'before-install.after'
2024-04-24T20:21:26Z INF Running stage: before-install.before
2024-04-24T20:21:26Z INF Done executing stage 'before-install.before'
2024-04-24T20:21:26Z INF Running stage: before-install
2024-04-24T20:21:26Z INF Done executing stage 'before-install'
2024-04-24T20:21:26Z INF Running stage: before-install.after
2024-04-24T20:21:26Z INF Done executing stage 'before-install.after'
2024-04-24T20:21:26Z INF Creating file system image /run/cos/state/cOS/active.img with size 1063Mb
2024-04-24T20:21:26Z DBG Running cmd: 'mkfs.ext2 -L COS_ACTIVE /run/cos/state/cOS/active.img'
2024-04-24T20:21:26Z DBG Mounting image COS_ACTIVE
2024-04-24T20:21:26Z DBG Running cmd: 'losetup --show -f /run/cos/state/cOS/active.img'
2024-04-24T20:21:26Z INF Copying /run/rootfsbase source to /run/cos/active
2024-04-24T20:21:26Z INF Starting rsync...
2024-04-24T20:21:26Z DBG Running cmd: 'rsync --progress --partial --human-readable --archive --xattrs --acls --exclude=/mnt --exclude=/proc --exclude=/sys --exclude=/dev --exclude=/tmp --exclude=/host --exclude=/run /run/rootfsbase/ /run/cos/active/'
2024-04-24T20:21:31Z DBG Syncing data...
2024-04-24T20:21:31Z INF Finished syncing
2024-04-24T20:21:31Z INF Finished copying /run/rootfsbase into /run/cos/active
2024-04-24T20:21:31Z INF List of cloud inits to copy: [/tmp/kairos-install-config-xxx.yaml223218110]
2024-04-24T20:21:31Z INF Starting copying cloud config file /tmp/kairos-install-config-xxx.yaml223218110 to /run/cos/oem/90_custom.yaml
2024-04-24T20:21:31Z INF Finished copying cloud config file /tmp/kairos-install-config-xxx.yaml223218110 to /run/cos/oem/90_custom.yaml
2024-04-24T20:21:31Z INF Installing GRUB..
2024-04-24T20:21:31Z DBG Running grub with the following args: [--root-directory=/run/cos/active --boot-directory=/run/cos/state --target=i386-pc /dev/sda]
2024-04-24T20:21:31Z DBG Running cmd: '/usr/sbin/grub2-install --root-directory=/run/cos/active --boot-directory=/run/cos/state --target=i386-pc /dev/sda'
2024-04-24T20:21:32Z INF Grub install to device /dev/sda complete
2024-04-24T20:21:32Z INF Using grub config dir /run/cos/active/etc/cos/grub.cfg
2024-04-24T20:21:32Z INF Copying grub contents from /run/cos/active/etc/cos/grub.cfg to /run/cos/state/grub2/grub.cfg
2024-04-24T20:21:32Z DBG Extra mounts: map[/run/cos/oem:/oem /run/cos/persistent:/usr/local]
2024-04-24T20:21:32Z DBG Mounting /dev to chroot
2024-04-24T20:21:32Z DBG Mounted /dev to /run/cos/active/dev
2024-04-24T20:21:32Z DBG Mounting /dev/pts to chroot
2024-04-24T20:21:32Z DBG Mounted /dev/pts to /run/cos/active/dev/pts
2024-04-24T20:21:32Z DBG Mounting /proc to chroot
2024-04-24T20:21:32Z DBG Mounted /proc to /run/cos/active/proc
2024-04-24T20:21:32Z DBG Mounting /sys to chroot
2024-04-24T20:21:32Z DBG Mounted /sys to /run/cos/active/sys
2024-04-24T20:21:32Z DBG Mounting /run/cos/oem to chroot
2024-04-24T20:21:32Z DBG Mounted /run/cos/oem to /run/cos/active/oem
2024-04-24T20:21:32Z DBG Mounting /run/cos/persistent to chroot
2024-04-24T20:21:32Z DBG Mounted /run/cos/persistent to /run/cos/active/usr/local
2024-04-24T20:21:32Z DBG Running cmd: 'setfiles -c /etc/selinux/targeted/policy/policy.33 -e /dev -e /proc -e /sys -F /etc/selinux/targeted/contexts/files/file_contexts /'
2024-04-24T20:21:36Z DBG SELinux setfiles output:
2024-04-24T20:21:36Z DBG Unmounting /run/cos/active/usr/local from chroot
2024-04-24T20:21:36Z DBG Unmounting /run/cos/active/oem from chroot
2024-04-24T20:21:36Z DBG Unmounting /run/cos/active/sys from chroot
2024-04-24T20:21:36Z DBG Unmounting /run/cos/active/proc from chroot
2024-04-24T20:21:36Z DBG Unmounting /run/cos/active/dev/pts from chroot
2024-04-24T20:21:36Z DBG Unmounting /run/cos/active/dev from chroot
2024-04-24T20:21:36Z DBG Extra mounts: map[/run/cos/oem:/oem /run/cos/persistent:/usr/local]
2024-04-24T20:21:36Z DBG Mounting /dev to chroot
2024-04-24T20:21:36Z DBG Mounted /dev to /run/cos/active/dev
2024-04-24T20:21:36Z DBG Mounting /dev/pts to chroot
2024-04-24T20:21:36Z DBG Mounted /dev/pts to /run/cos/active/dev/pts
2024-04-24T20:21:36Z DBG Mounting /proc to chroot
2024-04-24T20:21:36Z DBG Mounted /proc to /run/cos/active/proc
2024-04-24T20:21:36Z DBG Mounting /sys to chroot
2024-04-24T20:21:36Z DBG Mounted /sys to /run/cos/active/sys
2024-04-24T20:21:36Z DBG Mounting /run/cos/oem to chroot
2024-04-24T20:21:36Z DBG Mounted /run/cos/oem to /run/cos/active/oem
2024-04-24T20:21:36Z DBG Mounting /run/cos/persistent to chroot
2024-04-24T20:21:36Z DBG Mounted /run/cos/persistent to /run/cos/active/usr/local
2024-04-24T20:21:36Z INF Running after-install-chroot hook
2024-04-24T20:21:36Z DBG Cloud-init paths set to [/system/oem /oem/ /usr/local/cloud-config/ /tmp/kairos-install-config-xxx.yaml223218110]
2024-04-24T20:21:36Z INF Running stage: after-install-chroot.before
2024-04-24T20:21:36Z INF Done executing stage 'after-install-chroot.before'
2024-04-24T20:21:36Z INF Running stage: after-install-chroot
2024-04-24T20:21:36Z INF Processing stage step 'Create data directories'. ( commands: 3, files: 0, ... )
2024-04-24T20:21:36Z INF Command output: 2024-04-24 20:21:36 ****** Option: make_directory, Disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0, Directory /var/lib/rancher/rke2 ******
2024-04-24 20:21:36 The /var/lib/rancher/rke2 directory already exists, skipping creation
sh: line 1: [[: 0[0]: syntax error: invalid arithmetic operator (error token is "[0]")
2024-04-24T20:21:36Z INF Command output: 2024-04-24 20:21:36 ****** Option: make_directory, Disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0, Directory /run/k3s ******
2024-04-24 20:21:36 Creating the /run/k3s directory
sh: line 1: [[: 0[0]: syntax error: invalid arithmetic operator (error token is "[0]")
2024-04-24T20:21:36Z INF Command output: 2024-04-24 20:21:36 ****** Option: make_directory, Disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0, Directory /var/lib/rancher/longhorn ******
2024-04-24 20:21:36 Creating the /var/lib/rancher/longhorn directory
sh: line 1: [[: 0[0]: syntax error: invalid arithmetic operator (error token is "[0]")
2024-04-24T20:21:36Z INF Processing stage step 'Format disks'. ( commands: 3, files: 0, ... )
2024-04-24T20:21:36Z INF Processing stage step 'Format disks'. ( commands: 3, files: 0, ... )
2024-04-24T20:21:36Z INF Command output: 2024-04-24 20:21:36 ****** Option: format_disk, Disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0, Directory /var/lib/rancher/rke2 ******
2024-04-24 20:21:36 Status before format.
2024-04-24 20:21:36 Formatting disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0 with label RKE2
mke2fs 1.46.5 (30-Dec-2021)
/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0 is apparently in use by the system; will not make a filesystem here!
sh: line 1: [[: 1[0]: syntax error: invalid arithmetic operator (error token is "[0]")
2024-04-24T20:21:36Z INF Command output: 2024-04-24 20:21:36 ****** Option: format_disk, Disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0, Directory /var/lib/rancher/rke2 ******
2024-04-24 20:21:36 Status before format.
2024-04-24 20:21:36 Formatting disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0 with label RKE2
mke2fs 1.46.5 (30-Dec-2021)
Discarding device blocks: done
Creating filesystem with 26214400 4k blocks and 6553600 inodes
Filesystem UUID: 5bdfccd6-0f43-4bc4-b7eb-1e933beca3e8
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872
Allocating group tables: done
Writing inode tables: done
Creating journal (131072 blocks): done
Writing superblocks and filesystem accounting information: done
2024-04-24 20:21:36 Status after format
/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0: LABEL="RKE2" UUID="5bdfccd6-0f43-4bc4-b7eb-1e933beca3e8" TYPE="ext4"
sh: line 1: [[: 0[0]: syntax error: invalid arithmetic operator (error token is "[0]")
2024-04-24T20:21:37Z INF Command output: 2024-04-24 20:21:36 ****** Option: format_disk, Disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0, Directory /run/k3s ******
2024-04-24 20:21:36 Status before format.
2024-04-24 20:21:37 Formatting disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0 with label K3S
mke2fs 1.46.5 (30-Dec-2021)
/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0 is apparently in use by the system; will not make a filesystem here!
sh: line 1: [[: 1[0]: syntax error: invalid arithmetic operator (error token is "[0]")
2024-04-24T20:21:37Z INF Command output: 2024-04-24 20:21:36 ****** Option: format_disk, Disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0, Directory /run/k3s ******
2024-04-24 20:21:36 Status before format.
2024-04-24 20:21:36 Formatting disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0 with label K3S
mke2fs 1.46.5 (30-Dec-2021)
Discarding device blocks: done
Creating filesystem with 39321600 4k blocks and 9830400 inodes
Filesystem UUID: 8c4e74ae-cf72-4ef5-bcac-8f24a4b9a026
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872
Allocating group tables: done
Writing inode tables: done
Creating journal (262144 blocks): done
Writing superblocks and filesystem accounting information: done
2024-04-24 20:21:37 Status after format
/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0: LABEL="K3S" UUID="8c4e74ae-cf72-4ef5-bcac-8f24a4b9a026" TYPE="ext4"
sh: line 1: [[: 0[0]: syntax error: invalid arithmetic operator (error token is "[0]")
2024-04-24T20:21:37Z INF Command output: 2024-04-24 20:21:37 ****** Option: format_disk, Disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0, Directory /var/lib/rancher/longhorn ******
2024-04-24 20:21:37 Status before format.
2024-04-24 20:21:37 Formatting disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0 with label LONGHORN
mke2fs 1.46.5 (30-Dec-2021)
The file /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0 does not exist and no size was specified.
sh: line 1: [[: 1[0]: syntax error: invalid arithmetic operator (error token is "[0]")
2024-04-24T20:21:37Z INF Command output: 2024-04-24 20:21:37 ****** Option: format_disk, Disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0, Directory /var/lib/rancher/longhorn ******
2024-04-24 20:21:37 Status before format.
2024-04-24 20:21:37 Formatting disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0 with label LONGHORN
mke2fs 1.46.5 (30-Dec-2021)
The file /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0 does not exist and no size was specified.
sh: line 1: [[: 1[0]: syntax error: invalid arithmetic operator (error token is "[0]")
2024-04-24T20:21:37Z INF Processing stage step 'Create data directories'. ( commands: 3, files: 0, ... )
2024-04-24T20:21:37Z INF Command output: 2024-04-24 20:21:37 ****** Option: make_directory, Disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0, Directory /var/lib/rancher/rke2 ******
2024-04-24 20:21:37 The /var/lib/rancher/rke2 directory already exists, skipping creation
sh: line 1: [[: 0[0]: syntax error: invalid arithmetic operator (error token is "[0]")
2024-04-24T20:21:37Z INF Command output: 2024-04-24 20:21:37 ****** Option: make_directory, Disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0, Directory /run/k3s ******
2024-04-24 20:21:37 The /run/k3s directory already exists, skipping creation
sh: line 1: [[: 0[0]: syntax error: invalid arithmetic operator (error token is "[0]")
2024-04-24T20:21:37Z INF Command output: 2024-04-24 20:21:37 ****** Option: make_directory, Disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0, Directory /var/lib/rancher/longhorn ******
2024-04-24 20:21:37 The /var/lib/rancher/longhorn directory already exists, skipping creation
sh: line 1: [[: 0[0]: syntax error: invalid arithmetic operator (error token is "[0]")
2024-04-24T20:21:37Z INF Done executing stage 'after-install-chroot'
2024-04-24T20:21:37Z INF Running stage: after-install-chroot.after
2024-04-24T20:21:37Z INF Done executing stage 'after-install-chroot.after'
2024-04-24T20:21:37Z INF Running stage: after-install-chroot.before
2024-04-24T20:21:37Z INF Done executing stage 'after-install-chroot.before'
2024-04-24T20:21:37Z INF Running stage: after-install-chroot
2024-04-24T20:21:37Z INF Done executing stage 'after-install-chroot'
2024-04-24T20:21:37Z INF Running stage: after-install-chroot.after
2024-04-24T20:21:37Z INF Done executing stage 'after-install-chroot.after'
2024-04-24T20:21:37Z DBG Unmounting /run/cos/active/usr/local from chroot
2024-04-24T20:21:37Z DBG Unmounting /run/cos/active/oem from chroot
2024-04-24T20:21:37Z DBG Unmounting /run/cos/active/sys from chroot
2024-04-24T20:21:37Z DBG Unmounting /run/cos/active/proc from chroot
2024-04-24T20:21:37Z DBG Unmounting /run/cos/active/dev/pts from chroot
2024-04-24T20:21:37Z DBG Unmounting /run/cos/active/dev from chroot
2024-04-24T20:21:37Z DBG Unmounting image COS_ACTIVE
2024-04-24T20:21:38Z DBG Running cmd: 'losetup -d /dev/loop1'
2024-04-24T20:21:38Z INF Copying /run/cos/state/cOS/active.img source to /run/cos/recovery/cOS/recovery.img
2024-04-24T20:21:39Z INF Finished copying /run/cos/state/cOS/active.img into /run/cos/recovery/cOS/recovery.img
2024-04-24T20:21:39Z DBG Running cmd: 'tune2fs -L COS_SYSTEM /run/cos/recovery/cOS/recovery.img'
2024-04-24T20:21:40Z DBG Not unmounting image, doesn't look like mountpoint
2024-04-24T20:21:40Z INF Copying /run/cos/state/cOS/active.img source to /run/cos/state/cOS/passive.img
2024-04-24T20:21:41Z INF Finished copying /run/cos/state/cOS/active.img into /run/cos/state/cOS/passive.img
2024-04-24T20:21:41Z DBG Running cmd: 'tune2fs -L COS_PASSIVE /run/cos/state/cOS/passive.img'
2024-04-24T20:21:41Z DBG Not unmounting image, doesn't look like mountpoint
2024-04-24T20:21:41Z INF Running after-install hook
2024-04-24T20:21:41Z DBG Cloud-init paths set to [/system/oem /oem/ /usr/local/cloud-config/ /tmp/kairos-install-config-xxx.yaml223218110]
2024-04-24T20:21:41Z DBG Failed creating cloud-init config path: /tmp/kairos-install-config-xxx.yaml223218110 mkdir /tmp/kairos-install-config-xxx.yaml223218110: not a directory
2024-04-24T20:21:41Z INF Running stage: after-install.before
2024-04-24T20:21:41Z INF Done executing stage 'after-install.before'
2024-04-24T20:21:41Z INF Running stage: after-install
2024-04-24T20:21:41Z INF Processing stage step 'Mount state'. ( commands: 1, files: 0, ... )
2024-04-24T20:21:41Z INF Command output:
2024-04-24T20:21:41Z INF Processing stage step 'Hook boot assessment grub configuration'. ( commands: 1, files: 0, ... )
2024-04-24T20:21:41Z INF Command output:
2024-04-24T20:21:41Z INF Processing stage step 'Add boot assessment grub configuration'. ( commands: 0, files: 1, ... )
2024-04-24T20:21:41Z INF Processing stage step 'Grub branding'. ( commands: 1, files: 0, ... )
2024-04-24T20:21:41Z INF Command output: '/etc/kairos/branding/grubmenu.cfg' -> '/tmp/mnt/STATE/grubmenu'
2024-04-24T20:21:41Z INF Processing stage step 'umount state'. ( commands: 1, files: 0, ... )
2024-04-24T20:21:41Z INF Command output:
2024-04-24T20:21:41Z INF Done executing stage 'after-install'
2024-04-24T20:21:41Z INF Running stage: after-install.after
2024-04-24T20:21:41Z INF Done executing stage 'after-install.after'
2024-04-24T20:21:41Z INF Running stage: after-install.before
2024-04-24T20:21:41Z INF Done executing stage 'after-install.before'
2024-04-24T20:21:41Z INF Running stage: after-install
2024-04-24T20:21:41Z INF Done executing stage 'after-install'
2024-04-24T20:21:41Z INF Running stage: after-install.after
2024-04-24T20:21:41Z INF Done executing stage 'after-install.after'
2024-04-24T20:21:41Z DBG Not unmounting image, /run/cos/active doesn't look like mountpoint
2024-04-24T20:21:41Z INF Unmounting disk partitions
2024-04-24T20:21:41Z DBG Unmounting partition COS_STATE
2024-04-24T20:21:42Z DBG Unmounting partition COS_RECOVERY
2024-04-24T20:21:42Z DBG Unmounting partition COS_PERSISTENT
2024-04-24T20:21:42Z DBG Unmounting partition COS_OEM
2024-04-24T20:21:42Z DBG Running cmd: 'cat /proc/cmdline'
2024-04-24T20:21:42Z DBG Cloud-init paths set to [/system/oem /oem/ /usr/local/cloud-config/ /tmp/kairos-install-config-xxx.yaml223218110]
2024-04-24T20:21:42Z DBG Failed creating cloud-init config path: /tmp/kairos-install-config-xxx.yaml223218110 mkdir /tmp/kairos-install-config-xxx.yaml223218110: not a directory
2024-04-24T20:21:42Z INF Running stage: kairos-install.after.before
2024-04-24T20:21:42Z INF Done executing stage 'kairos-install.after.before'
2024-04-24T20:21:42Z INF Running stage: kairos-install.after
2024-04-24T20:21:42Z INF Done executing stage 'kairos-install.after'
2024-04-24T20:21:42Z INF Running stage: kairos-install.after.after
2024-04-24T20:21:42Z INF Done executing stage 'kairos-install.after.after'
2024-04-24T20:21:42Z INF Running stage: kairos-install.after.before
2024-04-24T20:21:42Z INF Done executing stage 'kairos-install.after.before'
2024-04-24T20:21:42Z INF Running stage: kairos-install.after
2024-04-24T20:21:42Z INF Done executing stage 'kairos-install.after'
2024-04-24T20:21:42Z INF Running stage: kairos-install.after.after
2024-04-24T20:21:42Z INF Done executing stage 'kairos-install.after.after'
2024-04-24T20:21:42Z DBG Running GrubOptions hook
2024-04-24T20:21:42Z DBG Setting grub options: map[extra_cmdline:rd.immucore.debug]
2024-04-24T20:21:42Z DBG Finish GrubOptions hook
2024-04-24T20:21:42Z DBG Running BundlePostInstall hook
2024-04-24T20:21:42Z DBG Finish BundlePostInstall hook
2024-04-24T20:21:42Z DBG Running CustomMounts hook
2024-04-24T20:21:42Z DBG Finish CustomMounts hook
2024-04-24T20:21:42Z DBG Running CopyLogs hook
2024-04-24T20:21:42Z DBG Copying logs to persistent partition
2024-04-24T20:21:42Z INF Starting rsync...
2024-04-24T20:21:42Z DBG Running cmd: 'rsync --progress --partial --human-readable --archive --xattrs --acls /var/log/ /run/cos/persistent/.state/var-log.bind/'
2024-04-24T20:21:42Z INF Finished syncing
2024-04-24T20:21:42Z DBG Logs copied to persistent partition
2024-04-24T20:21:42Z DBG Finish CopyLogs hook
2024-04-24T20:21:42Z DBG Running Lifecycle hook
2024-04-24T20:21:42Z DBG Finish Lifecycle hook
Multiple directories get scanned when the kairos-agent runs: https://github.com/kairos-io/kairos-agent/blob/2b99bf045becc9c389602a6fcace0284afc4b8ce/pkg/constants/constants.go#L168
The files in those directories are filtered by YAML extension and a valid header, and they are merged into one config. You can see the result of that merge at the beginning of the installation logs.
In that merged struct, I see this:
"after-install-chroot": []interface {}{
collector.Config{
"commands": []interface {}{
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0\" \"/var/lib/rancher/rke2\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0\" \"/run/k3s\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
"make_disk.sh \"make_directory\" \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0\" \"/var/lib/rancher/longhorn\" 2>&1 | tee -a /var/log/sel/make_disk.log && if [[ $PIPESTATUS[0] -ne 0 ]]; then exit 1; fi",
},
"name": "Create data directories",
},
which seems to originate in the AuroraBoot config you attached. This means it's being read.
This block exists too:
"kairos-install.pre.before": []interface {}{
collector.Config{
"commands": []interface {}{
"parted --script --machine -- \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0\" mklabel gpt\n# Legacy bios\nsgdisk --new=1:2048:+1M --change-name=1:'bios' --typecode=1:EF02 \"/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0\"\n",
},
So, although there is a lot happening here and I could easily miss something important, it seems that both configs are merged into the final one.
The installation output even shows partitions being created:
2024-04-24T20:21:10Z INF Processing stage step 'Create partitions'. ( commands: 1, files: 0, ... )
2024-04-24T20:21:11Z INF Command output: Setting name!
partNum is 0
The operation has completed successfully.
(I'm not sure where the "Setting name!" text is coming from)
In the installation logs above there are some errors (not necessarily explaining the original issue). E.g.:
invalid arithmetic operator (error token is "[0]")
maybe you can tell where these are coming from?
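A likely source, going by the merged config quoted above: the commands test `$PIPESTATUS[0]` without braces. In bash, an unbraced `$PIPESTATUS` expands to element 0 of the array and the `[0]` that follows is kept as literal text, so `[[ ... -ne 0 ]]` is handed a string like `0[0]` or `1[0]`, which matches the error token in the logs. The sketch below reproduces that expansion; `failing_step` is a hypothetical stand-in for `make_disk.sh`, not the real script.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for make_disk.sh: prints one line and exits with status 1.
failing_step() { echo "formatting failed"; return 1; }

# Unbraced form, as written in the merged config. bash expands $PIPESTATUS
# (element 0 of the array) and then appends the literal text "[0]".
failing_step 2>&1 | tee /dev/null
unbraced="$PIPESTATUS[0]"

# Braced form: a real array subscript, giving the exit status of the first
# command in the pipeline.
failing_step 2>&1 | tee /dev/null
braced="${PIPESTATUS[0]}"

echo "unbraced=$unbraced braced=$braced"
```

If that is what is happening, the `if` test itself errors out instead of evaluating to true, so the `exit 1` branch would never run and a failed format step would not stop the stage. Writing `${PIPESTATUS[0]}` in the commands should give the intended check.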
(I'm not sure where the "Setting name!" text is coming from)
It's from the sgdisk command:
sgdisk --new=1:2048:+1M --change-name=1:'bios' --typecode=1:EF02 "/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0"
Ohh, I see, the reason I couldn't find the cloud_init.yaml from AuroraBoot is because it doesn't pull and store it. It pulls it during runtime.
From the logs: "config_url": "http://10.105.148.91:8090/_/file?name=other-1",
I found a super easy way of running the manual install without copying the file over.
Since it's already being pulled during any install, just do this:
echo "#cloud-config" > /tmp/config.yaml
kairos-agent manual-install /tmp/config.yaml 2>&1 | tee /tmp/out.log
The temp file is required or it won't run, but it doesn't really need anything in it.
Re: my question above.
The bigger surprise is that cloud_init.yaml being served from the target node's vSphere guestinfo.userdata is being run even when auto: false is set. I assumed it would not, but it is.
Is that expected or a bug?
The auto: false setting only controls whether the installation of Kairos will start automatically or not. It doesn't prevent stages from being run or configs from being parsed. That said, the installation in your case did start, so it looks like a bug to me. Unless I don't understand the auto setting either :D. @kairos-io/maintainers do you see any reason why the installation would start when auto is set to false? Maybe AuroraBoot somehow forces it? Through cmdline maybe? I'm just throwing ideas here.
Umm, no, I can't understand why the install would auto-start if install.auto is set to false.
The cmdline in AuroraBoot is not supposed to start the install either if auto is set to false.
Maybe we've got a bug around that?
Could be. I'll open another ticket for this since this one was about custom partitioning. Here: https://github.com/kairos-io/kairos/issues/2516
@sarg3nt I'm closing this. Let's move the auto: false conversation to the new ticket.
Kairos version:
/kairos/rockylinux:9-core-amd64-generic-v2.4.3
CPU architecture, OS, and Version:
Linux lpul-vault-k8s-agent-2.vault.ad.selinc.com 5.14.0-362.8.1.el9_3.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Nov 8 17:36:32 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Describe the bug
With device: auto, Kairos will install, reboot and install again infinitely, i.e. an install loop.
With no-format: true and the boot disk manually created as per the instructions, the OS installs, reboots and immediately gets stuck at a black screen with a flashing cursor in the top left corner.
With device: /dev/sda, it will work sometimes, but this is not stable as /dev/sda can change from boot to boot due to async device assignments. See https://github.com/kairos-io/kairos/issues/2243
To Reproduce
device: auto in the cloud_init.yaml
Expected behavior
Larger volumes than the boot volume should be supported.
This may require fixing the device bug as mentioned in https://github.com/kairos-io/kairos/issues/2243
Logs
Have not been able to obtain logs due to failure.
Additional context
kairos/rockylinux:9-core-amd64-generic-v2.4.3 and add several customizations and add-ons but I've also tested from the base kairos/rockylinux:9-core-amd64-generic-v2.4.3 image and have gotten the same results.
kairos/rockylinux:9-core-amd64-generic-v3.0.0-alpha3 with the same results.
The cloud_init.yaml file for a custom formatted disk resulting in a blank screen after install.
Even more stripped down YAML file without custom formatted disk resulting in an install loop.
As stated above, if I set device: /dev/sda it will work with some of the nodes and boot lock on others, which is not acceptable.
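Because /dev/sda can point at a different disk from one boot to the next, it can help to check which kernel name a stable /dev/disk/by-path link currently resolves to. The following is only a sketch: `resolve_disk` is a hypothetical helper built on `readlink -f`, and the demo uses a throwaway symlink in a temp directory (named after the by-path link used earlier in this thread) so that it runs without real disks.

```bash
#!/usr/bin/env bash
# Hypothetical helper: follow a persistent device link to its current target.
resolve_disk() { readlink -f "$1"; }

# Demo setup: a fake "sdb" node and a by-path style symlink pointing at it.
demo_dir="$(mktemp -d)"
touch "$demo_dir/sdb"
ln -s "$demo_dir/sdb" "$demo_dir/pci-0000:03:00.0-scsi-0:0:0:0"

# Resolve the link and keep only the final path component (the kernel name).
resolved="$(resolve_disk "$demo_dir/pci-0000:03:00.0-scsi-0:0:0:0")"
name="${resolved##*/}"
echo "$name"

rm -rf "$demo_dir"
```

On a real node the argument would be the actual link, e.g. /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0, and the result shows whether that disk is currently sda or sdb.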