Closed sanmuny closed 10 months ago
Kiwi log: sles_kiwi_log.txt
rdsosreport.txt: rdsosreport.txt
The boot process on s390 and SLE is based on a user space grub solution which is triggered by the bootloader setting grub2_s390x_emu. From a kiwi perspective this code hasn't changed, but I could imagine that the SLE side of things might have changed without us noticing.
However, with the provided information I won't be able to debug the complete chain and also don't have time to dig into it right now. My suggestion is that you get in contact with Raymund Will and/or Mike Friesenegger from SUSE, who are in charge of the userspace grub approach. I'm happy to join a conversation or debugging session but won't have time to set up a dev environment to debug s390 boot issues from scratch, sorry.
I will use the information provided to verify the reported issue.
Thanks much
I have duplicated what @sanmuny has reported.
15 SP3 image built using:
15 SP4 image built using:
NOTE: I am using the kiwi packages included in SLES whereas @sanmuny has a newer kiwi version.
I am using https://github.com/mfriesenegger/docs/tree/master/suse-SLE15-Enterprise-JeOS-s390x for the image description. The devicepersistency="by-path" must be changed to "by-uuid" for the kiwi versions in 15 SP3 and 15 SP4. The repository information which points to the SUSE Customer Center is added using the script comment in JeOS.kiwi. The kiwi-ng command is building the "kvm" profile.
I have determined that /boot/zipl/initrd-5.*-default in the 15 SP4 is missing several kernel modules which includes virtio_blk and virtio_scsi. As a workaround, I can boot the VM into rescue mode running grub2-install and then the VM boots properly.
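A minimal sketch of how the missing modules could be confirmed, assuming the initrd file listing comes from dracut's lsinitrd tool. The helper name missing_drivers is hypothetical and not part of dracut or kiwi; it only compares a file listing against the module names given.

```shell
#!/bin/bash
# Sketch: report which kernel modules are absent from an initrd listing.
# Reads a file listing (for example from lsinitrd) on stdin and prints
# every requested module name that has no matching .ko file in it.
# missing_drivers is a hypothetical helper, not part of dracut or kiwi.
missing_drivers() {
    local listing mod
    listing=$(cat)
    for mod in "$@"; do
        # Module files may be compressed (.ko.xz, .ko.zst), so match the prefix only.
        if ! printf '%s\n' "$listing" | grep -q "/${mod}\.ko"; then
            printf '%s\n' "$mod"
        fi
    done
}

# Example usage inside the mounted image or a rescue system:
#   lsinitrd /boot/zipl/initrd | missing_drivers virtio_blk virtio_scsi sd_mod
```

The match is done on the module file name, so a module whose file name uses a dash where the module name uses an underscore (or the reverse) has to be passed in the spelling used on disk.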
Here is another workaround that came to me as I was writing the previous comment. This worked for me but may not be recommended by @schaefi.
add_drivers+=" loop virtio_blk cdrom dm-mod sd_mod sr_mod virtio_scsi "
Hi Mike, thanks for hunting it down to the point of the issue. Adding the custom dracut.conf.d/xxx.conf file is an acceptable solution and maintainable as part of the image description in a clean way. However, I was wondering why these kernel modules were not included in the initrd at the time dracut was called. When dracut is called it usually includes all modules that are in use at the time of the call. As we need loop, sd_mod and others to even build the image, I wonder why these are missing? In addition, dracut has a set of default modules which are always included. To my knowledge basic scsi and loop support belongs to these modules.
So are you aware of changes on the dracut end with regards to SLE15 on s390x ? iirc this used to work before
Looking at dracut /usr/lib/dracut/dracut-init.sh there is a method named is_qemu_virtualized. Only in case this method returns a true value the inclusion of modules like virtio_blk and others is performed. Maybe at this point we are hitting an issue on s390, just guessing though
@sanmuny Could you apply the changes provided by @mfriesenegger to your image description and rebuild the image ? If this fixes the issue we know at least why this is happening. Thanks much
thanks @schaefi @mfriesenegger , I have updated the config.sh like this:
echo 'add_drivers+=" loop virtio_blk cdrom dm-mod sd_mod sr_mod virtio_scsi "' >> /etc/dracut.conf.d/15-kiwi-test.conf
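A hedged variant of the line above for config.sh: it overwrites a dedicated drop-in file instead of appending, so repeated runs do not stack duplicate entries. The function name write_add_drivers is an assumption made for illustration; the target path is passed in so the snippet can be tried outside of an image root.

```shell
#!/bin/bash
# Sketch: write a dracut drop-in that forces extra drivers into the initrd.
# write_add_drivers is a hypothetical helper; the first argument is the
# drop-in file, all remaining arguments are driver names.
write_add_drivers() {
    local conf="$1"
    shift
    # dracut expects the value to be space separated and space padded.
    printf 'add_drivers+=" %s "\n' "$*" > "$conf"
}

# Example usage in config.sh:
#   write_add_drivers /etc/dracut.conf.d/15-kiwi-test.conf \
#       loop virtio_blk cdrom dm-mod sd_mod sr_mod virtio_scsi
```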
The previous "/dev/disk/by-uuid/xxx does not exist" issue is gone, but the image still fails to boot with the following error message:
[ 1.340215][ T1] List of all partitions:
[ 1.340264][ T1] No filesystem could mount root, tried:
[ 1.340265][ T1]
[ 1.340270][ T1] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 1.340283][ T1] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.14.21-150400.24.92-default #1 SLE15-SP4 291b58a09545321927bc06773b511ee5b38b28d5
[ 1.340294][ T1] Hardware name: IBM 3907 ZR1 Z06 (KVM/Linux)
[ 1.340300][ T1] Call Trace:
[ 1.340306][ T1] [<000000001010d582>] dump_stack_lvl+0x62/0x88
[ 1.340333][ T1] [<000000001010ac3a>] panic+0x122/0x310
[ 1.340335][ T1] [<00000000108fefba>] mount_block_root+0x382/0x3a0
[ 1.340352][ T1] [<00000000108ff1ce>] prepare_namespace+0x16e/0x1a8
[ 1.340355][ T1] [<00000000108fe9c6>] kernel_init_freeable+0x2e6/0x300
[ 1.340358][ T1] [<00000000101107ae>] kernel_init+0x2e/0x168
[ 1.340361][ T1] [<000000000f847144>] __ret_from_fork+0x3c/0x58
[ 1.340371][ T1] [<000000001011f91a>] ret_from_fork+0xa/0x30
Kernel boot logs: log.txt
Thanks your change looks good
[ 0.176846][ T0] setup: Linux is running under KVM in 64-bit mode
ok
[ 1.340264][ T1] No filesystem could mount root, tried: [ 1.340265][ T1]
To me this looks like the kernel does not see a single storage device.
What I'm missing in the kernel log is any information about an initrd. Normally there is data like
Trying to unpack rootfs image as initramfs...
This is not present in your kernel log which brings me to the assumption that there is no initrd loaded at all which also would explain why no devices are present
Can you share the details of the bootloader setup and if you can confirm that there is an initrd loaded ?
Thanks
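The check described above can be sketched as a grep over a saved kernel log. This assumes the log was captured to a file (for example the attached log.txt) and that the kernel prints the exact message quoted above; initrd_was_unpacked is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: tell whether a saved kernel boot log shows an initramfs being unpacked.
# initrd_was_unpacked is a hypothetical helper; $1 is the path to the log file.
initrd_was_unpacked() {
    grep -q 'Trying to unpack rootfs image as initramfs' "$1"
}

# Example usage:
#   if initrd_was_unpacked log.txt; then
#       echo "initrd was handed to the kernel"
#   else
#       echo "no initrd seen in the log"
#   fi
```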
Can you share the details of the bootloader setup and if you can confirm that there is an initrd loaded ?
@sanmuny are you willing to share your kiwi definition by compressing the directory that includes the kiwi xml as well as the other files and directories so we can review it? You can send the compressed file directly to me if you do not want to share it publicly.
Here are additional steps you can use to look inside of the qcow2 file.
modprobe nbd max_part=8
qemu-nbd --connect=/dev/nbd0 /path/to/qcow2/file.qcow2
parted /dev/nbd0 print
mount /dev/nbd0p1 /mnt
ls /mnt
ls -l /mnt/boot
ls -l /mnt/boot/zipl
umount /mnt
qemu-nbd --disconnect /dev/nbd0
modprobe -r nbd
Thank you @mfriesenegger @schaefi . Here are the output of above commands and kiwi description files.
Kiwi description files: config.sh.txt minimal.kiwi.txt
$ parted /dev/nbd0 print
Model: Unknown (unknown)
Disk /dev/nbd0: 107GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number Start End Size Type File system Flags
1 1049kB 107GB 107GB primary xfs
$ ls -l /mnt/boot/zipl/
total 22388
-rw-r--r-- 1 root root 0 Oct 9 09:44 active_devices.txt
-rw------- 1 root root 135680 Oct 19 09:27 bootmap
-rw-r--r-- 1 root root 1965 Oct 19 09:27 config
lrwxrwxrwx 1 root root 34 Oct 19 09:24 image -> image-5.14.21-150400.24.92-default
-rw-r--r-- 1 root root 8221248 Oct 5 14:31 image-5.14.21-150400.24.92-default
lrwxrwxrwx 1 root root 35 Oct 19 09:27 initrd -> initrd-5.14.21-150400.24.92-default
-rw------- 1 root root 14553111 Oct 19 09:27 initrd-5.14.21-150400.24.92-default
$ cat /mnt/boot/zipl/config
## This file was written by 'grub2-install/grub2-zipl-setup'
## filling '/etc/default/zipl2grub.conf.in' as template
## with values from '/etc/default/grub'.
## In-place modifications will eventually go missing!
[defaultboot]
defaultmenu = menu
[grub2]
target = /boot/zipl
ramdisk = /boot/zipl/initrd,0x2000000
image = /boot/zipl/image
parameters = "root=UUID=54b4237d-fb96-456c-aa2b-a424c31396d8 hvc_iucv=8 TERM=dumb initgrub quiet splash=silent plymouth.enable=0 "
[grub2-mem1G]
target = /boot/zipl
image = /boot/zipl/image
ramdisk = /boot/zipl/initrd,0x2000000
parameters = "root=UUID=54b4237d-fb96-456c-aa2b-a424c31396d8 hvc_iucv=8 TERM=dumb initgrub quiet splash=silent plymouth.enable=0 mem=1G "
[skip-grub2]
target = /boot/zipl
ramdisk = /boot/zipl/initrd,0x2000000
image = /boot/zipl/image
parameters = "root=UUID=54b4237d-fb96-456c-aa2b-a424c31396d8 hvc_iucv=8 TERM=dumb "
#@
#@[grub2-previous]
#@ target = /boot/zipl
#@ image = /boot/zipl/image.prev
#@ ramdisk = /boot/zipl/initrd.prev,0x2000000
#@ parameters = "root=UUID=54b4237d-fb96-456c-aa2b-a424c31396d8 hvc_iucv=8 TERM=dumb initgrub quiet splash=silent plymouth.enable=0 "
#@
#@[grub2-mem1G-previous]
#@ target = /boot/zipl
#@ image = /boot/zipl/image.prev
#@ ramdisk = /boot/zipl/initrd.prev,0x2000000
#@ parameters = "root=UUID=54b4237d-fb96-456c-aa2b-a424c31396d8 hvc_iucv=8 TERM=dumb initgrub quiet splash=silent plymouth.enable=0 mem=1G "
#@
#@[skip-grub2-previous]
#@ target = /boot/zipl
#@ image = /boot/zipl/image.prev
#@ ramdisk = /boot/zipl/initrd.prev,0x2000000
#@ parameters = "root=UUID=54b4237d-fb96-456c-aa2b-a424c31396d8 hvc_iucv=8 TERM=dumb "
:menu
target = /boot/zipl
timeout = 60
default = 1
prompt = 0
secure = 0
1 = grub2
2 = skip-grub2
3 = grub2-mem1G
#@ 4 = grub2-previous
#@ 5 = skip-grub2-previous
#@ 6 = grub2-mem1G-previous
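For comparing such a config against the files that actually exist in /boot/zipl, here is a small sketch that extracts the ramdisk paths from the active sections. It assumes the "ramdisk = path,address" layout shown in the output above; zipl_ramdisks is a hypothetical helper name, and lines commented out with "#@" are skipped because they do not start with the ramdisk keyword.

```shell
#!/bin/bash
# Sketch: list the unique ramdisk paths referenced by active zipl config sections.
# zipl_ramdisks is a hypothetical helper; $1 is the path to the zipl config file.
zipl_ramdisks() {
    # Keep only the path part before the optional ",address" suffix.
    sed -n 's/^[[:space:]]*ramdisk[[:space:]]*=[[:space:]]*\([^,]*\).*/\1/p' "$1" | sort -u
}

# Example usage against a mounted image:
#   zipl_ramdisks /mnt/boot/zipl/config
```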
@schaefi
Looking at dracut /usr/lib/dracut/dracut-init.sh there is a method named is_qemu_virtualized. Only in case this method returns a true value the inclusion of modules like virtio_blk and others is performed. Maybe at this point we are hitting an issue on s390, just guessing though
My kiwi builder is running as a KVM VM on s390x. The output of the commands in dracut-init.sh are:
kiwibldr-kvm:~ # systemd-detect-virt --vm
kvm
kiwibldr-kvm:~ # echo $?
0
I can give you access to the VM if you want to dig deeper into why the kernel modules are not being added.
@sanmuny
I am sorry that it has taken a long time to return to this issue.
I successfully built and booted a qcow image using the config.sh and minimal.kiwi files that you provided.
The only change that I made was to the repository xml tags. Your minimal.kiwi points to a larger number of repositories than what I used. Can you replace your repository tags with the following, rebuild and see if it boots?
<repository type="rpm-md">
<source path="file:///mnt/Module-Basesystem/"/>
</repository>
<repository type="rpm-md">
<source path="file:///mnt/Module-Desktop-Applications/"/>
</repository>
<repository type="rpm-md">
<source path="file:///mnt/Module-Development-Tools/"/>
</repository>
<repository type="rpm-md">
<source path="file:///mnt/Module-Public-Cloud/"/>
</repository>
<repository type="rpm-md">
<source path="file:///mnt/Module-Server-Applications/"/>
</repository>
<repository type="rpm-md">
<source path="file:///mnt/Product-SLES/"/>
</repository>
@sanmuny one more question.
Did you download SLES 15 SP4 ISO and mount it to /mnt before building the image?
Thank you @mfriesenegger .
I tried your repository but the same error happened. Could you share your config.sh and minimal.kiwi? BTW, I set mapper to kpartx in my /etc/kiwi.yml. I'm not sure if it is the cause.
mapper:
  # Specify tool to use for creating partition maps
  # Possible values are: kpartx and partx
  - part_mapper: kpartx
[ 1.333670][ T1] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.14.21-150400.24.100-default #1 SLE15-SP4 c3506b2009b29a4b0705edcff6d85f566160a9ea
[ 1.333678][ T1] Hardware name: IBM 3907 ZR1 Z06 (KVM/Linux)
[ 1.333683][ T1] Call Trace:
[ 1.333691][ T1] [<0000000028b1dbe2>] dump_stack_lvl+0x62/0x88
[ 1.333714][ T1] [<0000000028b1b29a>] panic+0x122/0x310
[ 1.333716][ T1] [<000000002930efba>] mount_block_root+0x382/0x3a0
[ 1.333737][ T1] [<000000002930f1ce>] prepare_namespace+0x16e/0x1a8
[ 1.333740][ T1] [<000000002930e9c6>] kernel_init_freeable+0x2e6/0x300
[ 1.333743][ T1] [<0000000028b20e0e>] kernel_init+0x2e/0x168
[ 1.333747][ T1] [<0000000028257144>] __ret_from_fork+0x3c/0x58
[ 1.333750][ T1] [<0000000028b2ff7a>] ret_from_fork+0xa/0x30
Did you download SLES 15 SP4 ISO and mount it to /mnt before building the image?
Yes, I downloaded the ISO and mounted to /mnt directory.
# ls /mnt
ARCHIVES.gz COPYRIGHT gpg-pubkey-39db7c82-5f68629b.asc Module-Basesystem Module-Live-Patching Module-Transactional-Server README .treeinfo
boot COPYRIGHT.de gpg-pubkey-50a3dd1c-50f35137.asc Module-Containers Module-Public-Cloud Module-Web-Scripting repodata
ChangeLog docu INDEX.gz Module-Desktop-Applications Module-Python2 Product-HA susehmc.ins
CHECKSUMS glump ls-lR.gz Module-Development-Tools Module-SAP-Applications Product-SLES suse.ins
CHECKSUMS.asc gpg-pubkey-307e3d54-5aaa90a5.asc media.1 Module-Legacy Module-Server-Applications Product-SUSE-Manager-Server-4.2 suse_ptf_key.asc
Hello @sanmuny,
I downloaded SLE-15-SP4-Full-s390x-GM-Media1.iso and mounted it to /mnt so I could use the config.sh and minimal.kiwi files that you provided without modifications. The image booted successfully on my system.
Your comment about - part_mapper: kpartx reminded me that you are running a newer version of kiwi. I am running the version included in SLES 15 SP4 which is python3-kiwi-9.24.43-150100.3.62.1.s390x. An idea is to downgrade your kiwi packages and verify that /etc/kiwi.yml does not have the part_mapper option.
@mfriesenegger I downgraded kiwi-ng to v9.24.43. The image still fails to boot with the same error, regardless of whether the mapper is set to kpartx.
# kiwi-ng -v
KIWI (next generation) version 9.24.43
@sanmuny Thank you for downgrading the kiwi version.
Based on the tests in my environment, you should be able to build and boot an image. I believe this issue is not with kiwi but something in your environment. Please describe your build environment as well as the environment where the image is being booted.
Here is my environment to compare.
Hello @mfriesenegger , here's my build environment:
BTW, did you run the following commands to update packages in config.sh? I'm not sure if this is the root cause.
uuidgen > /etc/.buildid
#Update packages
FTP3_USER=FTP3_USER_PLACEHOLDER
FTP3_PASSWORD=FTP3_PASSWORD_PLACEHOLDER
wget --user $FTP3_USER --password $FTP3_PASSWORD -O ibm-zypper.sh ftp://ftp3.linux.ibm.com/suse/ibm-zypper.sh
chmod 755 ./ibm-zypper.sh
yes | ./ibm-zypper.sh --ftp3user=$FTP3_USER --ftp3pass=$FTP3_PASSWORD -n refresh
yes | ./ibm-zypper.sh --ftp3user=$FTP3_USER --ftp3pass=$FTP3_PASSWORD -n up
yes | ./ibm-zypper.sh --ftp3user=$FTP3_USER --ftp3pass=$FTP3_PASSWORD -n clean
rm -rf ./ibm-zypper.sh
rm -rf ./ibm-zypper.log
echo 'Package updated'
Hello @sanmuny
here's my build environment:
- KVM Host in LPAR: Ubuntu 22.04
- Kiwi Builder VM: SLES 15 SP4
- Test image: booting a different VM on the same KVM Host as the builder
Your environment is similar to mine. The only difference is the KVM Host.
BTW, did you run the following commands to update packages in config.sh? I'm not sure if this is the root cause.
uuidgen > /etc/.buildid
#Update packages
FTP3_USER=FTP3_USER_PLACEHOLDER
FTP3_PASSWORD=FTP3_PASSWORD_PLACEHOLDER
wget --user $FTP3_USER --password $FTP3_PASSWORD -O ibm-zypper.sh ftp://ftp3.linux.ibm.com/suse/ibm-zypper.sh
chmod 755 ./ibm-zypper.sh
yes | ./ibm-zypper.sh --ftp3user=$FTP3_USER --ftp3pass=$FTP3_PASSWORD -n refresh
yes | ./ibm-zypper.sh --ftp3user=$FTP3_USER --ftp3pass=$FTP3_PASSWORD -n up
yes | ./ibm-zypper.sh --ftp3user=$FTP3_USER --ftp3pass=$FTP3_PASSWORD -n clean
rm -rf ./ibm-zypper.sh
rm -rf ./ibm-zypper.log
echo 'Package updated'
I have never run any of the commands you listed including ibm-zypper.sh because it is IBM internal. It is hard to say if this is the root cause because I do not know what ibm-zypper.sh does.
I propose the next step is to schedule a desktop share and web call so you can show your build environment and demonstrate a build. I will chat with you via Slack to schedule.
Hi @mfriesenegger , I think I found the cause of this issue. I noticed that the initrd file was not generated successfully after upgrading RPM packages. I'll try to verify this by removing the upgrade process from kiwi build.
$ file initrd
initrd: broken symbolic link to initrd-5.14.21-150400.24.100-default
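The same check can be scripted so that a build fails early when the boot links are dangling. This is only a sketch: check_zipl_links is a hypothetical helper, and it assumes the image and initrd symlink names seen in the /boot/zipl listing earlier in this thread.

```shell
#!/bin/bash
# Sketch: verify that the image and initrd links in a zipl directory resolve.
# check_zipl_links is a hypothetical helper; $1 is the zipl directory.
# Returns 0 when both names resolve to existing files, 1 otherwise.
check_zipl_links() {
    local dir="$1" name rc=0
    for name in image initrd; do
        # [ -e ] follows symlinks, so a dangling link counts as missing.
        if [ ! -e "$dir/$name" ]; then
            echo "missing or dangling: $dir/$name" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example usage at the end of config.sh or against a mounted image:
#   check_zipl_links /boot/zipl || exit 1
```

Running it after any step that installs a new kernel would catch the case where the initrd symlink points at a kernel version for which no initrd was generated.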
Hi @mfriesenegger @schaefi, I verified that it was indeed caused by the update process in config.sh. Thank you.
@sanmuny Great, thanks for letting us know and for the verification steps. I'm closing this one now
Problem description
The qcow2 image of SLES 15 SP4 fails to boot and enters emergency mode on the s390x platform. The same kiwi description and version work for SLES 15 SP3.
Expected behaviour
The newly built image should boot successfully.
Steps to reproduce the behaviour
KIWI description:
OS and Software information