Closed TheChymera closed 6 years ago
So to bracket the problem:
boot:
prompt in the screenshot https://user-images.githubusercontent.com/950524/36382125-2757f1e4-1588-11e8-85cd-6cbfab3af1c4.png seems to imply that it can. If the bootloader can load additional stages, it probably means that 4.a and 4.b do not apply, since it can load them from the btrfs partition.
The further stages cannot: a) find the relevant partition, b) read the btrfs filesystem, c) find the configuration file
This is all ext4 now, so I guess we can exclude b) ?
Probably. Sometimes, the symbolic link /boot/boot -> . exists. Is this true for the image?
Apparently not:
bs1 ~ # ls /mnt/debug/boot/ -lah
total 12M
drwxr-xr-x 3 root root 4.0K Feb 20 01:30 .
drwxr-xr-x 21 root root 4.0K Feb 20 01:30 ..
-rw-r--r-- 1 root root 0 Feb 6 21:51 .keep
-rw-r--r-- 1 root root 2.3M Feb 20 01:30 System.map-4.9.76-gentoo-r1
-rw-r--r-- 1 root root 71K Feb 20 01:30 config-4.9.76-gentoo-r1
lrwxrwxrwx 1 root root 23 Feb 20 01:30 initramfs -> initramfs-4.9.76-gentoo
-rw------- 1 root root 4.2M Feb 20 01:30 initramfs-4.9.76-gentoo
drwxr-xr-x 2 root root 4.0K Feb 20 01:30 syslinux
lrwxrwxrwx 1 root root 24 Feb 20 01:30 vmlinuz -> vmlinuz-4.9.76-gentoo-r1
-rw-r--r-- 1 root root 4.8M Feb 20 01:30 vmlinuz-4.9.76-gentoo-r1
oh wow. Create it and try again.
IIRC, this symbolic link is part of the stage3 tarball. I have no idea how it got lost.
Can you check afterwards whether the link exists in roots/<id>/root/boot/?
ln -s . /boot/boot
should do the trick
uhm, I'm in the root home - you mean:
ln -s /mnt/debug/boot/ /mnt/debug/boot/boot
?
Nope. ln is quite stupid and doesn't alter the path you give it first but writes it directly into the inode. i.e.
cd /path/that/is/completely/irrelevant
ln -s . /path1/link
realpath /path1/link -> /path1/
mv /path1/link /path2/link
realpath /path2/link -> /path2/
If you want ln -s to behave as one might expect, ln -s -r is the way to go, i.e.
cd /path/that/is/relevant
ln -s -r . /path1/link
realpath /path1/link -> /path/that/is/relevant
But you can, of course, cd into /boot if it makes you feel more comfortable
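The behaviour described above can be reproduced in a scratch directory. This is only a minimal sketch, assuming GNU coreutils (ln -r and realpath are not available everywhere); the directory names path1, path2 and elsewhere are made up for illustration.

```shell
# Scratch area; all names below are illustrative only.
tmp="$(mktemp -d)"
mkdir "$tmp/path1" "$tmp/path2" "$tmp/elsewhere"
cd "$tmp/elsewhere"

# Plain ln -s stores the target string verbatim, so "." always means
# "the directory the link itself lives in", not the current directory.
ln -s . "$tmp/path1/link"
readlink "$tmp/path1/link"      # prints the stored target string
realpath "$tmp/path1/link"      # resolves to the path1 directory

# Moving the link changes what it resolves to, because "." is
# interpreted relative to the link's new location.
mv "$tmp/path1/link" "$tmp/path2/link"
realpath "$tmp/path2/link"      # resolves to the path2 directory

# With -r, ln rewrites the target relative to the link's directory,
# so "." here refers to the current working directory (elsewhere).
ln -s -r . "$tmp/path1/rel"
readlink "$tmp/path1/rel"       # a relative path pointing at elsewhere
```

This is why cd-ing into the boot directory and running ln -s . boot is the least surprising variant: the working directory no longer matters for what gets stored.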
Dear god :-/
so then:
cd /mnt/debug/boot
ln -s . /boot/boot
?
Rather
cd /mnt/debug/boot
ln -s . boot
PSA: I'm going to bed now. Godspeed to you.
ok, cool stuff, this worked.
What's a bit puzzling is that the current openstack system which I am running (as well as systems based on images preceding your project) lacks this symlink. Would you recommend we stop digging and just integrate the symlink creation into the build process? If so, where?
If anywhere, then in 40-generate_bootchain.sh.
But I'd rather find out why the symlink disappears; I suspect the culprit is a call to cp or rsync that doesn't sync symlinks.
As I said, could you check whether the symlink exists in roots/<id>/root/boot?
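To illustrate the suspicion about cp above, here is a small sketch showing how copy flags decide whether a symlink survives. It says nothing about which call in gebuilder (if any) is responsible; the file names are invented, and a link to a file is used instead of boot -> . to keep the demo free of link cycles.

```shell
# Illustrative tree: a regular file plus a symlink pointing at it.
tmp="$(mktemp -d)"
mkdir "$tmp/src"
echo kernel > "$tmp/src/vmlinuz-4.9.76"
ln -s vmlinuz-4.9.76 "$tmp/src/vmlinuz"

# -a (archive) copies symlinks as symlinks.
cp -a "$tmp/src" "$tmp/kept"
# -L follows symlinks, so the link is replaced by a copy of its target.
cp -rL "$tmp/src" "$tmp/followed"

for tree in kept followed; do
    if [ -L "$tmp/$tree/vmlinuz" ]; then
        echo "$tree: vmlinuz is still a symlink"
    else
        echo "$tree: vmlinuz became a regular file"
    fi
done
```

As far as I know, rsync behaves similarly in spirit: with -r alone it skips symlinks, and it needs -l (or -a, which implies it) to copy them.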
looks empty:
bs1 /usr/share/gebuilder/roots/stemgentoo # ls root/boot/ -lah
total 8.0K
drwxr-xr-x 2 root root 4.0K Feb 6 21:51 .
drwxr-xr-x 20 root root 4.0K Feb 19 22:14 ..
-rw-r--r-- 1 root root 0 Feb 6 21:51 .keep
Also, you say that the symlink disappears, but I cannot see where it's supposed to be created?
bs1 /usr/share/gebuilder # ag boot config/ scripts/ utils/
scripts/openstack_image/default/35-setup_openstack.sh.chroot
14:rc-update add dhcpcd boot
34:pushd /boot/
scripts/openstack_image/stemgentoo/35-setup_openstack.sh.chroot
14:rc-update add dhcpcd boot
34:pushd /boot/
scripts/openstack_image/default/40-generate_bootchain.sh
8:mkdir ${OPENSTACK_IMG_MNT}/boot/syslinux
9:cp /usr/share/syslinux/{menu.c32,memdisk,libcom32.c32,libutil.c32} "${OPENSTACK_IMG_MNT}/boot/syslinux/"
12:extlinux --device="${OPENSTACK_IMG_LODEV}p1" --install "${OPENSTACK_IMG_MNT}/boot/syslinux/"
14:debug "Writing bootloader, booting from UUID $OPENSTACK_IMG_UUID"
15:cat <<-EOF > ${OPENSTACK_IMG_MNT}/boot/syslinux/syslinux.cfg
18: LINUX /boot/vmlinuz root=UUID=$OPENSTACK_IMG_UUID rootfstype=ext4 console=ttyS0,115200n8
19: INITRD /boot/initramfs
27:INITRAMFS="${OPENSTACK_IMG_MNT}/boot/initramfs-$KERNELVERSION"
30:ln -s "initramfs-$KERNELVERSION" "${OPENSTACK_IMG_MNT}/boot/initramfs"
scripts/openstack_image/stemgentoo/40-generate_bootchain.sh
8:mkdir ${OPENSTACK_IMG_MNT}/boot/syslinux
9:cp /usr/share/syslinux/{menu.c32,memdisk,libcom32.c32,libutil.c32} "${OPENSTACK_IMG_MNT}/boot/syslinux/"
12:extlinux --device="${OPENSTACK_IMG_LODEV}p1" --install "${OPENSTACK_IMG_MNT}/boot/syslinux/"
14:debug "Writing bootloader, booting from UUID $OPENSTACK_IMG_UUID"
15:cat <<-EOF > ${OPENSTACK_IMG_MNT}/boot/syslinux/syslinux.cfg
18: LINUX /boot/vmlinuz root=UUID=$OPENSTACK_IMG_UUID rootfstype=$OPENSTACK_FILESYSTEM console=ttyS0,115200n8
19: INITRD /boot/initramfs
27:INITRAMFS="${OPENSTACK_IMG_MNT}/boot/initramfs-$KERNELVERSION"
30:ln -s "initramfs-$KERNELVERSION" "${OPENSTACK_IMG_MNT}/boot/initramfs"
utils/openstack_kernel_nodocker.config
418:CONFIG_NO_BOOTMEM=y
450:# CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS is not set
495:# CONFIG_HAVE_BOOTMEM_INFO_NODE is not set
555:# CONFIG_BOOTPARAM_HOTPLUG_CPU0 is not set
1019:# CONFIG_ISCSI_BOOT_SYSFS is not set
2293:CONFIG_X86_VERBOSE_BOOTUP=y
utils/openstack_kernel.config
141:# CONFIG_RCU_EXPEDITE_BOOT is not set
416:CONFIG_NO_BOOTMEM=y
447:# CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS is not set
492:# CONFIG_HAVE_BOOTMEM_INFO_NODE is not set
553:# CONFIG_BOOTPARAM_HOTPLUG_CPU0 is not set
1262:# CONFIG_ISCSI_BOOT_SYSFS is not set
2521:CONFIG_X86_VERBOSE_BOOTUP=y
I checked, and no other system of mine has this symlink.
@Doeme highly interesting developments: the existence of boot/boot is not what solves the issue and its absence is not what causes it. It seems there's an issue with running the script:
bs1 /usr/share/gebuilder/roots # cat stemgentoo/hooks/openstack_image/post/60-upload_image.sh
#!/bin/bash
OS_USER="???"
OS_PW="???"
OS_TENANT="???"
OS_IMGNAME="stemgentoo"
function gl(){
glance --os-username "$OS_USER" \
--os-password "$OS_PW" \
--os-tenant-name "$OS_TENANT" \
--os-auth-url https://cloud.s3it.uzh.ch:5000/v2.0 \
--os-image-api-version 2 "$@"
}
if [ -f "${ROOT}/../registry/openstack_image" ]
then
UUID="$(sed -n 's/|[[:blank:]]\+id[[:blank:]]\+|[[:blank:]]\+\([a-z0-9\-]\+\)[[:blank:]]\+|/\1/p' "${ROOT}/../registry/openstack_image")"
debug "Deleting old image with uuid $UUID"
gl image-delete "$UUID"
else
ensure_dir "${ROOT}/../registry/"
fi
debug "Uploading new image with name $OS_IMGNAME"
gl image-create --disk-format raw --container-format bare --name "$OS_IMGNAME" --file "$OPENSTACK_IMAGE" >"${ROOT}/../registry/openstack_image"
bs1 ~ # cat 60-upload_image.sh
#!/bin/bash
OS_USER="???"
OS_PW="???"
OS_TENANT="???"
OS_IMGNAME="stemgentoo"
function gl(){
glance --os-username "$OS_USER" \
--os-password "$OS_PW" \
--os-tenant-name "$OS_TENANT" \
--os-auth-url https://cloud.s3it.uzh.ch:5000/v2.0 \
--os-image-api-version 2 "$@"
}
if [ -f "${ROOT}/../registry/openstack_image" ]
then
UUID="$(sed -n 's/|[[:blank:]]\+id[[:blank:]]\+|[[:blank:]]\+\([a-z0-9\-]\+\)[[:blank:]]\+|/\1/p' "${ROOT}/../registry/openstack_image")"
#debug "Deleting old image with uuid $UUID"
gl image-delete "$UUID"
else
echo "l2l"
#ensure_dir "${ROOT}/../registry/"
fi
#debug "Uploading new image with name $OS_IMGNAME"
echo "Myecho: $OPENSTACK_IMAGE"
gl image-create --disk-format raw --container-format bare --name "$OS_IMGNAME" --file "$OPENSTACK_IMAGE" >"${ROOT}/../registry/openstack_image"
bs1 ~ # OPENSTACK_IMAGE=/usr/share/gebuilder/roots/stemgentoo/root/../openstack_images//image_20180220 ROOT=/usr/share/gebuilder/roots/stemgentoo/root ./60-upload_image.sh
bs1 /usr/share/gebuilder/roots # glance --os-username "???" --os-password "???" --os-tenant-name "???" --os-auth-url https://cloud.s3it.uzh.ch:5000/v2.0 --os-image-api-version 2 image-create --disk-format raw --container-format bare --name "sg_test2" --file /usr/share/gebuilder/roots/stemgentoo/root/../openstack_images//image_20180220 > stemgentoo/registry/openstack_image
I'm continuing to debug this, but input would be appreciated, since trial and error is quite slow.
Since the script did not change at all (except for some echos), I guess it's related to the command-line variables. Maybe echo the relevant variables at the beginning of the script and look into the logs.
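One way to do that echoing without repeating yourself is a tiny helper. This is a sketch only: dump_vars is a made-up name, not part of gebuilder, and it relies on bash indirect expansion.

```shell
# Hypothetical helper: print each named variable, or UNSET if it is not set.
dump_vars() {
    local name
    for name in "$@"; do
        # ${!name-UNSET} is bash indirect expansion with a fallback value.
        printf '%s=%s\n' "$name" "${!name-UNSET}"
    done
}

# At the top of 60-upload_image.sh one might then call:
dump_vars ROOT OPENSTACK_IMAGE OS_IMGNAME
```

Comparing this output between the hook run and the manual run should show whether the two invocations really see the same ROOT and OPENSTACK_IMAGE.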
It seems something happens to the image in the build process after it is uploaded, something which makes it work again. @Doeme any ideas? Possibly the unmounts? 0.o
[...]
Executing openstack_image/stemgentoo/50-restore_root.sh
Ensuring /usr/share/gebuilder/roots/stemgentoo/root/../logs/openstack_image/ is a directory
executing scripts /usr/share/gebuilder/roots/stemgentoo/root/../hooks/openstack_image/post/60-upload_image.sh
Executing /usr/share/gebuilder/roots/stemgentoo/root/../hooks/openstack_image/post/60-upload_image.sh
Ensuring /usr/share/gebuilder/roots/stemgentoo/root/../logs/openstack_image/ is a directory
No image with an ID of 'b2fed0b6-f71b-4ede-a0de-01855599904d' exists.
Image is: /usr/share/gebuilder/roots/stemgentoo/root/../openstack_images//image_20180220
MD5SUM is : e7f18cc6483ced1368459e3aa95f6532 /usr/share/gebuilder/roots/stemgentoo/root/../openstack_images//image_20180220
Finished succesfully
Cleaning up
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/tmp"
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/var/tmp/portage"
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/sys"
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/proc"
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/dev/pts"
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/dev"
executing umount /usr/share/gebuilder/roots/stemgentoo/root/../mnt
executing losetup -d /dev/loop2
roots/stemgentoo/hooks/openstack_image/chain
bs1 ~ # md5sum /usr/share/gebuilder/roots/stemgentoo/root/../openstack_images//image_20180220
7273a170e5c6f79fce4d3faddda706be /usr/share/gebuilder/roots/stemgentoo/root/../openstack_images//image_20180220
bs1 ~ # cat /usr/share/gebuilder/roots/stemgentoo/hooks/openstack_image/post/60-upload_image.sh
#!/bin/bash
OS_USER="???"
OS_PW="???"
OS_TENANT="???"
OS_IMGNAME="stemgentoo"
function gl(){
glance --os-username "$OS_USER" \
--os-password "$OS_PW" \
--os-tenant-name "$OS_TENANT" \
--os-auth-url https://cloud.s3it.uzh.ch:5000/v2.0 \
--os-image-api-version 2 "$@"
}
if [ -f "${ROOT}/../registry/openstack_image" ]
then
UUID="$(sed -n 's/|[[:blank:]]\+id[[:blank:]]\+|[[:blank:]]\+\([a-z0-9\-]\+\)[[:blank:]]\+|/\1/p' "${ROOT}/../registry/openstack_image")"
#debug "Deleting old image with uuid $UUID"
gl image-delete "$UUID" || true
else
echo "lala"
#ensure_dir "${ROOT}/../registry/"
fi
#debug "Uploading new image with name $OS_IMGNAME"
echo "Image is: ${OPENSTACK_IMAGE}"
MD5SUM=$(md5sum $OPENSTACK_IMAGE)
echo "MD5SUM is : ${MD5SUM}"
gl image-create --disk-format raw --container-format bare --name "$OS_IMGNAME" --file "$OPENSTACK_IMAGE" >"${ROOT}/../registry/openstack_image"
@Doeme ok, so it was the unmounting, or possibly executing losetup -d /dev/loop2. What fixed it is calling cleanup before the openstack image upload:
bs1 ~ # cat /usr/share/gebuilder/roots/stemgentoo/hooks/openstack_image/post/60-upload_image.sh
#!/bin/bash
OS_USER="???"
OS_PW="???"
OS_TENANT="???"
OS_IMGNAME="stemgentoo"
function gl(){
glance --os-username "$OS_USER" \
--os-password "$OS_PW" \
--os-tenant-name "$OS_TENANT" \
--os-auth-url https://cloud.s3it.uzh.ch:5000/v2.0 \
--os-image-api-version 2 "$@"
}
if [ -f "${ROOT}/../registry/openstack_image" ]
then
UUID="$(sed -n 's/|[[:blank:]]\+id[[:blank:]]\+|[[:blank:]]\+\([a-z0-9\-]\+\)[[:blank:]]\+|/\1/p' "${ROOT}/../registry/openstack_image")"
#debug "Deleting old image with uuid $UUID"
gl image-delete "$UUID" || true
else
echo "lala"
#ensure_dir "${ROOT}/../registry/"
fi
#debug "Uploading new image with name $OS_IMGNAME"
cleanup
echo "Image is: ${OPENSTACK_IMAGE}"
MD5SUM=$(md5sum $OPENSTACK_IMAGE)
echo "MD5SUM is : ${MD5SUM}"
gl image-create --disk-format raw --container-format bare --name "$OS_IMGNAME" --file "$OPENSTACK_IMAGE" >"${ROOT}/../registry/openstack_image"
The checksum inconsistency also - quite predictably - disappears:
[...]
Executing openstack_image/stemgentoo/50-restore_root.sh
Ensuring /usr/share/gebuilder/roots/stemgentoo/root/../logs/openstack_image/ is a directory
executing scripts /usr/share/gebuilder/roots/stemgentoo/root/../hooks/openstack_image/post/60-upload_image.sh
Executing /usr/share/gebuilder/roots/stemgentoo/root/../hooks/openstack_image/post/60-upload_image.sh
Ensuring /usr/share/gebuilder/roots/stemgentoo/root/../logs/openstack_image/ is a directory
No image with an ID of '3e33a5fa-8633-4356-8219-9b40ddd3489e' exists.
Cleaning up
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/tmp"
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/var/tmp/portage"
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/sys"
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/proc"
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/dev/pts"
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/dev"
executing umount /usr/share/gebuilder/roots/stemgentoo/root/../mnt
executing losetup -d /dev/loop2
Image is: /usr/share/gebuilder/roots/stemgentoo/root/../openstack_images//image_20180220
MD5SUM is : 9c6703fbcf59504e9e8baf6d6593115a /usr/share/gebuilder/roots/stemgentoo/root/../openstack_images//image_20180220
Finished succesfully
Cleaning up
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/tmp"
umount: /usr/share/gebuilder/roots/stemgentoo/root/../mnt/tmp: not found
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/var/tmp/portage"
umount: /usr/share/gebuilder/roots/stemgentoo/root/../mnt/var/tmp/portage: not found
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/sys"
umount: /usr/share/gebuilder/roots/stemgentoo/root/../mnt/sys: not found
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/proc"
umount: /usr/share/gebuilder/roots/stemgentoo/root/../mnt/proc: not found
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/dev/pts"
umount: /usr/share/gebuilder/roots/stemgentoo/root/../mnt/dev/pts: not found
executing umount -R "/usr/share/gebuilder/roots/stemgentoo/root/../mnt/dev"
umount: /usr/share/gebuilder/roots/stemgentoo/root/../mnt/dev: not found
executing umount /usr/share/gebuilder/roots/stemgentoo/root/../mnt
umount: /usr/share/gebuilder/roots/stemgentoo/root/../mnt: not mounted.
executing losetup -d /dev/loop2
losetup: /dev/loop2: detach failed: No such device or address
roots/stemgentoo/hooks/openstack_image/chain
bs1 ~ # md5sum /usr/share/gebuilder/roots/stemgentoo/root/../openstack_images//image_20180220
9c6703fbcf59504e9e8baf6d6593115a /usr/share/gebuilder/roots/stemgentoo/root/../openstack_images//image_20180220
This hack leads to some inelegant error messages when the builtin cleanup is called, so maybe there's a better way to solve this. Particularly because clearly, at some point in the past, this was not an issue.
I am thinking maybe as part of the Docker kernel update, something subtly changed how loopbacks are managed - though this theory places the needle in a rather deep haystack. Maybe it's something a lot more banal?
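The checksum mismatch in the logs above is consistent with the image file still changing after the upload step read it (for instance data being written back on unmount, though that cause is an assumption). A guard along the following lines could make such a change visible. This is a sketch with invented helper names (image_md5, image_unchanged), demonstrated on a scratch file rather than a real image.

```shell
# Print only the hash column of md5sum's output.
image_md5() {
    md5sum "$1" | cut -d ' ' -f 1
}

# Hypothetical guard: succeed only if the file still has the recorded hash.
image_unchanged() {
    [ "$(image_md5 "$1")" = "$2" ]
}

# Demo on a scratch file standing in for the image.
img="$(mktemp)"
echo "before cleanup" > "$img"
at_upload="$(image_md5 "$img")"

# Stands in for data that reaches the file only after the upload step.
echo "late write" >> "$img"

if image_unchanged "$img" "$at_upload"; then
    echo "image unchanged since upload"
else
    echo "image changed after upload" >&2
fi
```

In the real hook, the recorded hash would be taken right before gl image-create and checked again once the build's cleanup has finished.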
Ah, I see. This seems to be a shortcoming of the cleanup routine. A hacky fix would be to introduce a new command openstack_image_upload that gets chained after openstack_image. But I think a much more elegant method would be stack-saving for the cleanup stack, i.e. a call to cleanup_stack_save() would mark a position in the stack, and a call to cleanup_stack_restore() would execute all cleanup tasks added to the stack after cleanup_stack_save() was called. Hence, we could call cleanup_stack_save() before https://github.com/IBT-FMI/gebuilder/blob/master/gebuilder/scripts/openstack_image/default/15-mount_image.sh#L9 and cleanup_stack_restore() before uploading the image.
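The save/restore idea can be sketched in bash as follows. This is not gebuilder's actual implementation: the array names and cleanup_push are assumptions made for the example, and the cleanup tasks are stored as plain command strings.

```shell
CLEANUP_STACK=()   # pending cleanup commands, oldest first
CLEANUP_MARKS=()   # stack depths recorded by cleanup_stack_save

# Hypothetical: register a command to run at cleanup time.
cleanup_push() {
    CLEANUP_STACK+=("$1")
}

# Remember the current depth of the cleanup stack.
cleanup_stack_save() {
    CLEANUP_MARKS+=("${#CLEANUP_STACK[@]}")
}

# Run, newest first, every task pushed since the last save.
# Without a saved mark, run everything that is left.
cleanup_stack_restore() {
    local n=${#CLEANUP_MARKS[@]} mark=0 i
    if (( n > 0 )); then
        mark=${CLEANUP_MARKS[$((n - 1))]}
        unset "CLEANUP_MARKS[$((n - 1))]"
    fi
    for (( i = ${#CLEANUP_STACK[@]} - 1; i >= mark; i-- )); do
        eval "${CLEANUP_STACK[$i]}" || echo "cleanup task failed: ${CLEANUP_STACK[$i]}" >&2
        unset "CLEANUP_STACK[$i]"
    done
    return 0
}

# Demo with echo standing in for the real umount/losetup calls.
cleanup_push 'echo "umount chroot bind mounts"'
cleanup_stack_save
cleanup_push 'echo "losetup -d (detach loop device)"'
cleanup_push 'echo "umount image"'
cleanup_stack_restore      # runs only the two image-related tasks
echo "upload image here"
cleanup_stack_restore      # no mark left: runs whatever remains
```

Because each task is removed from the stack once it has run, the final cleanup no longer retries the unmounts, which would avoid the "not found" and "not mounted" noise seen with the manual cleanup call.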
Whoops, this was actually intended for a non-master branch, but it seems I forgot to switch before committing. So master contains bigger, untested changes to openstack_image
seems to work.
Continuing here, since this seems to not be related to btrfs; also, all examples here are using ext4 unless otherwise indicated.
@Doeme continued from here:
I pasted the entire logs earlier, but, upon looking again, I see nothing suspicious in that (or any other sections):
I also tried sourcing the OPENSTACK_IMG_UUID here (no idea if this is right, but UUID was in any case always empty):
And still to no avail.