IBT-FMI / gebuilder

Gentoo System and Image Builder
GNU General Public License v3.0

numpy /dev/shm bug #5

Closed TheChymera closed 6 years ago

TheChymera commented 6 years ago

It seems gebuilder is triggering Gentoo bug 496328, at least when initializing SAMRI.gentoo. Any idea what can be done about this?

If there's nothing obvious one could do, at least getting access to the chrooted environment after the build fails would help speed up a trial-and-error approach.

Doeme commented 6 years ago

Hm, actually, /dev/shm should be mounted... https://github.com/IBT-FMI/gebuilder/blob/master/gebuilder/scripts/common/mount_basic.sh

TheChymera commented 6 years ago

Is it? I can't find the shm string anywhere in gebuilder.

Doeme commented 6 years ago

https://github.com/IBT-FMI/gebuilder/blob/b94200df3bb9825b253e906a414e3d970ebe439b/gebuilder/scripts/common/mount_basic.sh#L10

/dev/shm should be mounted in the host system
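
One way to check this from the host, as a minimal sketch: look for the chroot's dev/shm among the mount points in a mount table. The function name shm_mounted_in and its optional second argument are made up for this illustration and are not part of gebuilder; /proc/self/mounts is the kernel's list of mounts visible to the current process.

```shell
# Hypothetical helper (not part of gebuilder): succeed if <root>/dev/shm
# is listed as a mount point in the given mount table.
# $1 = chroot root, $2 = mount table (default: /proc/self/mounts).
# Limitation: paths containing spaces are escaped in the table and not handled here.
shm_mounted_in() (
    root=${1%/}
    table=${2:-/proc/self/mounts}
    # Field 2 of each line is the mount point.
    awk -v mp="$root/dev/shm" '$2 == mp { found = 1 } END { exit (found ? 0 : 1) }' "$table"
)
```

For example, shm_mounted_in /mnt/mychroot && echo "shm is mounted" || echo "shm is missing" reports whether the chroot sees its own /dev/shm mount.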

TheChymera commented 6 years ago

this is very strange - it keeps failing, but as soon as I go into the chroot afterwards with:

root #mount --rbind /dev /mnt/mychroot/dev
root #mount --make-rslave /mnt/mychroot/dev
root #mount -t proc /proc /mnt/mychroot/proc
root #mount --rbind /sys /mnt/mychroot/sys
root #mount --make-rslave /mnt/mychroot/sys
root #mount --rbind /tmp /mnt/mychroot/tmp
root #chroot /mnt/mychroot /bin/bash
root #source /etc/profile
root #env-update
root #export PS1="(chroot) $PS1"

it works. Any idea what could be wrong?

Doeme commented 6 years ago

Hm, that's strange indeed. I'd suggest making a new directory in scripts/ for a "shell" command, linking scripts/common/mount_basic.sh into that directory, and running echo "bash" >> scripts/shell/default/xy-shell.sh.chroot. This should open a shell inside the image, so you can then test the environment.

TheChymera commented 6 years ago

I don't think I understand exactly what you mean. What would this be used for? Just a meta-command for chrooting in after the initialize command fails?

TheChymera commented 6 years ago

This seems to be part of a larger set of issues that occur when the SAMRI system is built automatically. Another example is that the sys-devel/flex build seems to have some issues, which then crash the x11-libs/motif build. These issues, however, also go away if I simply chroot and then rebuild flex and motif explicitly.

My best guess is that dependency resolution for the SAMRI .gentoo specification (but not for, e.g., StereotaXYZ) leads to an unfortunate build order. @Doeme thoughts?

Doeme commented 6 years ago

I'm not sure whether these two problems have the same cause. The /dev/shm failure looks like the build environment is somehow broken, whereas the sys-devel/flex one sounds more like a gcc-6.0-update problem.

TheChymera commented 6 years ago

Having read through the original bug report, it seems you are right: the two issues are likely unrelated. Still, I haven't quite understood your suggestion for what we should do about the /dev/shm bug.

Doeme commented 6 years ago

https://github.com/IBT-FMI/gebuilder/commit/9e802e58ff59d157bbdd34d0df694dacf17812ae

TheChymera commented 6 years ago

@Doeme this doesn't seem to drop me into a shell when it fails; I just get the usual:

  *
 * Call stack:
 *     ebuild.sh, line 124:  Called src_configure
 *   environment, line 3620:  Called die
 * The specific snippet of code:
 *           die "Broken sem_open function (bug 496328)";
 *
 * If you need support, post the output of `emerge --info '=dev-lang/python-2.7.14-r1::gentoo'`,
 * the complete build log and the output of `emerge -pqv '=dev-lang/python-2.7.14-r1::gentoo'`.
 * The complete build log is located at '/var/tmp/portage/dev-lang/python-2.7.14-r1/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/dev-lang/python-2.7.14-r1/temp/environment'.
 * Working directory: '/var/tmp/portage/dev-lang/python-2.7.14-r1/work/x86_64-pc-linux-gnu'
 * S: '/var/tmp/portage/dev-lang/python-2.7.14-r1/work/Python-2.7.14'

 * IMPORTANT: 10 news items need reading for repository 'gentoo'.
 * Use eselect news read to view new items.

Exiting
Cleaning up
executing umount "/usr/share/gebuilder/roots/06efa8964d90b8c7e3dc8718d7e8aff5148601488a48569d84c9e8fac2dc9cc3/root/tmp"
executing umount "/usr/share/gebuilder/roots/06efa8964d90b8c7e3dc8718d7e8aff5148601488a48569d84c9e8fac2dc9cc3/root/var/tmp/portage"
executing umount "/usr/share/gebuilder/roots/06efa8964d90b8c7e3dc8718d7e8aff5148601488a48569d84c9e8fac2dc9cc3/root/sys"
executing umount "/usr/share/gebuilder/roots/06efa8964d90b8c7e3dc8718d7e8aff5148601488a48569d84c9e8fac2dc9cc3/root/proc"
executing umount "/usr/share/gebuilder/roots/06efa8964d90b8c7e3dc8718d7e8aff5148601488a48569d84c9e8fac2dc9cc3/root/dev/pts"
executing umount "/usr/share/gebuilder/roots/06efa8964d90b8c7e3dc8718d7e8aff5148601488a48569d84c9e8fac2dc9cc3/root/dev"
Cleaning up after error

TheChymera commented 6 years ago

Anyway, dropping to a shell may not be needed.

edbe242ecf7604f093e80b1bbc916a2e6d79bc27 apparently solves this issue.

@Doeme is there any reason you opted for --bind instead of --rbind? And in light of that, should any of the other occurrences be updated?

Doeme commented 6 years ago

It used to be --bind when I first installed Gentoo, and since I never noticed any difference in behaviour (except that --rbind makes unmounting more difficult), I saw no reason to switch.

TheChymera commented 6 years ago

Ah, ok, that would also explain why I always had issues with unmounting the manual chroot. It works with lazy umount, but I'm unsure whether that's a robust solution. Are you aware of any issues that could cause in our context?

Doeme commented 6 years ago

You mean except that unmounting will fail? I guess not. Does the updated bind-mounting ritual come with an updated unmount-ritual too? If so, we should implement that, since leaving things mounted is imho not an option.

TheChymera commented 6 years ago

The Gentoo handbook recommends:

umount -f /mnt/mychroot/dev > /dev/null

but that won't work.

I used:

umount -f -l /mnt/mychroot/dev > /dev/null

It didn't blow up in my face, but how can I check whether it worked? My problem was that I couldn't delete the roots directory of the respective .gentoo directory before unmounting, but with the above command I can :/

Doeme commented 6 years ago

I don't like that approach. Since the recursively bind-mounted submounts also show up in /etc/mtab (or in the output of mount when run without arguments), maybe we can parse that output and unmount those directories before unmounting /roots/<ID>/root/dev/?
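
The parsing idea could look roughly like the sketch below. The helper name mounts_below and the optional mount-table argument are assumptions made for illustration, and /proc/self/mounts stands in for /etc/mtab (on many systems the latter is a symlink to the former). Sorting in reverse order puts deeper mount points first, so each submount is listed before its parent.

```shell
# Hypothetical helper (not part of gebuilder): print every mount point at or
# below "$1", deepest first, one per line.
# $1 = directory, $2 = mount table (default: /proc/self/mounts).
# Limitation: paths containing spaces are escaped in the table and not handled here.
mounts_below() (
    target=${1%/}
    table=${2:-/proc/self/mounts}
    # Field 2 is the mount point; keep the target itself and anything under it.
    awk -v t="$target" '$2 == t || index($2, t "/") == 1 { print $2 }' "$table" |
        LC_ALL=C sort -r
)

# Possible use (needs root, so it is left commented out here):
# mounts_below /mnt/mychroot/dev | while IFS= read -r mp; do umount "$mp"; done
```

Matching on the target plus a trailing slash keeps sibling directories such as /mnt/mychroot/devel out of the list.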

TheChymera commented 6 years ago

Can you write a script for that?

Would a for dir in /mnt/mychroot/dev/* construction work? I assume we're only talking about the umount here, and the mount command stays mount --rbind /dev /mnt/mychroot/dev?

Doeme commented 6 years ago

I just saw that umount supports recursive unmounting with umount -R. That makes it easy.
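
A minimal sketch of how the recursive unmount could be combined with a check that nothing is left behind. The wrapper name unmount_tree, the RUN dry-run variable, and the mount-table argument are hypothetical; only umount -R itself comes from util-linux. If the bind mount was not made a slave (--make-rslave, as in the manual sequence earlier in the thread), a recursive unmount may propagate to the host's mounts on systems that use shared propagation, so that is worth testing first.

```shell
# Hypothetical wrapper (not part of gebuilder): recursively unmount "$1",
# then fail if the mount table still lists anything at or below it.
# $1 = directory, $2 = mount table (default: /proc/self/mounts).
# Set RUN=echo for a dry run that only prints the umount command.
unmount_tree() (
    target=${1%/}
    table=${2:-/proc/self/mounts}
    ${RUN:-} umount -R "$target" || exit 1
    # A remaining mount point at or below the target means cleanup is incomplete.
    awk -v t="$target" '$2 == t || index($2, t "/") == 1 { left = 1 } END { exit (left ? 1 : 0) }' "$table"
)
```

Running RUN=echo unmount_tree /mnt/mychroot/dev first shows the command without touching any mounts.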