Open changyp6 opened 3 years ago
After five days of investigating this issue, I have finally found the root cause.
Line 816 of /usr/libexec/mock/mock reads:
816: unshare_namespace(config_opts)
This function uses util.CLONE_NEWNS to unshare the mount namespace.
After commenting out line 816, mock works on my NFS-rootfs Linux system.
I'm not sure whether this is a bug or whether unsharing all mount points is intended.
If /var/lib/mock is mounted from another device, then after unshare_namespace is called, data is no longer written to the device mounted at /var/lib/mock; it is written to the underlying /var/lib/mock folder instead.
I hope this helps others who hit the same issue.
Mock developers, please help investigate this issue and suggest a better solution.
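One way to check the behaviour described above is to ask, from inside the affected process, whether the expected directories are still mount points. The following is a minimal sketch, not part of mock; the helper name mount_status is hypothetical, and the path list is only an assumption based on the paths mentioned in this report.

```python
import os


def mount_status(paths):
    """Map each path to True if it is currently a mount point.

    os.path.ismount() returns False for paths that do not exist,
    so a missing directory and an unmounted one look the same here.
    """
    return {path: os.path.ismount(path) for path in paths}


if __name__ == "__main__":
    # Paths taken from this report; adjust for the system under test.
    candidates = ["/proc", "/sys", "/var/lib/mock", "/var/cache/mock"]
    for path, mounted in mount_status(candidates).items():
        print(f"{path}: {'mounted' if mounted else 'NOT a mount point'}")
```

Running this once on the host and once after the unshare call should show whether /var/lib/mock and /proc are still mounted in the new namespace.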
Mock intentionally uses a separate namespace for mountpoints. This is expected.
I fail to see what is happening with the /var/lib/mock folders in mock, though. Doing unshare - when the mount points are already mounted on the host - should have no effect on the unshared namespace, as the mountpoint list should be copied.
ERROR: Command failed: $ /bin/mount -n -t tmpfs -o rprivate tmpfs /var/lib/mock/dist-myos-bootstrap/root/proc
Why is this command failing for you? It is obvious it fails, but not why.
My builder uses an NFS-mounted rootfs.
Mock does all of its work after unshare_namespace(), but once the original mount namespace has been unshared, the hw_info plugin stops working because /proc is also unshared in the new namespace. lscpu, df, and free all fail there because /proc is empty in the new namespace.
Since /proc is also unshared, the original mountpoints are evidently not copied into the new namespace.
I don't know whether this issue is related to NFS.
My understanding of this reply is:
'Mock should still see the originally mounted mountpoints after "unshare_namespace", because /proc is not affected by unshare_namespace.'
That is: unshare_namespace --> read mount points from /proc
I think this is how mock behaves on a hard-drive rootfs, and there it does work.
In my case, the rootfs is on NFS:
'After unshare_namespace, /proc is also unshared in the separate namespace, and none of the originally mounted mountpoints can be obtained.'
That is: unshare_namespace --> /proc is unshared too --> cannot get mountpoints or read system information
My system runs kernel 5.14.32 with glibc 2.34 and Python 3.10.0 rc1.
My understanding of this reply is that 'Mock should get the originally mounted mountpoints after "unshare_namespace", that's because /proc won't be affected by unshare_namespace'
Yes, that's what I meant. The effect of unsharing should be that we don't propagate mount events from the new namespace up to the parent namespace, but the list of mounts shouldn't be affected IMO, per `mount_namespaces(7)`:
* If the namespace is created using unshare(2), the mount point list of the new namespace is a copy of the mount point list in the caller's previous mount namespace.
I'm not sure what is going on, and it isn't trivial for me to set up a rootfs on NFS to test this scenario. What propagation options are set on your /proc (etc.), and what happens when you unshare manually (with /usr/bin/unshare)?
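To answer the propagation question, the optional fields of /proc/self/mountinfo can be inspected. The sketch below is only an illustration: the function name propagation_of is hypothetical, and it relies on the documented mountinfo layout, where the optional fields (such as shared:N or master:N) sit between the mount options and the "-" separator, and a mount with no such field is private.

```python
def propagation_of(mountinfo_text, mount_points):
    """Return {mount_point: [propagation tags]} for the given mount points.

    mountinfo_text is the content of /proc/self/mountinfo. A mount with
    no optional fields is reported as ["private"].
    """
    result = {}
    for line in mountinfo_text.splitlines():
        fields = line.split()
        try:
            # Optional fields start at index 6 and end at the "-" separator.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # not a well-formed mountinfo line
        mount_point = fields[4]
        if mount_point not in mount_points:
            continue
        tags = [field.split(":")[0] for field in fields[6:sep]]
        result[mount_point] = tags or ["private"]
    return result


if __name__ == "__main__":
    with open("/proc/self/mountinfo") as handle:
        print(propagation_of(handle.read(), {"/proc", "/sys", "/var/lib/mock"}))
```

Comparing the output on the host with the output inside a shell started by /usr/bin/unshare would show whether the mount list is copied as mount_namespaces(7) describes.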
I tried modifying /usr/libexec/mock/mock in several ways:
* Remove CLONE_NEWUTS from extended_unshare_flags so that only the mount namespace (CLONE_NEWNS) is unshared: everything works.
* Unshare NEWNS and NEWUTS separately (call unshare(CLONE_NEWNS) and unshare(CLONE_NEWUTS)): no matter which comes first, NEWUTS always fails; however, /proc still exists and mock works.
* Unshare CLONE_NEWNS first, which reports OK, then unshare CLONE_NEWNS | CLONE_NEWUTS: the second unshare fails, /proc no longer exists, and mock fails to work.
* Unshare CLONE_NEWNS, then CLONE_NEWNS | CLONE_NEWUTS, then CLONE_NEWNS again: /proc is still missing.
After this testing, I found that once unshare(CLONE_NEWNS | CLONE_NEWUTS) has failed, the mount points no longer exist in the resulting environment, and calling unshare(CLONE_NEWNS) afterwards doesn't help; it won't bring /proc back.
Then I ran another test on my system with the following command:
$ sudo unshare -u
unshare: unshare failed: Invalid argument
My system seems to require a specific argument for the UTS unshare operation.
Those are my findings; I hope they help you locate the issue.
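An "Invalid argument" error from unshare can also mean that the running kernel lacks the requested namespace type. A hedged way to check this is to list /proc/self/ns, which per namespaces(7) has one entry per namespace type that the process can be moved into. The helper names below are hypothetical, and the default list of required types is an assumption drawn from the namespaces discussed in this thread.

```python
import os


def supported_namespaces(ns_dir="/proc/self/ns"):
    """Return the set of namespace entries exposed under ns_dir."""
    try:
        return set(os.listdir(ns_dir))
    except OSError:
        # /proc may be missing or empty, as in the failure described here.
        return set()


def missing_namespaces(required=("mnt", "uts", "ipc"), ns_dir="/proc/self/ns"):
    """Return the required namespace types that are not exposed."""
    available = supported_namespaces(ns_dir)
    return [name for name in required if name not in available]


if __name__ == "__main__":
    missing = missing_namespaces()
    print("missing namespace types:", ", ".join(missing) if missing else "none")
```

This should be run on the host, before any unshare call, since an empty /proc makes every type look missing.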
My theory is that when unshare(CLONE_NEWNS | CLONE_NEWUTS) is called with both flags together and fails, the process is nevertheless already in a new environment, but that environment has neither the mount namespace nor the UTS namespace of the parent system. It seems that when unshare fails, everything fails together, yet you are still given a new environment.
In this environment, calling unshare(CLONE_NEWNS) again won't help, because the mount namespace doesn't exist there; that's why /proc is empty.
In the example program in the unshare() man page, the program simply exits if unshare() fails. The man page does not explain how to handle an unshare failure.
I have finally found the root cause of this issue!
If unshare(CLONE_NEWUTS | CLONE_NEWNS) fails, then even if a later unshare(CLONE_NEWNS) succeeds, the new mount namespace is empty.
The question is why unshare(CLONE_NEWUTS | CLONE_NEWNS) fails.
The answer is that the IPC namespace is NOT enabled in the kernel.
To solve this mock issue, just make sure that the IPC namespace and the mount namespace are both enabled in the kernel.
However, the logic in the mock code is still wrong: mock should exit immediately if unshare(CLONE_NEWUTS | CLONE_NEWNS) fails. Any successful unshare operation after a failed unshare does not work for mock.
I also suggest that mock's documentation record the requirement for the IPC namespace and mount namespace features in the kernel.
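The "exit immediately on failure" behaviour suggested above could look roughly like the sketch below. This is not mock's actual code: libc_unshare and unshare_or_exit are hypothetical names, the flag values are the ones defined in the Linux <sched.h>, and the call goes through ctypes on the assumption that the interpreter has no built-in unshare wrapper (as with the Python 3.10 mentioned in this thread).

```python
import ctypes
import errno
import os
import sys

CLONE_NEWNS = 0x00020000   # new mount namespace
CLONE_NEWUTS = 0x04000000  # new UTS namespace


def libc_unshare(flags):
    """Call unshare(2) through libc and raise OSError on failure."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(ctypes.c_int(flags)) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def unshare_or_exit(flags, unshare=libc_unshare):
    """Unshare the given namespaces, or exit without retrying."""
    try:
        unshare(flags)
    except OSError as exc:
        hint = ""
        if exc.errno == errno.EINVAL:
            hint = " (the kernel may lack a requested namespace type)"
        print(f"unshare(0x{flags:x}) failed: {exc.strerror}{hint}", file=sys.stderr)
        sys.exit(1)
```

A caller would use unshare_or_exit(CLONE_NEWNS | CLONE_NEWUTS) in place of a try-then-fall-back sequence; the unshare parameter exists only so the failure path can be exercised without root privileges.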
Thank you for the info, @changyp6 - I'll keep this open.
However, the logic in mock code is still wrong, mock should exit immediately if unshare(CLONE_NEWUTS | CLONE_NEWNS) fails.
We really need to take a look at the unshare logic.
Short description of the problem
I have set up a koji-based aarch64 build server that uses an NFS mount as its rootfs. I also mounted /var/lib/mock and /var/cache/mock on a hard disk.
When I run mock, mock hw_plugin reports
And finally failed with error messages:
After seeing these errors, I searched for the "results" folder and found that the "dist-myos-bootstrap" folder is NOT in the mounted "/var/lib/mock" folder; instead, the "results" folder is in the /var/lib/mock folder of the NFS rootfs.
This explains why lscpu failed with the error "/sys/devices/system/cpu/possible: No such file or directory": /sys has no contents in the NFS rootfs itself; it is only mounted in the running system.
I'm confused: why does mock run commands on top of the "physical" root device instead of the logical root device?
My logical rootfs has /sys and /proc mounted, and /var/lib/mock and /var/cache/mock are mounted on a hard disk. My physical rootfs is an NFS rootfs.
Output of rpm -q mock
mock-2.12
Steps to reproduce issue
"console=ttyS0 root=/dev/nfs rw rootfstype=nfs nfsroot=NFS_SERVER_IP:/path/to/nfsroot,nolock,vers=4.2 ip=dhcp"
Additional Information
* /var/lib/mock and /var/cache/mock are both set up correctly with 02775 permissions and root:mock ownership.
* Running lscpu manually gives correct results.
* Running python3 -c "import subprocess; subprocess.Popen('/usr/bin/lscpu')" manually in a console gives the correct result.