apptainer / apptainer

Apptainer: Application containers for Linux
https://apptainer.org

Behavior changed for /proc mount #2344

Closed fnaum closed 1 week ago

fnaum commented 2 weeks ago

Version of Apptainer

apptainer-1.2.5-1.el9

Expected behavior

python -c "import os; print(os.getlogin())"
federico

Actual behavior

python -c "import os; print(os.getlogin())"
FileNotFoundError: [Errno 2] No such file or directory

Steps to reproduce this behavior

  1. Start on a Rocky 9.2 machine
  2. Install apptainer using dnf
  3. Using a CentOS-7.8 or Rocky 9.2 image, run the following command:
    apptainer exec  oras://xxxx/centos-devel:7.8.2003 python -c "import os; print(os.getlogin())"

What OS/distro are you running

$ cat /etc/os-release

Rocky Linux release 9.2 (Blue Onyx)

How did you install Apptainer

dnf module for Ansible; dnf install apptainer-1.2.5 should be the equivalent

(On CentOS-7.8 we install singularity using yum)

fnaum commented 2 weeks ago

There is more info in this Slack thread, but here are some findings:

In other words, on rocky-9, apptainer-1.2.5 (or 1.3.2) works in an interactive login shell but not in an interactive non-login shell. On centos-7.8 with singularity version 3.8.0-1.el7, it works in both cases. I will dig into our ~/.bash_profile, ~/.bash_login, and ~/.profile to see if I can find something else, but as @DrDaveD was able to reproduce this, I would appreciate someone shedding some light on it. I would love it if there were just a configuration switch that I could flip to get the same behavior we had on CentOS.

fnaum commented 2 weeks ago

I tracked down the difference between the version installed with dnf and the one built from source: it was something we changed a while ago in the configuration.

# CONFIG PASSWD: [BOOL]
# DEFAULT: yes
# If /etc/passwd exists within the container, this will automatically append
# an entry for the calling user.
config passwd = no

Changing this back to the default, config passwd = yes, makes it work for us in both interactive login and interactive non-login shells, and we no longer observe the reported issue.

We will revert the change, because we only made it for a test case that checked that /etc/passwd was NOT writable. I will leave this issue open to understand it better, though, as the behavior in singularity 3.8.0-1.el7 is different: there, everything works with config passwd = no.
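For scripts that cannot depend on the container configuration, a minimal sketch of a more tolerant user-name lookup follows. The Python documentation suggests getpass.getuser() over os.getlogin() for most purposes; it consults the LOGNAME, USER, LNAME and USERNAME environment variables before falling back to the password database. The helper name current_user is hypothetical, and note that the fallback answers a slightly different question (the user running the process, not the user who logged in on the terminal).

```python
import getpass
import os


def current_user():
    """Best-effort user name; hypothetical helper, not part of Apptainer."""
    try:
        # Works only when the C library can resolve the login name.
        return os.getlogin()
    except OSError:
        pass
    try:
        # Checks LOGNAME, USER, LNAME, USERNAME, then the password database.
        return getpass.getuser()
    except (KeyError, OSError, ImportError):
        # No environment variable and no passwd entry: report the numeric uid.
        return str(os.getuid())


if __name__ == "__main__":
    print(current_user())
```

Inside the container this would be run the same way as the one-liner in the report, for example with python pointing at a file that holds the sketch.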

DrDaveD commented 2 weeks ago

I just built singularity-3.8.0 from source and get the same error when I set config passwd = no.

fnaum commented 2 weeks ago

We found a similar issue on singularity 3.8.0-1.el7 with config passwd = no. Repro:

> touch  $HOME/bla
> singularity exec  oras://artifactory.XXXXX:XXXX/singularity-toolchain/centos-devel:7.8.2003 python -c"import os;import pwd; print(pwd.getpwuid(os.stat(os.path.expandvars('$HOME/bla')).st_uid))"
KeyError: 'getpwuid(): uid not found: 9426'

That also happens in apptainer-1.3.2

In this case the behaviour is consistent between singularity and apptainer.
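The KeyError above is what pwd.getpwuid() raises when the uid has no entry in the container's password database. A minimal sketch of a lookup that tolerates this, assuming it is acceptable to show the numeric uid instead of a name (owner_name is a hypothetical helper):

```python
import os
import pwd


def owner_name(path):
    """Return the file owner's user name, or the numeric uid as a string
    when the uid has no passwd entry (as with config passwd = no)."""
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # uid not present in /etc/passwd (or any other NSS source)
        return str(uid)


if __name__ == "__main__":
    print(owner_name(os.path.expanduser("~")))
```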

DrDaveD commented 2 weeks ago

The behavior really comes down to the kernel and whether or not it makes the relevant info available under /proc. I tried it on an el8 kernel, but perhaps older kernel behaviors are slightly different. There's not really anything that apptainer can do about it. Are you ready to close the issue?

fnaum commented 1 week ago

Sorry for the delay in the reply.

We have "solved" our pressing issue by just reverting the config to config passwd = yes.

I'm happy to close the issue but I left it open to see if someone could explain or shed some light on what I observed.

I would like to know what else changes when that config is changed from yes to no, besides the /etc/passwd file becoming not writable.

My observation with config passwd = no was that I could see /proc/self/loginuid and its contents inside the container, but when Python tries to access that file, it is reported as not existing.

>strace python -c "import os; print(os.getlogin())"
...
open("/proc/self/loginuid", O_RDONLY) = -1 ENOENT (No such file or directory)
ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0

So I'm wondering what changes are made at the /proc level, because it is not intuitive that a config option with this name would affect /proc.

If you do not have an answer to those questions, I'm also happy to close it.
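To compare the two configurations side by side, a small diagnostic along these lines could be run inside the container. It is only a sketch: the helper names (read_loginuid, passwd_name, report) are hypothetical, and it merely reports what the process can see rather than explaining why.

```python
import os
import pwd


def read_loginuid(path="/proc/self/loginuid"):
    """Return the login uid as an int, or None if the file cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def passwd_name(uid):
    """Return the passwd name for uid, or None if there is no entry."""
    try:
        return pwd.getpwuid(uid).pw_name
    except (KeyError, OverflowError):
        return None


def report():
    """Collect what this process can see about its own identity."""
    uid = os.getuid()
    loginuid = read_loginuid()
    try:
        login = os.getlogin()
    except OSError as exc:
        login = "error: {}".format(exc)
    return {
        "uid": uid,
        "loginuid": loginuid,
        "passwd entry for uid": passwd_name(uid),
        "passwd entry for loginuid": (
            passwd_name(loginuid) if loginuid is not None else None
        ),
        "os.getlogin()": login,
    }


if __name__ == "__main__":
    for key, value in report().items():
        print("{}: {}".format(key, value))
```

Running it once with config passwd = yes and once with config passwd = no, in both a login and a non-login shell, would show which of these values actually differ.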

DrDaveD commented 1 week ago

Whatever's happening in /proc is outside of Apptainer's control, so I don't have an answer for you.

fnaum commented 1 week ago

Thanks for your help anyways!