Closed by mambelli 2 months ago
I checked on a stock SL7: the file is in openssh-clients
and its permissions have SGID set, apparently to avoid a strace vulnerability.
[root@1a401ba6d143 /]# yum install openssh-clients
...
[root@1a401ba6d143 /]# ls -al /usr/bin/ssh-agent
---x--s--x 1 root nobody 382208 Aug 1 2023 /usr/bin/ssh-agent
[root@1a401ba6d143 /]# ssh-agent
SSH_AUTH_SOCK=/tmp/ssh-DCBiHmjYcJFy/agent.52; export SSH_AUTH_SOCK;
SSH_AGENT_PID=53; export SSH_AGENT_PID;
echo Agent pid 53;
The permissions on other RHEL versions are different.
The fnal-dev-sl7 container that we build has SGID set and works fine (I pulled it off Docker Hub)
[root@fermicloud826 ~]# podman run -it docker.io/fermilab/fnal-dev-sl7:latest /bin/bash
[root@d803b189831e /]# ls -al /usr/bin/ssh-agent
---x--s--x 1 root nobody 382208 Aug 1 2023 /usr/bin/ssh-agent
[root@d803b189831e /]# ssh-agent
SSH_AUTH_SOCK=/tmp/ssh-jdMGuDfFFO41/agent.21; export SSH_AUTH_SOCK;
SSH_AGENT_PID=22; export SSH_AGENT_PID;
echo Agent pid 22;
[root@d803b189831e /]#
Something may have changed in the process that creates the Apptainer image (SIF) or expands it into CVMFS; that step may remove the SGID permission. I will have to check with @DrDaveD or OSG.
Apptainer has the No New Privileges kernel feature set. Even if the SGID bit is set, it is not used.
Vito made a dump of the container on Ceph with apptainer build --sandbox
and the resulting container works,
so this is possibly a CVMFS feature that requires permissions to be handled in a specific way.
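One way to confirm the No New Privileges setting from inside the container is to read the NoNewPrivs field that Linux exposes in /proc/<pid>/status. The following is a minimal sketch, not something taken from this thread: the function name no_new_privs is made up, the optional file argument exists only so the parsing can be tried against a saved status file, and the field is only present on kernels that expose it (mainline Linux 4.10 and later).

```shell
#!/bin/sh
# Print the NoNewPrivs flag (0 or 1) found in a /proc status file.
# With no argument, the status of the reading process is inspected; it
# inherits the flag from the calling shell.
# Empty output means the kernel does not expose the field.
no_new_privs() {
    status_file="${1:-/proc/self/status}"
    awk '$1 == "NoNewPrivs:" { print $2 }' "$status_file"
}
```

Run inside the Apptainer shell, a value of 1 would be consistent with the comment above: with the flag set, the kernel does not grant the extra group from a setgid bit on exec, so the SGID bit on ssh-agent has no effect either way.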
When starting the apptainer container from CVMFS, for example using
/cvmfs/oasis.opensciencegrid.org/mis/apptainer/current/bin/apptainer exec /cvmfs/singularity.opensciencegrid.org/fermilab/fnal-dev-sl7:latest /bin/bash
and an application or user tries to run ssh-agent,
it reports Permission denied:
$ /cvmfs/oasis.opensciencegrid.org/mis/apptainer/current/bin/apptainer exec /cvmfs/singularity.opensciencegrid.org/fermilab/fnal-dev-sl7:latest /bin/bash
Apptainer> ssh-agent
bash: /usr/bin/ssh-agent: Permission denied
Apptainer> ls -lh /usr/bin/ssh-agent
-rwx--x--x 1 65534 65534 374K Aug 1 2023 /usr/bin/ssh-agent
Apptainer>
This seems to happen with all SL7 containers I tested from CVMFS. However, as mentioned in the previous post, if the container is dumped on a Ceph volume, it has permissions that allow users to run it.
On the other hand, in EL8/EL9 or even SL6 containers, ssh-agent
works:
$ /cvmfs/oasis.opensciencegrid.org/mis/apptainer/current/bin/apptainer exec /cvmfs/singularity.opensciencegrid.org/fermilab/fnal-wn-el9:latest /bin/bash
Apptainer> ls -lh /usr/bin/ssh-agent
-rwxr-xr-x 1 nobody nobody 281K Mar 5 16:34 /usr/bin/ssh-agent
Apptainer> ssh-agent
SSH_AUTH_SOCK=/tmp/ssh-XXXXXXl3DkwY/agent.167678; export SSH_AUTH_SOCK;
SSH_AGENT_PID=167687; export SSH_AGENT_PID;
echo Agent pid 167687;
Apptainer>
So this seems to be an issue specific to the combination of an SL7 container deployed on CVMFS.
I'm skeptical that it works with the SL7 container inside apptainer from a sandbox. It didn't work for me. Can you please double-check that, @vitodb?
I did
$ apptainer build --sandbox /scratch/tmp/fnal-dev-sl7 docker://fermilab/fnal-dev-sl7
...
$ apptainer exec /scratch/tmp/fnal-dev-sl7 /bin/bash
Apptainer> ls -l /usr/bin/ssh-agent
---x--s--x 1 dwd fnalgrid 382208 Aug 1 2023 /usr/bin/ssh-agent
Apptainer> ssh-agent
mkdtemp: private socket dir: No such file or directory
On the other hand, if I change those permissions to 711 and execute outside of apptainer it works. So I'm not exactly sure why the creation of a private socket dir is failing.
It also does not seem like a good security model to run ssh-agent inside an apptainer container. Maybe the user has an alternative. Normally ssh-agent is run on a desktop or laptop and forwarded everywhere through ssh.
OK, I have done more investigation. The error I was seeing was because I had set TMPDIR=/scratch/tmp
but I didn't bind that in from the host. If I add -B /scratch
then I see the same symptoms as Vito.
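The mkdtemp failure described above can be caught before starting ssh-agent by checking that the socket directory exists inside the container. This is a minimal sketch, assuming ssh-agent creates its socket directory under $TMPDIR and falls back to /tmp when TMPDIR is unset, as the OpenSSH man page describes; check_tmpdir is a hypothetical helper name, not part of any tool mentioned here.

```shell
#!/bin/sh
# Succeed only if the directory ssh-agent would use for its socket
# exists and is writable in the current (container) environment.
check_tmpdir() {
    dir="${TMPDIR:-/tmp}"
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "ok: $dir"
    else
        # Typical cause inside a container: the host path was not bound in.
        echo "missing or not writable: $dir" >&2
        return 1
    fi
}
```

If the check fails inside the container, binding the host path (for example with -B /scratch, as above) or unsetting TMPDIR should remove this particular error and expose the underlying Permission denied.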
The problem is not a missing SGID bit, however. ssh-agent works fine without it; I don't know what it is for. The failure has to do with the file being mode 711 while the user running it is not the owner of the file. If I build my own sandbox I am the owner, so it doesn't matter. If I change the owner and group of the file in my sandbox to someone else, it also fails with Permission denied. If I then change the mode to 755, it works OK. So a workaround is to do chmod 755 /usr/bin/ssh-agent
in the Dockerfile of that container.
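To look for other binaries in the image that could hit the same problem, one can search for files that non-owners may execute but not read, which is the 711 pattern described above. This is only a sketch under that assumption; list_exec_only is a made-up name, and the Dockerfile line in the comment simply restates the workaround from this comment.

```shell
#!/bin/sh
# List regular files under a directory that "other" may execute (o+x)
# but not read (no o+r), e.g. mode 711 or 2711.
list_exec_only() {
    find "$1" -type f -perm -001 ! -perm -004
}

# The workaround above, as it might appear in the container's Dockerfile:
#   RUN chmod 755 /usr/bin/ssh-agent
```

For example, list_exec_only /usr/bin run in a sandbox build of the SL7 image should print /usr/bin/ssh-agent before the chmod and no longer print it afterwards.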
The following was reported by @vitodb via Slack: Tom tested the SL7 dev container on SBND nodes. Using the DDT debugger distributed through the forge_tools UPS package, he is getting an error related to ssh-agent permissions. In the container distributed via CVMFS, the permissions for this executable are
This causes permission issues when executing the command. On the host node it has permissions
This works fine. Another check I did is on the local container in /exp/sbnd/data/... where it has permissions
This also works; it looks like it has the setgid bit on, which seems to allow it to work.