apptainer / singularity

Singularity has been renamed to Apptainer as part of us moving the project to the Linux Foundation. This repo has been persisted as a snapshot right before the changes.
https://github.com/apptainer/apptainer

Using --fakeroot --net on a CentOS7 works but on CentOS8 network is unusable #5840

Closed paulraines68 closed 3 years ago

paulraines68 commented 3 years ago

Version of Singularity:

What version of Singularity are you using?

3.7.0-1.el7

Expected behavior

With user namespaces set up in subuid/subgid on CentOS8, I expect that as a user listed in subuid I can run --fakeroot --net and have a usable network. I need this to modify the container (which is Ubuntu based) with apt-get calls.
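A quick way to confirm the subuid/subgid prerequisite is to look for the user's entry in those files. The sketch below assumes the usual `name:start:count` line format; the helper name `has_subid_range` is hypothetical, and it matches by user name only (entries keyed by numeric UID are not handled).

```shell
# has_subid_range USER FILE
# Succeeds if FILE contains a "USER:start:count" line with numeric
# start and count fields. Hypothetical helper, for illustration only.
has_subid_range() {
    awk -F: -v u="$1" '
        $1 == u && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ { found = 1 }
        END { exit !found }
    ' "$2"
}

# Usage:
#   has_subid_range "$USER" /etc/subuid && has_subid_range "$USER" /etc/subgid \
#       && echo "fakeroot mappings present"
```

If either check fails, --fakeroot is not expected to work for that user regardless of the network setup.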

Actual behavior

I did not get a usable network

Steps to reproduce this behavior

On a CentOS8 box as normal user I did:

raines$ singularity build --fakeroot --sandbox tensorflow-20.11-tf2-py3     /cluster/batch/IMAGES/tensorflow-20.11-tf2-py3.sif
INFO:    Starting build...
INFO:    Verifying bootstrap image /cluster/batch/IMAGES/tensorflow-20.11-tf2-py3.sif
WARNING: integrity: signature not found for object group 1
WARNING: Bootstrap image could not be verified, but build will continue.
INFO:    Creating sandbox directory...
INFO:    Build complete: tensorflow-20.11-tf2-py3
raines$ singularity shell tensorflow-20.11-tf2-py3
Singularity> wget -O - https://172.21.21.45/
--2021-02-18 11:14:50--  https://172.21.21.45/
Connecting to 172.21.21.45:443... connected.
    ERROR: certificate common name ‘hound.nmr.mgh.harvard.edu’ doesn't match requested host name ‘172.21.21.45’.
To connect to 172.21.21.45 insecurely, use `--no-check-certificate'.
Singularity> exit
exit
raines$ singularity shell --fakeroot --writable --net tensorflow-20.11-tf2-py3
WARNING: Skipping mount /etc/localtime [binds]: /etc/localtime doesn't exist in container
Singularity> wget -O - https://172.21.21.45/
--2021-02-18 16:15:08--  https://172.21.21.45/
Connecting to 172.21.21.45:443... failed: No route to host.
Singularity> exit
exit

Both work on a CentOS7 box just fine.

I tried on CentOS8 with SELinux in both enforcing and non-enforcing mode.

What OS/distro are you running

Running CentOS Linux release 8.2.2004 with kernel 4.18.0-193.28.1.el8_2.x86_64

Works on CentOS Linux release 7.9.2009 with kernel 3.10.0-1160.6.1.el7.x86_64

How did you install Singularity

Downloaded the tarball, ran rpmbuild --rebuild on each of the two machines above, then installed the resulting RPMs.

dtrudg commented 3 years ago

Can you check your firewalld / iptables setup for any differences between the CentOS7 and CentOS8 host? What do you have setup on each?

I am unable to replicate so far. I can use --fakeroot --net and have access to the outside world from my CentOS 8.2 VM.

$ singularity exec --fakeroot --net docker://curlimages/curl sh
INFO:    Using cached SIF image
INFO:    Converting SIF file to temporary sandbox...
Singularity> ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if17: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP 
    link/ether 9a:de:9a:0e:a6:c5 brd ff:ff:ff:ff:ff:ff
    inet 10.23.0.16/16 brd 10.23.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::98de:9aff:fe0e:a6c5/64 scope link tentative 
       valid_lft forever preferred_lft forever
Singularity> curl http://1.1.1.1
<html>
<head><title>301 Moved Permanently</title></head>
<body>
<center><h1>301 Moved Permanently</h1></center>
<hr><center>cloudflare-lb</center>
</body>
</html>

paulraines68 commented 3 years ago

It does appear to be the firewall. It works if I disable the firewall on CentOS8. But why would the firewall affect running with --fakeroot differently than running without?

Nothing in the firewall definition is blocking outgoing traffic. I just have the one zone defined as:

# cat /etc/firewalld/zones/public.xml
<?xml version="1.0" encoding="utf-8"?>
<zone>
  <short>Public</short>
  <description>For use in public areas. You do not trust the other computers on networks to not harm your computer. Only selected incoming connections are accepted.</description>
  <service name="ssh"/>
  <service name="dhcpv6-client"/>
  <service name="nfs"/>
  <service name="mountd"/>
  <port port="5900-5999" protocol="tcp"/>
  <port port="111" protocol="tcp"/>
  <port port="111" protocol="udp"/>
  <port port="32803" protocol="tcp"/>
  <port port="32769" protocol="udp"/>
  <port port="662" protocol="tcp"/>
  <port port="662" protocol="udp"/>
  <port port="2049" protocol="udp"/>
  <port port="5666" protocol="tcp"/>
  <port port="6817" protocol="tcp"/>
  <port port="6819" protocol="tcp"/>
  <port port="6818" protocol="tcp"/>
  <port port="6820-6829" protocol="tcp"/>
  <port port="60001-63000" protocol="tcp"/>
</zone>

On my CentOS7 box the firewall is basically the same, with even fewer ports open (and none open that are not open above).

Ah, OK, found it.

CentOS8 now uses nftables instead of iptables as the default firewalld backend. If I modify firewalld.conf to use

FirewallBackend=iptables

and restart the firewall, the network now works when using --fakeroot

I have no idea why this works, though.
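For anyone applying the same workaround, the edit can be scripted roughly as follows. This is a minimal sketch: the function name `set_firewall_backend` is hypothetical, and `/etc/firewalld/firewalld.conf` is assumed to be the config location (the usual default; the thread only says "firewalld.conf").

```shell
# set_firewall_backend CONF BACKEND
# Sets FirewallBackend=BACKEND in CONF, replacing an existing setting
# or appending one if none is present. Hypothetical helper.
set_firewall_backend() {
    conf="$1"      # path to firewalld.conf
    backend="$2"   # "iptables" or "nftables"
    if grep -q '^FirewallBackend=' "$conf"; then
        # Replace the existing line in place, keeping a .bak copy.
        sed -i.bak "s/^FirewallBackend=.*/FirewallBackend=$backend/" "$conf"
    else
        # No setting yet: append one.
        printf 'FirewallBackend=%s\n' "$backend" >> "$conf"
    fi
}

# Usage (as root), followed by a restart so the change takes effect:
#   set_firewall_backend /etc/firewalld/firewalld.conf iptables
#   systemctl restart firewalld
```

Switching the backend affects every rule firewalld manages on the host, so it is worth testing on a single machine first.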

dtrudg commented 3 years ago

It appears the CNI firewall plugin does not directly support nftables at this time, so it may not put in place the correct rules for connectivity when using nftables.

I suspect the reason this was working for me, and not you, might be due to my firewall being more open, perhaps in the FORWARD chain (rather than OUTPUT).
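One way to compare the two hosts on this point is to look at the FORWARD chain's default policy. The sketch below assumes the `-P FORWARD <policy>` line that `iptables -S FORWARD` prints first; the helper name `forward_policy` is hypothetical, and whether the policy alone explains the difference here is only a guess.

```shell
# forward_policy
# Reads `iptables -S FORWARD` output on stdin and prints the chain's
# default policy (e.g. ACCEPT or DROP). Hypothetical helper.
forward_policy() {
    awk '$1 == "-P" && $2 == "FORWARD" { print $3 }'
}

# Usage (as root), on each host:
#   iptables -S FORWARD | forward_policy
# With the nftables backend, the full rule set can be inspected with:
#   nft list ruleset
```

A DROP or REJECT outcome for forwarded traffic on the CentOS8 host, with ACCEPT on the working one, would be consistent with the explanation above.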

paulraines68 commented 3 years ago

OK, thanks. This makes sense, as I also have to switch from nftables to iptables to make Docker work.