Closed martinpitt closed 2 years ago
Normally we start this without --privileged. However, that no longer works:
systemd v248~rc2-3.fc35 running in system mode. (+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 +PWQUALITY +P11KIT +QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
Detected virtualization docker.
Detected architecture x86-64.
Failed to create /system.slice/docker-598dfa5b27446c79b3a3028a6087a54e309b9adb69cee53e2e747eb052140297.scope/init.scope control group: Operation not permitted
Failed to allocate manager object: Operation not permitted
[!!!!!!] Failed to allocate manager object.
Exiting PID 1...
But this is just fallout from the faccessat() glibc regression. That can be worked around with --security-opt=seccomp=unconfined or --privileged, but both now fail with the crash loop above.
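For reference, the two workarounds mentioned above can be sketched as a small shell helper. This is only an illustration: the function name freeipa_run_cmd is hypothetical, the option list is trimmed compared to the full command used for the reproducer, and the helper only prints the command instead of running it.

```shell
# Hypothetical helper: print a trimmed `docker run` line with one of the
# two workarounds discussed in this thread. It does not start a container.
freeipa_run_cmd() {
    case "$1" in
        seccomp)    opt="--security-opt=seccomp=unconfined" ;;
        privileged) opt="--privileged" ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
    # Host name, data volume and image tag are the ones quoted in this thread.
    echo "docker run -it --rm --name freeipa -h f0.cockpit.lan --read-only $opt" \
         "-v /var/lib/ipa-data:/data:Z freeipa/freeipa-server:fedora-rawhide"
}

# Review the command first, then paste it into a shell to run it.
freeipa_run_cmd seccomp
```

Printing the command rather than executing it keeps the sketch safe to try on a host that has no docker installed.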
Same result for freeipa/freeipa-server:fedora-33 (ef06f18112ff from 3 hours ago) and freeipa/freeipa-server:fedora-33-4.9.1 (98721900393a from 2 weeks ago).
Note that I ran each of these with an empty /var/lib/ipa-data, so it's not due to some old data.
Our previous VM image refresh with the freeipa container was on Feb 4, that still worked. That used the fedora-rawhide image ffac6c661a58 from 3 months ago. That tag is now gone on both quay and dockerhub, though.
I tested freeipa/freeipa-server:centos-7 (to match the host OS), and it fails much more quickly:
systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
Detected virtualization other.
Detected architecture x86-64.
Set hostname to <f0.cockpit.lan>.
Initializing machine ID from random generator.
Checking DNS domain cockpit.lan, please wait ...
Wed Mar 10 08:35:37 UTC 2021 /usr/sbin/ipa-server-configure-first
The log file for this installation can be found in /var/log/ipaserver-install.log
==============================================================================
This program will set up the IPA Server.
This includes:
* Configure a stand-alone CA (dogtag) for certificate management
* Create and configure an instance of Directory Server
* Create and configure a Kerberos Key Distribution Center (KDC)
* Configure Apache (httpd)
* Configure DNS (bind)
* Configure the KDC to enable PKINIT
Excluded by options:
* Configure the Network Time Daemon (ntpd)
Warning: skipping DNS resolution of host f0.cockpit.lan
Checking DNS domain cockpit.lan., please wait ...
The IPA Master Server will be configured with:
Hostname: f0.cockpit.lan
IP address(es): 172.17.0.2
Domain name: cockpit.lan
Realm name: COCKPIT.LAN
BIND DNS server will be configured to serve IPA domain with:
Forwarders: No forwarders
Forward policy: only
Reverse zone(s): No reverse zone
Configuring directory server (dirsrv). Estimated time: 30 seconds
[1/45]: creating directory server instance
Failed to create unit file /run/systemd/generator.late/netconsole.service: File exists
Failed to create unit file /run/systemd/generator.late/network.service: File exists
[2/45]: enabling ldapi
[3/45]: configure autobind for root
[4/45]: stopping directory server
[5/45]: updating configuration in dse.ldif
[6/45]: starting directory server
[7/45]: adding default schema
[8/45]: enabling memberof plugin
[9/45]: enabling winsync plugin
[10/45]: configure password logging
[11/45]: configuring replication version plugin
[12/45]: enabling IPA enrollment plugin
[13/45]: configuring uniqueness plugin
[14/45]: configuring uuid plugin
[15/45]: configuring modrdn plugin
[16/45]: configuring DNS plugin
[17/45]: enabling entryUSN plugin
[18/45]: configuring lockout plugin
[19/45]: configuring topology plugin
[20/45]: creating indices
[21/45]: enabling referential integrity plugin
[22/45]: configuring certmap.conf
[23/45]: configure new location for managed entries
[24/45]: configure dirsrv ccache
[25/45]: enabling SASL mapping fallback
[26/45]: restarting directory server
Failed to create unit file /run/systemd/generator.late/network.service: File exists
Failed to create unit file /run/systemd/generator.late/netconsole.service: File exists
[27/45]: adding sasl mappings to the directory
[28/45]: adding default layout
[29/45]: adding delegation layout
[30/45]: creating container for managed entries
[31/45]: configuring user private groups
[32/45]: configuring netgroups from hostgroups
[33/45]: creating default Sudo bind user
[34/45]: creating default Auto Member layout
[35/45]: adding range check plugin
[36/45]: creating default HBAC rule allow_all
[37/45]: adding entries for topology management
[38/45]: initializing group membership
[39/45]: adding master entry
[40/45]: initializing domain level
[41/45]: configuring Posix uid/gid generation
[42/45]: adding replication acis
[43/45]: activating sidgen plugin
[44/45]: activating extdom plugin
[45/45]: configuring directory to start on boot
Failed to create unit file /run/systemd/generator.late/netconsole.service: File exists
Failed to create unit file /run/systemd/generator.late/network.service: File exists
Done configuring directory server (dirsrv).
Configuring Kerberos KDC (krb5kdc)
[1/10]: adding kerberos container to the directory
[2/10]: configuring KDC
[3/10]: initialize kerberos container
[4/10]: adding default ACIs
[5/10]: creating a keytab for the directory
[6/10]: creating a keytab for the machine
[7/10]: adding the password extension to the directory
[8/10]: creating anonymous principal
[9/10]: starting the KDC
[10/10]: configuring KDC to start on boot
Failed to create unit file /run/systemd/generator.late/netconsole.service: File exists
Failed to create unit file /run/systemd/generator.late/network.service: File exists
Done configuring Kerberos KDC (krb5kdc).
Configuring kadmin
[1/2]: starting kadmin
[2/2]: configuring kadmin to start on boot
Failed to create unit file /run/systemd/generator.late/netconsole.service: File exists
Failed to create unit file /run/systemd/generator.late/network.service: File exists
Done configuring kadmin.
Configuring ipa-custodia
[1/5]: Making sure custodia container exists
[2/5]: Generating ipa-custodia config file
[3/5]: Generating ipa-custodia keys
[4/5]: starting ipa-custodia
[5/5]: configuring ipa-custodia to start on boot
Failed to create unit file /run/systemd/generator.late/network.service: File exists
Failed to create unit file /run/systemd/generator.late/netconsole.service: File exists
Done configuring ipa-custodia.
Configuring certificate server (pki-tomcatd). Estimated time: 3 minutes
[1/30]: configuring certificate server instance
Failed to create unit file /run/systemd/generator.late/netconsole.service: File exists
Failed to create unit file /run/systemd/generator.late/network.service: File exists
Failed to create unit file /run/systemd/generator.late/network.service: File exists
Failed to create unit file /run/systemd/generator.late/netconsole.service: File exists
Failed to create unit file /run/systemd/generator.late/netconsole.service: File exists
Failed to create unit file /run/systemd/generator.late/network.service: File exists
Failed to create unit file /run/systemd/generator.late/network.service: File exists
Failed to create unit file /run/systemd/generator.late/netconsole.service: File exists
[2/30]: secure AJP connector
[3/30]: reindex attributes
[4/30]: exporting Dogtag certificate store pin
[5/30]: stopping certificate server instance to update CS.cfg
[6/30]: backing up CS.cfg
[7/30]: disabling nonces
[8/30]: set up CRL publishing
[9/30]: enable PKIX certificate path discovery and validation
[10/30]: starting certificate server instance
[11/30]: configure certmonger for renewals
Failed to create unit file /run/systemd/generator.late/netconsole.service: File exists
Failed to create unit file /run/systemd/generator.late/network.service: File exists
[12/30]: requesting RA certificate from CA
xargs: /usr/sbin/ipa-server-install: terminated by signal 9
FreeIPA server configuration failed.
For the record: I checked the history, and it seems the only reason to use the :fedora-rawhide tag was that there was/is no :latest tag any more, and you recommended that we use rawhide instead (that would also give us the latest version to test against, which spots errors earlier).
Can you try tests/run-partial-tests.sh Dockerfile.fedora-rawhide on that box / setup to see if that passes? You might need to patch it with that --security-opt=seccomp=unconfined at https://github.com/freeipa/freeipa-container/blob/master/tests/run-partial-tests.sh#L28.
I don't see things failing on my Fedora 33 (even though I see the glibc/seccomp issue here), and it will take me some time to set up a RHEL 7 box to try to reproduce.
Yes, about the tags: people complained that IPA gets upgraded to the latest version when they used :latest (and when that happens after a long time, the upgrade might fail because it is an upgrade both across Fedora versions and across FreeIPA versions).
So we started to tag with specific FreeIPA versions as well in https://hub.docker.com/r/freeipa/freeipa-server/tags?page=1&ordering=last_updated and https://quay.io/repository/freeipa/freeipa-server?tab=tags. The rawhide image hasn't been built for a while, exactly because I did not want to break people's setups with that glibc issue ... but then I figured there is no point waiting if that change is there to stay and people need to work around it, for example with --security-opt=seccomp=unconfined.
As for running these images on RHEL 7 hosts, it's mostly outside of my capacity to test there, I'm generally happy when things pass on my Fedoras and on GitHub Actions' Ubuntus (we no longer test on Travis CI because we were not approved for the OSS credits (yet?)).
Could you try the same on a RHEL 8 machine, likely with podman? The tests that we have are tests/run-partial-tests.sh Dockerfile.<version>, which also tests basic systemd operation in the container before even attempting to configure FreeIPA, and tests/run-master-and-replica.sh <image>, which is primarily for using the "real" image, testing master and replica configuration.
Simplest way to reproduce is with the CentOS 7 cloud image:
curl -L -O https://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud.qcow2.xz
xz -d CentOS-7-x86_64-GenericCloud.qcow2.xz
# nothing fancy, just admin:foobar and root:foobar
curl -L -O https://github.com/cockpit-project/bots/raw/master/machine/cloud-init.iso
qemu-system-x86_64 -enable-kvm -nographic -m 2048 -drive file=CentOS-7-x86_64-GenericCloud.qcow2,if=virtio -snapshot -cdrom cloud-init.iso -net nic,model=virtio -net user,hostfwd=tcp::2201-:22
Note: you can also log in on the VT, but ssh login is a bit more comfortable:
ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o CheckHostIP=no -p 2201 root@localhost
(Password "foobar").
Then:
yum install -y docker
systemctl start docker
setsebool -P container_manage_cgroup 1
# see https://github.com/freeipa/freeipa-container/issues/348
rm /usr/libexec/oci/hooks.d/oci-systemd-hook
mkdir /var/lib/ipa-data
docker run -it --rm --privileged --name freeipa -ti -h f0.cockpit.lan --read-only -e IPA_SERVER_IP=10.111.112.100 -p 53:53/udp -p 53:53 -p 80:80 -p 443:443 -p 389:389 -p 636:636 -p 88:88 -p 464:464 -p 88:88/udp -p 464:464/udp -p 123:123/udp -v /var/lib/ipa-data:/data:Z -v /sys/fs/cgroup:/sys/fs/cgroup:ro freeipa/freeipa-server:fedora-rawhide -U -p foobarfoo -a foobarfoo -n cockpit.lan -r COCKPIT.LAN --setup-dns --no-forwarders --no-ntp
I'll try to move our image to CentOS 8 stream. We need to do that at some point anyway. I'll report back here how it works on CentOS 8.
Indeed the current container works on CentOS 8 stream. (Unfortunately https://github.com/candlepin/ansible-role-candlepin is still not ported to RHEL/CentOS 8, so we are kind of stuck there, but I'll see what we can do there)
@martinpitt, I assume you have found a reasonably stable setup for your use case. Is there anything else we should investigate or do as part of this issue?
@adelton : Yes, I applied a big hammer to candlepin and moved the whole host to Fedora CoreOS. So this does not block us any more. I suppose you can close this if you don't want to support running on RHEL/CentOS 7 any more.
I seem to run into this as well, I'm not sure but I think I do.
With the normal command I end up with:
Adding [10.1.0.3 ipa-01.foo.tld] to your /etc/hosts file
[Errno 30] Read-only file system: '/etc/hosts'
The ipa-server-install command failed. See /var/log/ipaserver-install.log for more information
Sending SIGTERM to remaining processes...
Sending SIGKILL to remaining processes...
Any comments?
If you see the error about /etc/hosts, it's a different problem from the one originally reported by Martin, which was about running the latest systemd in a container on CentOS 7. It's incidentally one that I filed earlier today as https://pagure.io/freeipa/issue/8888 against FreeIPA.
I assume that you've run the container as read-only and with --hostname=ipa-01.foo.tld --ip-address=10.1.0.3 or similar parameters. The ipa-server-install installer is eager to add those values to /etc/hosts. One possibility to avoid that is to pass something like the --add-host ipa-01.foo.tld:10.1.0.3 option to docker run or podman run, so that the installer is happy to find the records already there.
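A minimal sketch of that suggestion, assuming the host name and address from the error message above. The function name freeipa_add_host_cmd is hypothetical, the default image tag is only a placeholder, and the installer options are left out. The helper prints the command rather than running it; podman run accepts the same --add-host option.

```shell
# Hypothetical helper: print a `docker run` line that pre-seeds /etc/hosts
# inside a read-only container via --add-host.
# $1 = fully qualified host name, $2 = IP address, $3 = image (optional).
freeipa_add_host_cmd() {
    fqdn="$1"
    ip="$2"
    # Assumption: the default image tag is a placeholder, use your own.
    image="${3:-freeipa/freeipa-server:fedora-rawhide}"
    echo "docker run -it --rm --read-only -h $fqdn --add-host $fqdn:$ip" \
         "-v /var/lib/ipa-data:/data:Z $image"
}

freeipa_add_host_cmd ipa-01.foo.tld 10.1.0.3
```

With the record already present in /etc/hosts, the installer should have no reason to write to the read-only file.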
I ran into this "No valid Negotiate header in server response" issue today after upgrading from the fedora-33-4.9.2 image to fedora-34-4.9.6.
I first hit "Failed to allocate manager object.", then I read this thread and added --privileged. Then I got "No valid Negotiate header in server response" for any ipa command. However, --security-opt=seccomp=unconfined works in my case. See this mail thread for more.
Before I realized that, I spent many hours digging into the "No valid Negotiate header in server response" issue. Finally I found out that it is because apache is using a private /tmp dir, and we symlink /var/lib/gssproxy to /tmp, so apache cannot contact gssproxy.
It works with a systemd unit override:
# /data/etc/systemd/system/httpd.service.d/override.conf
[Service]
PrivateTmp=false
I guess we should add this into the container image? But not sure where to add it.
We don't have any issue when we don't use --privileged, maybe because in that case systemd in the container does not have the privilege to create the private tmp, so it just ignores this.
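As a rough sketch, the override above could be written into the data volume with a few lines of shell. The function name write_httpd_override is hypothetical; the path below the root directory and the file content are the ones quoted above. The demo writes to a scratch directory so it is safe to try anywhere; in a real setup the root would be /data inside the container.

```shell
# Hypothetical helper: create the httpd drop-in below a given root directory
# and print the path of the file it wrote.
write_httpd_override() {
    dir="$1/etc/systemd/system/httpd.service.d"
    mkdir -p "$dir" || return 1
    printf '[Service]\nPrivateTmp=false\n' > "$dir/override.conf" || return 1
    echo "$dir/override.conf"
}

# Demo against a scratch directory instead of the real /data volume.
root=$(mktemp -d)
conf=$(write_httpd_override "$root")
cat "$conf"
rm -rf "$root"
```

After writing the file to the real location, systemd needs to reload its units (systemctl daemon-reload) and httpd needs a restart for the change to take effect.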
I'm afraid running on CentOS / RHEL 7 hosts is no longer something we are able to reasonably support, especially with the new cgroups defaults.
The current freeipa/freeipa-server:fedora-rawhide container image (d9f32f01c6f0 from 3 hours ago) now never finishes booting, crash-loops in ipa-server-configure-first.service, and eventually gives up. This happens on a current CentOS 7 host with docker.