freeipa / freeipa-container

FreeIPA server in containers — images at https://quay.io/repository/freeipa/freeipa-server?tab=tags
Apache License 2.0

ipa-replica-install command fails #95

Closed nfuentes closed 7 years ago

nfuentes commented 8 years ago

Hi!

I'm trying to set up an IPA replica on Amazon AWS, but I'm getting the following error:

[27/43]: restarting directory server
ipa         : CRITICAL Failed to restart the directory server (Command '/bin/systemctl restart dirsrv@WATEA-COM-AR.service' returned non-zero exit status 1). See the installation log for details.

This is an extract of the logfile:

2016-11-23T15:45:22Z DEBUG certmonger request is in state dbus.String(u'NEWLY_ADDED_READING_KEYINFO', variant_level=1)
2016-11-23T15:45:27Z DEBUG certmonger request is in state dbus.String(u'CA_UNCONFIGURED', variant_level=1)
2016-11-23T15:45:27Z DEBUG flushing ldapi://%2fvar%2frun%2fslapd-WATEA-COM-AR.socket from SchemaCache
2016-11-23T15:45:27Z DEBUG retrieving schema for SchemaCache url=ldapi://%2fvar%2frun%2fslapd-WATEA-COM-AR.socket conn=<ldap.ldapobject.SimpleLDAPObject instance at 0x7f6aafc176c8>
2016-11-23T15:45:28Z DEBUG   duration: 5 seconds
2016-11-23T15:45:28Z DEBUG   [27/43]: restarting directory server
2016-11-23T15:45:28Z DEBUG Starting external process
2016-11-23T15:45:28Z DEBUG args=/bin/systemctl --system daemon-reload
2016-11-23T15:45:28Z DEBUG Process finished, return code=0
2016-11-23T15:45:28Z DEBUG stdout=
2016-11-23T15:45:28Z DEBUG stderr=Failed to open /dev/tty: No such device or address

2016-11-23T15:45:28Z DEBUG Starting external process
2016-11-23T15:45:28Z DEBUG args=/bin/systemctl restart dirsrv@WATEA-COM-AR.service
2016-11-23T15:45:28Z DEBUG Process finished, return code=1
2016-11-23T15:45:28Z DEBUG stdout=
2016-11-23T15:45:28Z DEBUG stderr=Failed to open /dev/tty: No such device or address
Failed to open /dev/tty: No such device or address
Job for dirsrv@WATEA-COM-AR.service failed because the control process exited with error code. See "systemctl status dirsrv@WATEA-COM-AR.service" and "journalctl -xe" for details.

2016-11-23T15:45:28Z CRITICAL Failed to restart the directory server (Command '/bin/systemctl restart dirsrv@WATEA-COM-AR.service' returned non-zero exit status 1). See the installation log for details.
2016-11-23T15:45:29Z DEBUG   duration: 0 seconds
2016-11-23T15:45:29Z DEBUG   [28/43]: setting up initial replication
2016-11-23T15:45:39Z DEBUG Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 447, in start_creation
    run_step(full_msg, method)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 437, in run_step
    method()
  File "/usr/lib/python2.7/site-packages/ipaserver/install/dsinstance.py", line 405, in __setup_replica
    self.dm_password)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/replication.py", line 114, in enable_replication_version_checking
    conn.do_simple_bind(bindpw=dirman_passwd)
  File "/usr/lib/python2.7/site-packages/ipapython/ipaldap.py", line 1621, in do_simple_bind
    self.__bind_with_wait(self.simple_bind, timeout, binddn, bindpw)
  File "/usr/lib/python2.7/site-packages/ipapython/ipaldap.py", line 1616, in __bind_with_wait
    self.__wait_for_connection(timeout)
  File "/usr/lib/python2.7/site-packages/ipapython/ipaldap.py", line 1599, in __wait_for_connection
    wait_for_open_socket(lurl.hostport, timeout)
  File "/usr/lib/python2.7/site-packages/ipapython/ipautil.py", line 1291, in wait_for_open_socket
    raise e
error: [Errno 111] Connection refused

2016-11-23T15:45:39Z DEBUG   [error] error: [Errno 111] Connection refused
2016-11-23T15:45:39Z DEBUG   File "/usr/lib/python2.7/site-packages/ipapython/admintool.py", line 171, in execute
    return_value = self.run()
  File "/usr/lib/python2.7/site-packages/ipapython/install/cli.py", line 318, in run
    cfgr.run()
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 310, in run
    self.execute()
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 332, in execute
    for nothing in self._executor():
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 372, in __runner
    self._handle_exception(exc_info)
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 394, in _handle_exception
    six.reraise(*exc_info)
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 362, in __runner
    step()
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 359, in <lambda>
    step = lambda: next(self.__gen)
  File "/usr/lib/python2.7/site-packages/ipapython/install/util.py", line 81, in run_generator_with_yield_from
    six.reraise(*exc_info)
  File "/usr/lib/python2.7/site-packages/ipapython/install/util.py", line 59, in run_generator_with_yield_from
    value = gen.send(prev_value)
  File "/usr/lib/python2.7/site-packages/ipapython/install/common.py", line 63, in _install
    for nothing in self._installer(self.parent):
  File "/usr/lib/python2.7/site-packages/ipaserver/install/server/replicainstall.py", line 1687, in main
    promote(self)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/server/replicainstall.py", line 377, in decorated
    func(installer)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/server/replicainstall.py", line 1393, in promote
    promote=True, pkcs12_info=dirsrv_pkcs12_info)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/server/replicainstall.py", line 125, in install_replica_ds
    promote=promote,
  File "/usr/lib/python2.7/site-packages/ipaserver/install/dsinstance.py", line 399, in create_replica
    self.start_creation(runtime=60)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 447, in start_creation
    run_step(full_msg, method)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 437, in run_step
    method()
  File "/usr/lib/python2.7/site-packages/ipaserver/install/dsinstance.py", line 405, in __setup_replica
    self.dm_password)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/replication.py", line 114, in enable_replication_version_checking
    conn.do_simple_bind(bindpw=dirman_passwd)
  File "/usr/lib/python2.7/site-packages/ipapython/ipaldap.py", line 1621, in do_simple_bind
    self.__bind_with_wait(self.simple_bind, timeout, binddn, bindpw)
  File "/usr/lib/python2.7/site-packages/ipapython/ipaldap.py", line 1616, in __bind_with_wait
    self.__wait_for_connection(timeout)
  File "/usr/lib/python2.7/site-packages/ipapython/ipaldap.py", line 1599, in __wait_for_connection
    wait_for_open_socket(lurl.hostport, timeout)
  File "/usr/lib/python2.7/site-packages/ipapython/ipautil.py", line 1291, in wait_for_open_socket
    raise e

2016-11-23T15:45:39Z DEBUG The ipa-replica-install command failed, exception: error: [Errno 111] Connection refused
2016-11-23T15:45:39Z ERROR [Errno 111] Connection refused
2016-11-23T15:45:39Z ERROR The ipa-replica-install command failed. See /var/log/ipareplica-install.log for more information

I'm launching the container with the following docker command:

sudo docker run --privileged --name freeipa-server-container -ti -h heracles.watea.com.ar --dns=192.168.10.64 --dns=192.168.10.28 -e IPA_SERVER_IP=192.168.10.64 -v /sys/fs/cgroup:/sys/fs/cgroup:ro -v /etc/hosts:/etc/hosts --tmpfs /run --tmpfs /tmp -p 53:53/udp -p 53:53 -p 80:80 -p 443:443 -p 389:389 -p 636:636 -p 88:88 -p 464:464 -p 88:88/udp -p 464:464/udp -p 123:123/udp -p 7389:7389 -p 9443:9443 -p 9444:9444 -p 9445:9445 --network host -v /var/lib/ipa-data:/data freeipa-server ipa-replica-install --no-host-dns --skip-conncheck --admin-password=Dx90puns --allow-zone-overlap

I've read that running it in privileged mode is not recommended, but if I remove that parameter, I can't launch it. Docker is running on a CentOS 7 host.

Any ideas?

Thanks!

nfuentes commented 8 years ago
  1. Can you be more specific about the "can't launch it"?
  2. What does systemctl status dirsrv@WATEA-COM-AR.service show?

1) When I run the command I posted above, it starts creating the docker container, but when it starts the replication, it fails with:

[27/43]: restarting directory server
ipa         : CRITICAL Failed to restart the directory server (Command '/bin/systemctl restart dirsrv@WATEA-COM-AR.service' returned non-zero exit status 1). See the installation log for details.

It enrolls the container in the ipa server, but fails when upgrading a client to a master replica.

2)

[root@watea /]# systemctl status dirsrv@WATEA-COM-AR.service
* dirsrv@WATEA-COM-AR.service - 389 Directory Server WATEA-COM-AR.
   Loaded: loaded (/usr/lib/systemd/system/dirsrv@.service; bad; vendor preset: disabled)
   Active: inactive (dead)

Nov 23 17:22:57 watea.com.ar ns-slapd[715]: [23/Nov/2016:17:22:57.295441976 +0000] Error: betxnpostoperation plugin referential integrity postoperation is not started
Nov 23 17:22:57 watea.com.ar ns-slapd[715]: [23/Nov/2016:17:22:57.297979285 +0000] Error: object plugin Roles Plugin is not started
Nov 23 17:22:57 watea.com.ar ns-slapd[715]: [23/Nov/2016:17:22:57.300329723 +0000] Error: preoperation plugin sudorule name uniqueness is not started
Nov 23 17:22:57 watea.com.ar ns-slapd[715]: [23/Nov/2016:17:22:57.302510509 +0000] Error: object plugin USN is not started
Nov 23 17:22:57 watea.com.ar ns-slapd[715]: [23/Nov/2016:17:22:57.304886222 +0000] Error: object plugin Views is not started
Nov 23 17:22:57 watea.com.ar ns-slapd[715]: [23/Nov/2016:17:22:57.306897292 +0000] Error: extendedop plugin whoami is not started
Nov 23 17:22:57 watea.com.ar systemd[1]: dirsrv@WATEA-COM-AR.service: Main process exited, code=exited, status=1/FAILURE
Nov 23 17:22:57 watea.com.ar systemd[1]: Failed to start 389 Directory Server WATEA-COM-AR..
Nov 23 17:22:57 watea.com.ar systemd[1]: dirsrv@WATEA-COM-AR.service: Unit entered failed state.
Nov 23 17:22:57 watea.com.ar systemd[1]: dirsrv@WATEA-COM-AR.service: Failed with result 'exit-code'.

I've executed that command from within the container

nfuentes commented 8 years ago

When I run the container without --privileged, docker shows these errors:

Failed to determine whether /sys is a mount point: Operation not permitted
Failed to determine whether /proc is a mount point: Operation not permitted
Failed to determine whether /dev is a mount point: Operation not permitted
Failed to determine whether /dev/shm is a mount point: Operation not permitted
Failed to determine whether /run is a mount point: Operation not permitted
Failed to determine whether /sys/fs/cgroup is a mount point: Operation not permitted
Failed to determine whether /sys/fs/cgroup/systemd is a mount point: Operation not permitted
[!!!!!!] Failed to mount API filesystems, freezing.
Freezing execution.

I run docker as a user with sudo rights.

adelton commented 8 years ago

When I run the container without --privileged, docker shows these errors:

Failed to determine whether /sys is a mount point: Operation not permitted
Failed to determine whether /proc is a mount point: Operation not permitted
Failed to determine whether /dev is a mount point: Operation not permitted
Failed to determine whether /dev/shm is a mount point: Operation not permitted
Failed to determine whether /run is a mount point: Operation not permitted
Failed to determine whether /sys/fs/cgroup is a mount point: Operation not permitted

This is weird. We should probably start here and investigate why systemd is not running in the container for you.

nfuentes commented 8 years ago

What else should I try?

The host OS is CentOS 7.2.

adelton commented 8 years ago

I don't know. What docker package do you use?

nfuentes commented 8 years ago

I don't know. What docker package do you use?

I'm running Docker version 1.12.3, build 6b644ec on the host, and I created the image with the default Dockerfile, so the docker image is the Fedora 24 one.

If I try another Dockerfile, will I get the same FreeIPA version? I tried that docker image because I know it has the latest version.

mmarzantowicz commented 8 years ago

Looks similar to https://github.com/adelton/docker-freeipa/issues/84#issuecomment-231694743 . Might be caused by some seccomp misconfiguration or another security mechanism.

nfuentes commented 8 years ago

I've just tried what you suggested.

If I run the container with --security-opt seccomp=unconfined, docker shows these errors:

Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/fedora-domainname.service: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/systemd-journald.service: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/fedora-readonly.service: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/proc-timer_list.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/etc-hostname.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/proc-fs.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/dev-mqueue.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/proc-sched_debug.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/proc-kcore.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/data.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/-.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/data-var-log-journal.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/proc-timer_stats.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/etc-resolv.conf.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/proc-sysrq\x2dtrigger.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/tmp.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/proc-bus.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/proc-irq.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/etc-hosts.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/proc-asound.mount: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/init.scope: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/systemd-tmpfiles-setup.service: Operation not permitted
Failed to reset devices.list on /docker/631ca2220638244481f0769eb27c8e51b7282665294ffd5fa18353d647394833/system.slice/fedora-readonly.service: Operation not permitted

adelton commented 8 years ago

But it runs nonetheless, doesn't it?

nfuentes commented 8 years ago

It doesn't fail the same way as when I don't add --privileged to the docker run command, but it doesn't start installing either freeipa-client or freeipa-server. It just hangs after those errors.

nfuentes commented 8 years ago

Does anyone have any suggestion to try to resolve this issue?

adelton commented 7 years ago

I've tried RHEL 7.2 with docker 1.12 from https://yum.dockerproject.org/repo/main/centos/7/ per https://docs.docker.com/engine/installation/linux/centos/. Merely running systemd in the container fails:

# docker run --rm -ti --tmpfs /run --tmpfs /run -e container=docker fedora:24 /usr/sbin/init
Failed to determine whether /sys is a mount point: Operation not permitted
Failed to determine whether /proc is a mount point: Operation not permitted
Failed to determine whether /dev is a mount point: Operation not permitted
Failed to determine whether /dev/shm is a mount point: Operation not permitted
Failed to determine whether /run is a mount point: Operation not permitted
Failed to determine whether /sys/fs/cgroup is a mount point: Operation not permitted
Failed to determine whether /sys/fs/cgroup/systemd is a mount point: Operation not permitted
[!!!!!!] Failed to mount API filesystems, freezing.
Freezing execution.
# docker run --rm -ti --tmpfs /run --tmpfs /run -e container=docker --security-opt seccomp=unconfined fedora:24 /usr/sbin/init
systemd 229 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
Detected virtualization docker.
Detected architecture x86-64.
Running with unpopulated /etc.

Welcome to Fedora 24 (Twenty Four)!

Set hostname to <d80e6baea9f2>.
Initializing machine ID from random generator.
Failed to populate /etc with preset unit settings, ignoring: No such file or directory
Failed to install release agent, ignoring: No such file or directory
Failed to create /docker/d80e6baea9f2e73ddfea05bd3eaf397dac5eee5687a29c47138113a5a306fb11/init.scope control group: Read-only file system
Failed to allocate manager object: Read-only file system
[!!!!!!] Failed to allocate manager object, freezing.
Freezing execution.

On the other hand, docker 1.10 from CentOS Extras per https://wiki.centos.org/Cloud/Docker works just fine:

# docker run --rm -ti --tmpfs /run --tmpfs /run -e container=docker fedora:24 /usr/sbin/init
systemd 229 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
Detected virtualization docker.
Detected architecture x86-64.

Welcome to Fedora 24 (Twenty Four)!

Set hostname to <7919989736fa>.
[  OK  ] Reached target Local File Systems.
[  OK  ] Listening on Journal Socket (/dev/log).
[  OK  ] Reached target Encrypted Volumes.
[  OK  ] Listening on /dev/initctl Compatibility Named Pipe.
[  OK  ] Reached target Swap.
[...]

So I recommend going with the docker from CentOS Extras on your CentOS installation.
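
To tell which of the two setups compared above a host is running, the version string can be parsed. This is only a sketch: docker_minor_version is a hypothetical helper name, and it assumes the "Docker version X.Y.Z, build ..." format quoted earlier in this thread.

```shell
# Sketch: extract "major.minor" from a `docker --version` style line.
# docker_minor_version is a hypothetical helper, not part of docker.
# It reads the version line on stdin, e.g.
#   Docker version 1.12.3, build 6b644ec
docker_minor_version() {
    sed -n 's/^Docker version \([0-9]*\.[0-9]*\).*/\1/p'
}

# Usage on the host (not run here):
#   docker --version | docker_minor_version
#   rpm -qf "$(command -v docker)"   # shows which RPM package owns the binary
```

The rpm -qf query is there because the version number alone does not say whether the binary came from CentOS Extras or from the dockerproject.org repository.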

aistellar commented 7 years ago

I found a workaround for this issue.

First, why does the ipa-replica-install command fail? This only happens in a privileged container. It doesn't matter if you're using docker 1.10 or later. certmonger was complaining about CA_UNCONFIGURED and could not issue a server certificate for the replica host. In /var/lib/certmonger/requests/, there is a request file. Its last line is the error message:

ca_error=Error setting up ccache for "host" service on client using default keytab: Keytab contains no suitable keys for host/example.com@EXAMPLE.COM

The principal name should be host/ipa.example.com@EXAMPLE.COM instead.
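
A quick way to pull that error line out of the request files might look like the sketch below. show_ca_errors is a hypothetical helper name; the only thing it relies on is the ca_error= line format shown above and the /var/lib/certmonger/requests/ directory mentioned in this comment.

```shell
# Sketch: print the ca_error line of every certmonger request file.
# show_ca_errors is a hypothetical helper; the directory defaults to the
# path mentioned above and can be overridden with the first argument.
show_ca_errors() {
    dir=${1:-/var/lib/certmonger/requests}
    # -H prefixes each match with the file name it came from
    grep -H '^ca_error=' "$dir"/* 2>/dev/null
}

# Usage inside the container (not run here):
#   show_ca_errors
```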

Then I found that at the end of ipa server installation, somehow the transient hostname was set to example.com!

[root@example /]# hostnamectl
   Static hostname: ipa.example.com
Transient hostname: example.com
         Icon name: computer-container
           Chassis: container
        Machine ID: ...
           Boot ID: ...
    Virtualization: docker
  Operating System: CentOS Linux 7 (Core)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 4.8.0-1-amd64
      Architecture: x86-64

I don't know exactly how this could happen. But I made a change to hostnamectl-wrapper to always skip invoking the original hostnamectl. Then the replica installation succeeded.
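
The static/transient mismatch in the hostnamectl output above can be detected with a small check like the following. It is a sketch, not something taken from the image: check_hostnames is a hypothetical helper that reads hostnamectl-style output on stdin, and it assumes the "Static hostname:" and "Transient hostname:" labels shown above.

```shell
# Sketch: compare the static and transient hostnames reported by hostnamectl.
# check_hostnames is a hypothetical helper. It prints one summary line and
# returns non-zero when a transient hostname is present and differs from the
# static one. A missing "Transient hostname:" line is treated as no mismatch.
check_hostnames() {
    awk -F': *' '
        /^ *Static hostname:/    { static = $2 }
        /^ *Transient hostname:/ { transient = $2 }
        END {
            if (transient != "" && transient != static) {
                print "MISMATCH static=" static " transient=" transient
                exit 1
            }
            print "OK static=" static
        }'
}

# Usage inside the container (not run here):
#   hostnamectl | check_hostnames
```

Running it before ipa-replica-install would show whether the transient hostname has already drifted from the name the host keytab was issued for.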

adelton commented 7 years ago

I'd consider such a change for hostnamectl-wrapper but I'd much rather people stopped running FreeIPA server containers as privileged. If there are some issues that prevent running the containers as unprivileged, please file them so that they can be investigated. But if things only fail when running as privileged, I'd actually consider it a good thing.

aistellar commented 7 years ago

For those of us stuck with the latest stable version of docker, running FreeIPA in a privileged container may be the only option for now. I tried running it in a less privileged mode with the docker option "--cap-add SYS_ADMIN", but I couldn't get YubiKey provisioning working.

adelton commented 7 years ago

Please file that as a separate issue with exact information about the versions of the OS, docker, and the image, the commands you use, plus anything else that might help us reproduce and investigate the issue.