freeipa / freeipa-container

FreeIPA server in containers — images at https://quay.io/repository/freeipa/freeipa-server?tab=tags
Apache License 2.0

Support for FreeIPA v. 4.6 #157

Closed zultron closed 6 years ago

zultron commented 7 years ago

It looks like the most recent FreeIPA version is in the fedora-25 tag, v. 4.4.4. When collaborating with the upstream project, I'm often asked to upgrade to the latest version 4.5 to reproduce a bug.

Is there any plan to update the containers?

I started a branch with a new f26 Dockerfile pointing at the official freeipa-4-5 COPR. Of course the upgrade is non-trivial, and while I've solved a few initial problems, the install can still fail nondeterministically in a few places during ipa-server-install.

adelton commented 7 years ago

There are Dockerfile.fedora-rawhide and Dockerfile.fedora-25-master-nightly in the repo, which can be used (docker build -f Dockerfile.that-you-need).

Unfortunately, at this point even Fedora 26-based containers (built via Dockerfile.fedora-26) fail with things like

  [45/47]: activating extdom plugin
  [46/47]: tuning directory server
  [47/47]: configuring directory to start on boot
Done configuring directory server (dirsrv).
Configuring certificate server (pki-tomcatd). Estimated time: 3 minutes 30 seconds
  [1/31]: creating certificate server user
  [2/31]: configuring certificate server instance
ipa.ipaserver.install.cainstance.CAInstance: CRITICAL Failed to configure CA instance: Command '/usr/sbin/pkispawn -s CA -f /tmp/tmphw_qii' returned non-zero exit status 1
ipa.ipaserver.install.cainstance.CAInstance: CRITICAL See the installation logs and the following files/directories for more information:
ipa.ipaserver.install.cainstance.CAInstance: CRITICAL   /var/log/pki/pki-tomcat
  [error] RuntimeError: CA configuration failed.
ipa.ipapython.install.cli.install_tool(Server): ERROR    CA configuration failed.
ipa.ipapython.install.cli.install_tool(Server): ERROR    The ipa-server-install command failed. See /var/log/ipaserver-install.log for more information
FreeIPA server configuration failed.

so newer versions might have even worse stability issues.

abbra commented 7 years ago

Note that in Rawhide and F27 Branched you should be OK once FreeIPA 4.6.0-2 packages reach them. You'd most likely need to rebuild your base docker images. Check the following Bodhi update, which is currently in pending state: https://bodhi.fedoraproject.org/updates/FEDORA-2017-a79e85e4d3

stlaz commented 7 years ago

Please note that FreeIPA 4.5 was never released into Fedora so it won't be available in containers. Fedora 27 container will have FreeIPA 4.6, though. @adelton is right that Fedora 26 container is currently failing and while I tried to figure out the issue, I was not able to do that... yet.

zultron commented 7 years ago

I've also seen the same non-deterministic behavior that @adelton and (I assume) @stlaz are seeing. I just filed a PR against FreeIPA 4.5 upping the dbus timeouts in a couple of places that often fail for me.

In the meantime, my freeipa-4.5 branch is updated to monkey-patch the system-installed FreeIPA python sources with those same changes; see the (currently) top commit, "Workaround installer failures for f26, FreeIPA 4.5", for f25 only.

There's a bunch of other stuff in that branch; if anything looks useful, I'll be happy to pick it out into a PR:

stlaz commented 7 years ago

@zultron Thanks for your continuous contributions, I very much appreciate it.

I'll put on my container hat tomorrow and will go through both your PRs (freeipa + freeipa-container), will try and test them, and will check whether we may adopt some of the changes in your git repo 👍

zultron commented 7 years ago

PR #158 submitted to fix /etc/dirsrv/schema relocation.

zultron commented 7 years ago

During ipa-server-install, those initial dbus timeouts seem to be resolved for me. However, the final ipa-client-install step now intermittently fails authenticating with LDAP; there are krb5kdc log messages like DISPATCH: repeated (retransmitted?) request from [IP], resending previous response; I'll have to capture and report more details next time.

During ipa-replica-install, a new failure. From ipa-server-configure-first.log:

Configuring directory server (dirsrv)
  [1/3]: configuring TLS for DS instance
  [error] RuntimeError: Certificate issuance failed (CA_UNCONFIGURED)

From ipareplica-install.log:

2017-09-14T05:16:32Z DEBUG args=/usr/bin/certutil -d /etc/dirsrv/slapd-ZULTRON-COM/ -A -n ZULTRON.COM IPA CA -t CT,C,C -a -f /etc/dirsrv/slapd-ZULTRON-COM/pwdfile.txt
2017-09-14T05:16:32Z DEBUG Process finished, return code=0
2017-09-14T05:16:32Z DEBUG stdout=
2017-09-14T05:16:32Z DEBUG stderr=
2017-09-14T05:16:32Z DEBUG certmonger request is in state dbus.String(u'NEWLY_ADDED_READING_KEYINFO', variant_level=1)
2017-09-14T05:16:37Z DEBUG certmonger request is in state dbus.String(u'CA_UNCONFIGURED', variant_level=1)
2017-09-14T05:16:37Z DEBUG Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 504, in start_creation
    run_step(full_msg, method)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 494, in run_step
    method()
  File "/usr/lib/python2.7/site-packages/ipaserver/install/dsinstance.py", line 824, in __enable_ssl
    post_command=cmd)
  File "/usr/lib/python2.7/site-packages/ipalib/install/certmonger.py", line 317, in request_and_wait_for_cert
    raise RuntimeError("Certificate issuance failed ({})".format(state))
RuntimeError: Certificate issuance failed (CA_UNCONFIGURED)
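
When comparing several failed runs, it can help to pull just the certmonger state transitions out of ipareplica-install.log. The following is a minimal sketch, assuming the log lines keep the format shown above; the names `STATE_RE`, `SAMPLE_LOG`, and `certmonger_states` are hypothetical helpers for illustration and are not part of FreeIPA.

```python
import re

# Assumed line format, taken from the excerpt above:
#   <timestamp> DEBUG certmonger request is in state dbus.String(u'<STATE>', variant_level=1)
STATE_RE = re.compile(
    r"^(?P<ts>\S+) DEBUG certmonger request is in state "
    r"dbus\.String\(u?'(?P<state>[A-Z_]+)'"
)

# Two lines copied from the log excerpt, plus one unrelated line.
SAMPLE_LOG = [
    "2017-09-14T05:16:32Z DEBUG stderr=",
    "2017-09-14T05:16:32Z DEBUG certmonger request is in state "
    "dbus.String(u'NEWLY_ADDED_READING_KEYINFO', variant_level=1)",
    "2017-09-14T05:16:37Z DEBUG certmonger request is in state "
    "dbus.String(u'CA_UNCONFIGURED', variant_level=1)",
]


def certmonger_states(lines):
    """Return (timestamp, state) pairs in the order they appear in the log."""
    states = []
    for line in lines:
        match = STATE_RE.match(line)
        if match:
            states.append((match.group("ts"), match.group("state")))
    return states


if __name__ == "__main__":
    for timestamp, state in certmonger_states(SAMPLE_LOG):
        print(timestamp, state)
```

To run it against a real log, pass an open file object instead of `SAMPLE_LOG`; iterating over a file yields its lines.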

At this point, I'm not sure if it's another race condition or if it's my unfamiliarity with FreeIPA 4.5. The ipa-replica-install-options file is unchanged from the FreeIPA 4.4.0 file that worked with the centos-7 container:

--unattended
--principal=admin
--admin-password=supersecret
--server=host1.example.com
--hostname=host2.example.com
--realm=EXAMPLE.COM
--domain=example.com
--setup-ca
--setup-dns
--no-forwarders
--auto-reverse
--no-host-dns
--skip-conncheck
--no-ntp
--no-ui-redirect
--allow-zone-overlap

adelton commented 7 years ago

Any explanation why the D-Bus timeouts would be seen in the container setup and not on the host?

zultron commented 7 years ago

@adelton, the timeouts addressed in https://github.com/freeipa/freeipa/pull/1078 come from running on a memory-constrained system, not from running in a container per se. If there's any swapping going on as dogtag starts up, the default 25-second timeout halts the installation.
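
To illustrate why a fixed timeout is fragile on a slow host, here is a minimal sketch of a polling wait with a configurable deadline. It is not the change made in the PR above and does not use the D-Bus API; `wait_for` and its parameters are hypothetical names, and only the 25-second default is taken from the comment.

```python
import time


def wait_for(check, timeout=25.0, interval=0.5,
             clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns a truthy value or the deadline passes.

    The clock and sleep arguments are injectable so the behavior can be
    exercised without actually waiting.
    """
    deadline = clock() + timeout
    while True:
        result = check()
        if result:
            return result
        if clock() >= deadline:
            raise TimeoutError(
                "condition not met within {} seconds".format(timeout)
            )
        sleep(interval)
```

On a machine that is swapping, a caller would pass a larger `timeout` rather than rely on the default, which is the general idea behind raising the timeouts.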

adelton commented 7 years ago

IIRC, @stlaz found some issue with keyring which affected the dogtag startup in containers. So the timeout tweaks should not be needed.

stlaz commented 7 years ago

Indeed. It seems this might be a regression in systemd-233. There's a Bugzilla now: https://bugzilla.redhat.com/show_bug.cgi?id=1492081

Some related discussions: https://github.com/systemd/systemd/issues/6281 https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1691096

zultron commented 7 years ago

On 09/18/2017 02:23 AM, Jan Pazdziora wrote:

> IIRC, @stlaz found some issue with keyring which affected the dogtag startup in containers. So the timeout tweaks should not be needed.

Yes he did. Initially I thought this was the same bug, since it happened around the same place in the installer, but it turns out not to be. My apologies for the confusion.

The timeouts I reported are here, and unfortunately won't be fixed with the dogtag issue @stlaz is working on: https://github.com/freeipa/freeipa/pull/1078

zultron commented 6 years ago

[After @stlaz's comment above, I renamed the issue to "... 4.6".]

I'm returning to this project after some time, but it looks like familiar issues are preventing FreeIPA from running in the 4.6 container:

2018-06-27T02:56:15Z DEBUG stderr=pkispawn    : ERROR    ....... subprocess.CalledProcessError:  Command '['sysctl', 'crypto.fips_enabled)', '-bn']' returned non-zero exit status 255!

Google shows this is a known issue at least amongst Ubuntu users, and it will be for Container Linux users as well.

If the other participants on this issue think we've stalled out here and wish to close it, that's ok with me.

adelton commented 6 years ago

Which image is this?

zultron commented 6 years ago

Sorry, the fedora-27 image with FreeIPA 4.6.

The Fedora-26 image still has FreeIPA 4.4.4, IIRC.

zultron commented 6 years ago

I went ahead and filed issues on Pagure for FreeIPA and Dogtag PKI for the /proc/sys/crypto issue, since this isn't a problem with the container per se, and is reported to affect LXC container installs, too.
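
For readers unfamiliar with the failure, the sketch below shows one way a FIPS check can tolerate an environment where the kernel setting is not exposed. It is an illustration only, not the patch that was filed; the function name `fips_enabled` is hypothetical, and it assumes the setting normally appears as the file /proc/sys/crypto/fips_enabled containing 0 or 1.

```python
def fips_enabled(path="/proc/sys/crypto/fips_enabled"):
    """Return True only if the kernel reports FIPS mode as enabled.

    Assumption: a missing or unreadable file means FIPS mode is off,
    which avoids aborting when the file is absent (for example in some
    container environments).
    """
    try:
        with open(path) as handle:
            return handle.read().strip() == "1"
    except OSError:
        # Covers FileNotFoundError and PermissionError.
        return False
```

Reading the file directly avoids depending on the exit status of the sysctl command seen in the pkispawn error earlier in this thread.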

zultron commented 6 years ago

I ended up filing a PR for the /proc/sys/crypto issue as well, already merged.

I haven't investigated the next failure carefully yet, also during the "configuring certificate server instance" step. I'd appreciate a quick look. (The "Property internaldb.ldapconn.port missing value" error is a red herring.) These logs are from a custom Dockerfile.fedora-27-copr that basically adds my FreeIPA COPR to Dockerfile.fedora-27, on Docker hub as zultron/freeipa-container:fedora-27-freeipa-4-6.

That COPR builds Dogtag PKI with the above PR cherry-picked, and FreeIPA 4.6 with my dbus client timeout patch cherry-picked from master.

pki-tomcat.ca.debug.log pki-tomcat.ca.system.log ipaserver-install.log ipa-server-install-options.txt ipa.service.txt

zultron commented 6 years ago

After a closer look, I'm not sure the "Property internaldb.ldapconn.port missing value" error is a red herring after all. I don't see any other errors, and the final one didn't have the "Swallow exception in pre-op mode" log message. Here's a full systemd journal.

systemd_journal.txt

adelton commented 6 years ago

@zutron, Fedora 27 and Fedora 28 images (Dockerfiles*) seem to be stable in our tests, and I've fixed rawhide recently. Do they work for you? You've done a great job bringing the issues you've hit to the respective upstreams, so I wonder if there is anything needed on the container side?

zutron commented 6 years ago

@adelton Had me a little confused there with your typo! @zultron

adelton commented 6 years ago

Oops, sorry about that.

zultron commented 6 years ago

@adelton, although I haven't succeeded in running the 4.6-based images in a plain CoreOS Docker container yet, I have no reason to believe it's because of this project. I'm trying a new approach in my own FreeIPA project using Atomic instead of CoreOS, and running in Kubernetes instead of plain Docker. That should be a better-proven environment to run FreeIPA.

Accordingly, I'm closing this issue. I really appreciate your and your team's support during this odyssey of mine through uncharted territory!

adelton commented 6 years ago

Thanks for the info. In general it seems useful to unwrap the layers used in the setup one by one, so if you will use Docker underneath Kubernetes (and not for example CRI-O), you might hit the Docker-related issues anyway. ;-)

LorbusChris commented 6 years ago

Shouldn't this issue stay open anyway till it is resolved? IMO that way it's easier to track.

adelton commented 6 years ago

@LorbusChris, an open issue is an indication for someone in the community (for example for me) to investigate something or attempt to fix something in this project (containerization of FreeIPA). What is it exactly that needs to be addressed in this project?

zultron commented 6 years ago

@LorbusChris Maybe I misunderstood something. Others have been reporting FreeIPA 4.6 does run in Docker, so I assumed this problem is something on my end at this point. For example, @adelton's comment says the F27 and F28 containers work in their testing, and the Fedora package database shows FreeIPA v. 4.6 packages in F27 and F28.

@adelton Can you confirm that your tests running FreeIPA 4.6 in Docker do succeed?

adelton commented 6 years ago

@zultron, yes, I have containers built from Dockerfile.fedora-27, Dockerfile.fedora-28, and Dockerfile.fedora-rawhide running in my tests. Whatever freeipa-server package versions are there, they get installed and run. I hope to add docker run steps to Travis CI in some reasonable timeframe to show the working setup on Travis' Ubuntu as well.

LorbusChris commented 6 years ago

@zultron @adelton then I think this was a misunderstanding on my side, so nvmd :)