freeipa / freeipa-container

FreeIPA server in containers — images at https://quay.io/repository/freeipa/freeipa-server?tab=tags
Apache License 2.0

Can't upgrade from 4.9.2 to 4.9.6 #417

Closed k-s-dean closed 3 years ago

k-s-dean commented 3 years ago

Hi There,

I have been trying to upgrade my FreeIPA installation from 4.9.2 to 4.9.6, but I have been running into issues.

Environment:

* Host
  * BaseOS: Ubuntu 20.04 LTS
  * Kernel: 5.4.0-72-generic
  * Docker version: 19.03.12
* Old image
  * FreeIPA build: 4.9.2 from 4 months ago (works)
  * Container image: Fedora 33
  * Repo cloned directly from here
* New image
  * FreeIPA build: 4.9.6
  * Container image: both Fedora 33 and Fedora 34 have been built
  * Repo updated to the latest code with git pull

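Issues:

* When booting the new image I instantly run into this
  * FreeIPA server is already configured but with different version, volume update.
* After debugging the issue it appears that now for the container to run successfully I must supply `privileged:true` in my docker-compose file. Once I have done this FreeIPA begins the upgrade.
* However, after trying to log in after the successful upgrade I'm faced with
  * `“Login failed due to an unknown reason.”` at the FreeIPA web interface
* Looking at the httpd/apache2 logs I see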

[Mon Aug 16 18:51:59.620641 2021] [:warn] [pid 254:tid 295] [client 10.8.0.9:40474] KRB5CCNAME file (/run/ipa/ccaches/<USER>@<DOMAIN>-PFME92) lookup failed!, referer: https://auth01.<DOMAIN>/ipa/ui/
[Mon Aug 16 18:52:04.288357 2021] [auth_gssapi:error] [pid 399:tid 412] [client 127.0.0.1:60342] GSS ERROR gss_acquire_cred[_from]() failed to get server creds: [Unspecified GSS failure.  Minor code may provide more information ( SPNEGO cannot find mechanisms to negotiate)]
[Mon Aug 16 18:52:04.288357 2021] [auth_gssapi:error] [pid 399:tid 412] [client 127.0.0.1:60342] GSS ERROR gss_acquire_cred[_from]() failed to get server creds: [Unspecified GSS failure.  Minor code may provide more information ( SPNEGO cannot find mechanisms to negotiate)]
[Mon Aug 16 18:52:04.289615 2021] [wsgi:error] [pid 253:tid 492] [remote 10.8.0.9:40472] ipa: INFO: 401 Unauthorized: No session cookie found
[Mon Aug 16 18:54:31.378132 2021] [wsgi:error] [pid 251:tid 486] [remote 10.8.0.9:40480] ipa: INFO: [jsonserver_i18n_messages] UNKNOWN: i18n_messages(version='2.242'): SUCCESS


* I've tested ldapsearch and kinit; both work without issue, but I'm no longer able to log in.

I've since reverted the system to its previous state, as it's currently being used by a number of services I have running. However, I have the following questions regarding the changes to FreeIPA since the last time I upgraded.

1. What has changed between my current working version, which does not require `privileged: true`, and the new container build, which now requires `privileged: true`?
2. Googling the GSS failure above, I'm seeing reports that this has to do with the httpd keytab, but I'm not sure.
3. Is there anywhere else I can look to try to pinpoint the issue, or to provide further information?

Any help would be appreciated. 

Kind regards, 
Kyle

adelton commented 3 years ago

> Issues:
>
> * When booting the new image I instantly run into this
>   * FreeIPA server is already configured but with different version, volume update.

This is not an issue -- it's an expected message indicating that the image has changed and the container needs to do some extra housekeeping to match the new image.

> * After debugging the issue it appears that now for the container to run successfully I must supply `privileged:true` in my docker-compose file. Once I have done this FreeIPA begins the upgrade.

No, never use privileged: true. Where did you find advice to use that?

What is the actual problem that you hit? What was the error message?

Can you run

tests/run-partial-tests.sh Dockerfile.fedora-33

or

replica=none tests/run-master-and-replica.sh <the-new-image>

to test behaviour of the new image?

> * However, after trying to log in after the successful upgrade I'm faced with
>   * `“Login failed due to an unknown reason.”` at the FreeIPA web interface
> * Looking at the httpd/apache2 logs I see
>
> [Mon Aug 16 18:51:59.620641 2021] [:warn] [pid 254:tid 295] [client 10.8.0.9:40474] KRB5CCNAME file (/run/ipa/ccaches/<USER>@<DOMAIN>-PFME92) lookup failed!, referer: https://auth01.<DOMAIN>/ipa/ui/
> [Mon Aug 16 18:52:04.288357 2021] [auth_gssapi:error] [pid 399:tid 412] [client 127.0.0.1:60342] GSS ERROR gss_acquire_cred[_from]() failed to get server creds: [Unspecified GSS failure.  Minor code may provide more information ( SPNEGO cannot find mechanisms to negotiate)]
> [Mon Aug 16 18:52:04.288357 2021] [auth_gssapi:error] [pid 399:tid 412] [client 127.0.0.1:60342] GSS ERROR gss_acquire_cred[_from]() failed to get server creds: [Unspecified GSS failure.  Minor code may provide more information ( SPNEGO cannot find mechanisms to negotiate)]
> [Mon Aug 16 18:52:04.289615 2021] [wsgi:error] [pid 253:tid 492] [remote 10.8.0.9:40472] ipa: INFO: 401 Unauthorized: No session cookie found
> [Mon Aug 16 18:54:31.378132 2021] [wsgi:error] [pid 251:tid 486] [remote 10.8.0.9:40480] ipa: INFO: [jsonserver_i18n_messages] UNKNOWN: i18n_messages(version='2.242'): SUCCESS

Yep, I'd consider it expected with privileged containers. Don't use them.

> 1. What has changed from my current working version that does not require `privileged: true` to the container build that now requires `privileged: true`?

Nothing requires privileged. A newer glibc might need some new syscalls, so the container runtime's seccomp policy might need an update (although it's odd that you'd see that while staying on the same OS image, Fedora 33). Disabling seccomp might be a temporary measure, but using privileged is nearly never the right answer.
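As a rough illustration of the temporary measure mentioned above, the sketch below starts the container with Docker's `--security-opt seccomp=unconfined` option instead of running it privileged. The remaining `docker run` options are copied from the test-suite command line quoted later in this thread. The function name, container name, and arguments are hypothetical, and whether seccomp is actually the cause here is unconfirmed.

```shell
# Hypothetical helper: start the FreeIPA container with seccomp filtering
# disabled, as a temporary diagnostic step only. It deliberately does not
# pass --privileged.
run_without_seccomp() {
  name="$1"      # container name (illustrative)
  image="$2"     # image tag, for example the freshly built one
  data_dir="$3"  # host directory that holds the /data volume
  docker run -d --name "$name" \
    --security-opt seccomp=unconfined \
    -h ipa.example.test \
    -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
    -v "$data_dir:/data:Z" \
    "$image"
}
```

A call such as `run_without_seccomp freeipa-test freeipa:latest2 /srv/freeipa-data` (all three values are placeholders) would then show whether the container gets further with seccomp out of the way. If it does, the better long-term fix is likely an updated seccomp profile or container runtime rather than leaving the filter disabled.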

> 2. Googling the GSS failure above, I'm seeing reports about this being to do with httpd keytab? But I'm not sure.

Right. Don't use privileged.

> 3. Is there anywhere else I can look to try and pinpoint the issue, or provide further information?

https://github.com/freeipa/freeipa-container#debugging

k-s-dean commented 3 years ago

Hi Adelton,

Thanks for getting back to me.

Just to emphasise, I totally agree that the container should not be running as privileged.

> > Issues:
> >
> > * When booting the new image I instantly run into this
> >   * FreeIPA server is already configured but with different version, volume update.
>
> This is not an issue -- it's an expected message indicating that the image has changed and the container needs to do some extra housekeeping to match the new image.

I can see the message being printed and I agree it's not an error. The reason I specified that particular text is that the container exits straight after that message. Sorry, I should have provided my debug log (see below).

When I run the container as privileged, the upgrade starts and completes. If I don't run the container as privileged, the container just constantly restarts and never gets past this point: `exec /usr/sbin/init --show-status=false --unit=ipa-server-upgrade.service`

+ echo 'FreeIPA server is already configured but with different version, volume update.'
FreeIPA server is already configured but with different version, volume update.
FreeIPA server is already configured but with different version, volume update.
+ echo 'FreeIPA server is already configured but with different version, volume update.'
+ echo 'Tue Aug 17 10:22:39 UTC 2021 /usr/local/sbin/init '
+ SHOW_LOG=1
+ '[' 1 == 1 ']'
+ for i in /var/log/ipa-server-configure-first.log /var/log/ipa-server-run.log
+ '[' -f /var/log/ipa-server-configure-first.log ']'
+ for i in /var/log/ipa-server-configure-first.log /var/log/ipa-server-run.log
+ '[' -f /var/log/ipa-server-run.log ']'
+ trap '' SIGHUP
+ tail --silent -n 0 -f --retry /var/log/ipa-server-configure-first.log /var/log/ipa-server-run.log
+ '[' -n '' ']'
+ exec /usr/sbin/init --show-status=false --unit=ipa-server-upgrade.service
<<<< EXIT HERE NO ERROR

To run the container as non-privileged, is the fix for this to disable cgroups v2? I've seen that in the documentation. There are literally zero errors, and I'm currently running my existing installation non-privileged, which is why I was asking what's changed.
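Before changing any cgroup settings, it may help to confirm which cgroup hierarchy the host actually uses. The sketch below relies on the common check of reading the filesystem type mounted at `/sys/fs/cgroup`, which is `cgroup2fs` on a unified (v2) host and `tmpfs` on a v1 host. It assumes a Linux host with GNU `stat`, and the function names are made up for this example.

```shell
# Map the filesystem type of /sys/fs/cgroup to a cgroup hierarchy version.
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo v2 ;;       # unified hierarchy
    tmpfs)     echo v1 ;;       # legacy or hybrid hierarchy
    *)         echo unknown ;;  # anything else, including an empty value
  esac
}

# Report the cgroup version of the current host (GNU stat assumed).
host_cgroup_version() {
  cgroup_version_from_fstype "$(stat -fc %T /sys/fs/cgroup 2>/dev/null)"
}
```

Running `host_cgroup_version` on the Docker host prints `v1`, `v2`, or `unknown`. If it reports `v1`, cgroups v2 is unlikely to explain the silent exit.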

Kind regards,

Kyle

adelton commented 3 years ago

> exec /usr/sbin/init --show-status=false --unit=ipa-server-upgrade.service <<<< EXIT HERE NO ERROR

Can you add `-e DEBUG_NO_EXIT=1 -e DEBUG_TRACE=1` to keep the container running so that you can investigate it some more?
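A minimal sketch of what that could look like with plain `docker run`, reusing the volume and hostname options from the test-suite command line quoted later in this thread. The two environment variables are the ones named above; the function names, container name, and paths are hypothetical.

```shell
# Hypothetical helper: start the container with the two debugging
# variables suggested above passed via -e.
run_with_debug() {
  name="$1"      # container name (illustrative)
  image="$2"     # image tag to test
  data_dir="$3"  # host directory that holds the /data volume
  docker run -d --name "$name" \
    -e DEBUG_NO_EXIT=1 -e DEBUG_TRACE=1 \
    -h ipa.example.test \
    -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
    -v "$data_dir:/data:Z" \
    "$image"
}

# Open an interactive shell in the still-running container.
enter_container() {
  docker exec -ti "$1" bash
}
```

With docker-compose, the same two variables would go under the service's `environment:` key. Once the container is up, `enter_container <name>` gives a shell inside it for looking at the journal and the logs under `/data/var/log`.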

> To run the container as non-privileged, is the fix for this to disable cgroups v2? I've seen that in the documentation. There are literally zero errors, and I'm currently running my existing installation non-privileged, which is why I was asking what's changed.

You should start by testing with tests/run-partial-tests.sh -- that should give us an indication of what still works and what fails (systemd, specific service, etc).

k-s-dean commented 3 years ago

Thanks for your help. It seems I've now resolved the issue by rebuilding the container. Somehow it must have got corrupted.

After running your test suite, I got the following, which verified there was a problem with the container.

replica=none tests/run-master-and-replica.sh freeipa:latest2
+ umask 0007
+ docker=docker
+ sudo=sudo
+ BASE=ipa1
+ VOLUME=/tmp/freeipa-test-4622/data
+ IMAGE=freeipa:latest2
+ readonly_run=
++ id -u
+ '[' 0 '!=' 0 -a docker == podman -a none '!=' none ']'
+ '[' '' == --read-only ']'
+ '[' -f /tmp/freeipa-test-4622/data/build-id ']'
+ dns_opts='--auto-reverse --allow-zone-overlap'
+ '[' none = none ']'
+ dns_opts=
++ id -u
+ '[' 0 '!=' 0 -a docker == podman -a none '!=' none ']'
+ run_ipa_container freeipa:latest2 freeipa-master exit-on-finished -U -r EXAMPLE.TEST --setup-dns --no-forwarders --no-ntp
+ set +x
Tue Aug 17 11:55:35 UTC 2021
+ umask 0
+ docker run -d --name freeipa-master -v /sys/fs/cgroup:/sys/fs/cgroup:ro --sysctl net.ipv6.conf.all.disable_ipv6=0 -h ipa.example.test -v /tmp/freeipa-test-4622/data:/data:Z -e PASSWORD=Secret123 freeipa:latest2 exit-on-finished -U -r EXAMPLE.TEST --setup-dns --no-forwarders --no-ntp
80876b9602791c4f222f5e3a2c9a8182cad3387e621d16c0771fe23235fcf3ea
The container has exited with .State.ExitCode [255].
Tue Aug 17 11:55:48 UTC 2021

The test that builds the container from scratch, by contrast, completed without issue.

Password for admin@EXAMPLE.TEST: 
----------------
Added user "bob"
----------------
  User login: bob
  First name: Bob
  Last name: Nowak
  Full name: Bob Nowak
  Display name: Bob Nowak
  Initials: BN
  Home directory: /home/bob
  GECOS: Bob Nowak
  Login shell: /bin/sh
  Principal name: bob@EXAMPLE.TEST
  Principal alias: bob@EXAMPLE.TEST
  Email address: bob@example.test
  UID: 278600003
  GID: 278600003
  Password: False
  Member of groups: ipausers
  Kerberos keys available: False
uid=278600003(bob) gid=278600003(bob) groups=278600003(bob)
-rw-------  1 root root            3916246 Aug 17 11:54 /data/var/log/ipaserver-install.log
-rw-r-----+ 1 root systemd-journal 8388608 Aug 17 11:54 /data/var/log/journal/3f9d9c7d6f5b462081a3ddef08250c8d/system.journal
OK tests/systemd-container-ipa-server-install-data.sh.
OK tests/run-partial-tests.sh.

I'll add these commands to my troubleshooting document; they will come in handy in the future.
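For such a troubleshooting document, a small follow-up check might be useful when a run only reports `.State.ExitCode [255]` as above. The sketch below reads the exit code Docker recorded and, for a non-zero code, prints the tail of the container's output. `docker inspect --format` and `docker logs --tail` are standard Docker commands; the function names and the 50-line limit are arbitrary choices for this example.

```shell
# Print the exit code Docker recorded for a container.
container_exit_code() {
  docker inspect --format '{{.State.ExitCode}}' "$1"
}

# Show the exit code and, if it is non-zero, the last lines of output.
diagnose_container() {
  name="$1"
  code="$(container_exit_code "$name")" || return 1
  echo "exit code: $code"
  if [ "$code" != "0" ]; then
    docker logs --tail 50 "$name" 2>&1
  fi
}
```

For the failed test run above, the call would be `diagnose_container freeipa-master`, provided the test script has not already removed the container.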

Closing issue