SSSD / sssd-test-suite

Setup virtual environment for testing SSSD against LDAP, IPA and Active Directory servers.

Provisioning failure with ipa guest #21

Closed justin-stephenson closed 2 years ago

justin-stephenson commented 3 years ago

I am trying to redeploy my test suite with an IPA server and a client, but I see the following error. Any suggestions for further troubleshooting are appreciated.

$ ./sssd-test-suite provision enroll client ipa
[sssd-test-suite] [enroll] [1/2] Start Guest Machines
Bringing machine 'client' up with 'libvirt' provider...
Bringing machine 'ipa' up with 'libvirt' provider...
==> client: Checking if box 'sssd-vagrant/fedora34-client' version '20210324.01' is up to date...
==> ipa: Checking if box 'sssd-vagrant/fedora34-ipa' version '20210324.01' is up to date...
==> client: Machine already provisioned. Run `vagrant provision` or use the `--provision`
==> client: flag to force provisioning. Provisioners marked to run always will still run.
==> ipa: Machine already provisioned. Run `vagrant provision` or use the `--provision`
==> ipa: flag to force provisioning. Provisioners marked to run always will still run.
[sssd-test-suite] [enroll] [2/2] Enroll Machines
BECOME password: 

PLAY [ipa] *******************************************************************************************************************************************************************************************************************************************************************************************************************

TASK [enroll-ipa : Create /shared/enrollment/ipa directory] ******************************************************************************************************************************************************************************************************************************************************************
ok: [ipa]

TASK [enroll-ipa : Copy certificate to shared folder] ************************************************************************************************************************************************************************************************************************************************************************
fatal: [ipa]: FAILED! => changed=false 
  msg: Source /etc/ipa/ca.crt not found

PLAY RECAP *******************************************************************************************************************************************************************************************************************************************************************************************************************
ipa                        : ok=1    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   

[sssd-test-suite] [enroll] ERROR ShellCommandError: Command returned non-zero status code: 2
[sssd-test-suite] [enroll] Finished with error ShellCommandError: Command returned non-zero status code: 2
[sssd-test-suite] The following command exited with: 2
[sssd-test-suite] [shell] Working directory: /home/justin/github/sssd-test-suite
[sssd-test-suite] [shell] Environment: ANSIBLE_CONFIG='/home/justin/github/sssd-test-suite/provision/ansible.cfg'
[sssd-test-suite] [shell] Command: ['ansible-playbook', '--limit', 'client,ipa,localhost', '--skip-tags', 'enroll-ldap,enroll-ad,enroll-ad-child', '--ask-become-pass', '/home/justin/github/sssd-test-suite/provision/enroll.yml']
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/nutcli/runner.py", line 240, in execute
    return self._call_actor(args.func, args, shell)
  File "/usr/local/lib/python3.8/site-packages/nutcli/runner.py", line 282, in _call_actor
    return actor(**actor._filter_parser_args(args))
  File "/home/justin/github/sssd-test-suite/cli/commands/provision.py", line 184, in __call__
    TaskList('enroll', logger=self.logger)([
  File "/usr/local/lib/python3.8/site-packages/nutcli/tasks.py", line 187, in execute
    self.__real_handler(kwargs)(*real_args, **real_kwargs)
  File "/usr/local/lib/python3.8/site-packages/nutcli/tasks.py", line 401, in _run_tasks
    raise error.with_traceback(error_info[2])
  File "/usr/local/lib/python3.8/site-packages/nutcli/tasks.py", line 383, in _run_tasks
    task.execute(parent=self)
  File "/usr/local/lib/python3.8/site-packages/nutcli/tasks.py", line 187, in execute
    self.__real_handler(kwargs)(*real_args, **real_kwargs)
  File "/home/justin/github/sssd-test-suite/cli/commands/provision.py", line 207, in enroll
    self._exec_ansible(
  File "/home/justin/github/sssd-test-suite/cli/commands/provision.py", line 49, in _exec_ansible
    return self.shell(['ansible-playbook', *args], env=env)
  File "/usr/local/lib/python3.8/site-packages/nutcli/shell.py", line 197, in __call__
    raise ShellCommandError(
nutcli.shell.ShellCommandError: Command returned non-zero status code: 2

If I ssh into the IPA guest, I do not see any /etc/ipa/ca.crt file. The guest also cannot reach the internet.

$ ./sssd-test-suite ssh ipa
Last login: Wed Mar 31 20:12:12 2021 from 192.168.100.1
[systemd]
Failed Units: 1
  ipa.service
[vagrant@master.ipa.vm ~]$ sudo su - 
Last login: Wed Mar 31 20:03:04 UTC 2021 on pts/0
[systemd]
Failed Units: 1
  ipa.service
[root@master.ipa.vm ~]# ll /etc/ipa/
total 20
drwx------. 2 root root 4096 Mar 24 11:41 custodia
drwxr-xr-x. 2 root root 4096 Mar 24 11:49 dnssec
drwxr-xr-x. 2 root root 4096 Mar 24 11:35 html
drwxr-xr-x. 2 root root 4096 Mar 24 11:48 kdcproxy
drwxr-xr-x. 2 root root 4096 Mar 24 11:49 nssdb
[root@master.ipa.vm ~]# ll /etc/ipa/ca.crt
ls: cannot access '/etc/ipa/ca.crt': No such file or directory
[root@master.ipa.vm ~]# ping google.com
ping: google.com: Name or service not known

Trying to provision the guest fails due to a network issue. It's not clear to me whether this is the root cause or a symptom.

$ ./sssd-test-suite provision guest ipa

PLAY [local:linux] ***********************************************************************************************************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *******************************************************************************************************************************************************************************************************************************************************************************************************
ok: [ipa]

TASK [python : Python interpreter] *******************************************************************************************************************************************************************************************************************************************************************************************
ok: [ipa] => 
  msg: /usr/bin/python3 (3.9.2)

PLAY [ipa:ldap:client] *******************************************************************************************************************************************************************************************************************************************************************************************************

TASK [Upgrade all packages to their latest version] **************************************************************************************************************************************************************************************************************************************************************************
FAILED - RETRYING: Upgrade all packages to their latest version (3 retries left).
FAILED - RETRYING: Upgrade all packages to their latest version (2 retries left).
FAILED - RETRYING: Upgrade all packages to their latest version (1 retries left).
fatal: [ipa]: FAILED! => changed=false 
  attempts: 3
  msg: 'Failed to download metadata for repo ''fedora-modular'': Cannot prepare internal mirrorlist: Curl error (6): Couldn''t resolve host name for https://mirrors.fedoraproject.org/metalink?repo=fedora-modular-34&arch=x86_64 [Could not resolve host: mirrors.fedoraproject.org]'
  rc: 1
  results: []

PLAY RECAP *******************************************************************************************************************************************************************************************************************************************************************************************************************
ipa                        : ok=2    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   

[sssd-test-suite] The following command exited with: 2
[sssd-test-suite] [shell] Working directory: /home/justin/github/sssd-test-suite
[sssd-test-suite] [shell] Environment: ANSIBLE_CONFIG='/home/justin/github/sssd-test-suite/provision/ansible.cfg'
[sssd-test-suite] [shell] Command: ['ansible-playbook', '--limit', 'ipa', '/home/justin/github/sssd-test-suite/provision/prepare-guests.yml']
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/nutcli/runner.py", line 240, in execute
    return self._call_actor(args.func, args, shell)
  File "/usr/local/lib/python3.8/site-packages/nutcli/runner.py", line 282, in _call_actor
    return actor(**actor._filter_parser_args(args))
  File "/home/justin/github/sssd-test-suite/cli/commands/provision.py", line 127, in __call__
    self._exec_ansible(playbook, unattended=True, limit=guests, argv=argv)
  File "/home/justin/github/sssd-test-suite/cli/commands/provision.py", line 49, in _exec_ansible
    return self.shell(['ansible-playbook', *args], env=env)
  File "/usr/local/lib/python3.8/site-packages/nutcli/shell.py", line 197, in __call__
    raise ShellCommandError(
nutcli.shell.ShellCommandError: Command returned non-zero status code: 2
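
The "Could not resolve host" and "Name or service not known" errors above both point at name resolution inside the guest. A minimal sketch for checking that separately from the provisioning run is shown below. The helper name `can_resolve` and the injectable `resolver` parameter are illustrative assumptions, not part of sssd-test-suite.

```python
import socket


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname resolves to at least one address.

    `resolver` is injectable so the check can be exercised without a
    network; by default it is the standard library's getaddrinfo.
    """
    try:
        return len(resolver(hostname, None)) > 0
    except socket.gaierror:
        # Raised for failures such as "Name or service not known".
        return False
```

Run inside the guest, `can_resolve("mirrors.fedoraproject.org")` returning False would match the failure in the log. It does not by itself say whether DNS is the root cause or a symptom.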

The client system appears to be working fine. My config.json is as follows.

{
  "boxes": {
    "ad": {
      "name": "peru/windows-server-2019-datacenter-x64-eval",
      "url": "",
      "memory": 2048
    },
    "ad-child": {
      "name": "peru/windows-server-2019-datacenter-x64-eval",
      "url": "",
      "memory": 2048
    },
    "ipa": {
      "name": "sssd-vagrant/fedora34-ipa",
      "url": "",
      "memory": 2048
    },
    "ldap": {
      "name": "sssd-vagrant/fedora34-ldap",
      "url": "",
      "memory": 1024
    },
    "client": {
      "name": "sssd-vagrant/fedora34-client",
      "url": "",
      "memory": 2048
    }
  },
  "folders": {
    "sshfs": [],
    "rsync": [],
    "nfs": []
  }
}
pbrezina commented 3 years ago

It looks like IPA does not work correctly on Fedora 34.

[pbrezina ~]$ sts ssh ipa
Last login: Thu Apr  1 08:56:11 2021 from 192.168.100.1
[systemd]
Failed Units: 1
  ipa.service
[vagrant@master.ipa.vm ~]$ sudo su
[systemd]
Failed Units: 1
  ipa.service
[root@master.ipa.vm /home/vagrant]# systemctl status ipa.service
× ipa.service - Identity, Policy, Audit
     Loaded: loaded (/usr/lib/systemd/system/ipa.service; enabled; vendor preset: disabled)
     Active: failed (Result: exit-code) since Thu 2021-04-01 08:55:45 UTC; 8min ago
    Process: 567 ExecStart=/usr/sbin/ipactl start (code=exited, status=1/FAILURE)
   Main PID: 567 (code=exited, status=1/FAILURE)
        CPU: 563ms

Apr 01 08:55:42 master.ipa.vm systemd[1]: Starting Identity, Policy, Audit...
Apr 01 08:55:45 master.ipa.vm ipactl[567]: Unexpected error
Apr 01 08:55:45 master.ipa.vm ipactl[567]: AttributeError: 'Env' object has no attribute 'basedn'
Apr 01 08:55:45 master.ipa.vm systemd[1]: ipa.service: Main process exited, code=exited, status=1/FAILURE
Apr 01 08:55:45 master.ipa.vm systemd[1]: ipa.service: Failed with result 'exit-code'.
Apr 01 08:55:45 master.ipa.vm systemd[1]: Failed to start Identity, Policy, Audit.
abbra commented 3 years ago

FreeIPA works just fine on F34, you can see it with the recent OpenQA tests on F34: https://openqa.fedoraproject.org/tests/837588#dependencies (out of https://bodhi.fedoraproject.org/updates/FEDORA-2021-04b050e3d1).

What you show in the logs is the behavior on a non-enrolled machine. Specifically, running systemctl status ipa.service on a machine that is not an IPA server is expected to produce an error.
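
One way to tell the two cases apart is to check whether the guest actually has an IPA configuration with a base DN. The sketch below assumes the configuration normally lives in /etc/ipa/default.conf with a `basedn` entry in its `[global]` section. That is an assumption: the thread does not name this file, and the function name is hypothetical.

```python
import configparser
from pathlib import Path


def ipa_config_has_basedn(path="/etc/ipa/default.conf"):
    """Return True if the file exists and defines basedn under [global].

    The default path and the [global]/basedn layout are assumptions
    about a configured IPA host, not something stated in this thread.
    """
    conf = Path(path)
    if not conf.is_file():
        return False
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(conf)
    except configparser.Error:
        # An unparsable file is treated the same as a missing one.
        return False
    return parser.has_option("global", "basedn")
```

A False result on a guest where the installation was reported as successful would suggest that files written by the installer are missing, rather than that the server was never installed.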

pbrezina commented 3 years ago

I re-ran the scripts and double-checked that the IPA installation succeeded. But the service does not work, and running ipa-server-install again says that it is already configured.

[root@master.ipa.vm /home/vagrant]# ipa-server-install 

The log file for this installation can be found in /var/log/ipaserver-install.log
IPA server is already configured on this system.
If you want to reinstall the IPA server, please uninstall it first using 'ipa-server-install --uninstall'.
The ipa-server-install command failed. See /var/log/ipaserver-install.log for more information

We use this step to install it. Has something changed?

The installation was successful:

[sssd-ci]   [fedora34]   [ipa]   TASK [ipa : Install IPA server] 
[sssd-ci]   [fedora34]   [ipa]   changed: [ipa] 

But obviously something is missing from the system:

[root@master.ipa.vm /etc/ipa]# ll -R /etc/ipa
/etc/ipa:
total 20
drwx------. 2 root root 4096 Apr  1 09:52 custodia
drwxr-xr-x. 2 root root 4096 Apr  1 10:01 dnssec
drwxr-xr-x. 2 root root 4096 Apr  1 09:47 html
drwxr-xr-x. 2 root root 4096 Apr  1 10:00 kdcproxy
drwxr-xr-x. 2 root root 4096 Apr  1 10:02 nssdb

/etc/ipa/custodia:
total 8
-rw-rw----. 1 root root  625 Apr  1 09:52 custodia.conf
-rw-------. 1 root root 3325 Apr  1 09:52 server.keys

/etc/ipa/dnssec:
total 16
-r--r-----. 1 root ods   524 Apr  1 10:01 ipa-dnskeysyncd.keytab
-rw-r-----. 1 root named 423 Apr  1 10:01 openssl.cnf
-rw-r--r--. 1 root root  145 Apr  1 10:01 softhsm2.conf
-r--------. 1 root root   30 Apr  1 10:01 softhsm_pin_so

/etc/ipa/html:
total 16
-rw-r--r--. 1 root root 8198 Mar 31 06:20 ssbrowser.html
-rw-r--r--. 1 root root 2719 Mar 31 06:20 unauthorized.html

/etc/ipa/kdcproxy:
total 8
-rw-r--r--. 1 root root 1088 Apr  1 10:00 ipa-kdc-proxy.conf
-rw-r--r--. 1 root root   40 Mar 31 06:20 kdcproxy.conf

/etc/ipa/nssdb:
total 72
-rw-r--r--. 1 root root 28672 Apr  1 10:02 cert9.db
-rw-r--r--. 1 root root 36864 Apr  1 10:02 key4.db
-rw-r--r--. 1 root root   421 Apr  1 10:02 pkcs11.txt
-rw-------. 1 root root    41 Apr  1 10:02 pwdfile.txt
abbra commented 3 years ago

Can you collect the systemd journal and the /var/log/ipa*.log files?

pbrezina commented 3 years ago

Alexander checked the system, and it looks like the disk content is not flushed correctly before Vagrant halts the VM.
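
If unflushed disk content is the cause, one possible workaround is to force a sync inside the guest before halting it. The sketch below is only an illustration under that assumption. The function name `halt_after_sync` and the injectable `run` parameter are hypothetical, and the thread does not establish that this avoids the problem.

```python
import subprocess


def halt_after_sync(guest, run=subprocess.run):
    """Flush the guest's filesystem buffers, then halt it with Vagrant.

    `run` is injectable so the command sequence can be inspected
    without touching a real Vagrant environment.
    """
    commands = [
        # Ask the guest kernel to write dirty pages to disk.
        ["vagrant", "ssh", guest, "-c", "sudo sync"],
        # Only then shut the machine down.
        ["vagrant", "halt", guest],
    ]
    for cmd in commands:
        # check=True stops the sequence if a command fails.
        run(cmd, check=True)
    return commands
```

Calling `halt_after_sync("ipa")` from the directory containing the Vagrantfile would run both commands in order.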

pbrezina commented 3 years ago

This is still an issue, but it happens only in OpenStack automation inside nested virtualization. It is fine locally. I still don't know what to do about it.