neoave / mrack

Multicloud use-case based multihost async provisioner for CIs and testing during development
Apache License 2.0

deploying with a fedora container on podman is broken #142

Closed abbra closed 3 years ago

abbra commented 3 years ago

I am trying to set up an environment with a Fedora 35 container. Note that this is not a specialized container image but a normal registry.fedoraproject.org/fedora:35 image. This image does not have sshd in it.

What happens is that the sshd restart in the container fails and throws an exception. The exception itself is expected, but the structure of its arguments is not:

  File "/usr/lib/python3.9/site-packages/mrack/actions/up.py", line 97, in provision
    failed_providers.append(results.args[PROVIDER_NAME_INDEX])

Below is the full debug output. I added a line to print results.args, so the original failure was on line 96:

$ /usr/bin/mrack --debug -c ~/.mrack/mrack.conf up -m ~/todo/fedora-systemd/master-client.yaml 
Podman: Initializing provider
Podman: Init duration 0:00:00.000005
Provisioning started
Podman: Created requirement(s): [
    {
        "domain": "testrealm.test",
        "hostname": "master.testrealm.test",
        "image": "706171f56a3e",
        "name": "master.testrealm.test"
    },
    {
        "domain": "testrealm.test",
        "hostname": "client.testrealm.test",
        "image": "706171f56a3e",
        "name": "client.testrealm.test"
    }
]
Podman: Preparing provider resources
Podman: Preparing network(s) {'mrack-testrealm-test'}
Podman: Pulling missing images {'706171f56a3e'}
Podman: All required images present
Podman: Validating hosts definitions
Podman: Host definitions valid
Podman: Checking available resources
Podman: Resource availability: OK
Podman: Issuing provisioning of 2 host(s)
Podman: Creating container for host: master.testrealm.test
Podman: Creating container for host: client.testrealm.test
Podman: Provisioning issued
Podman: Waiting for all hosts to be active
Podman: 1ce9690bf8449cf3d4e78ec2b700f46297140b8b620c299c35134de02f9d374d was provisioned in 0.1s
Podman: 6b6d065c6c93d8238056fd008c6bb043881f77aa3775e397aa949706a18f9421 was provisioned in 0.3s
results are ('Failed restarting sshd service in container 1ce9690bf8449cf3d4e78ec2b700f46297140b8b620c299c35134de02f9d374d',)
tuple index out of range
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/mrack/run.py", line 201, in handle
    ret_code = func(*args, **kwargs)
  File "/usr/lib/python3.9/site-packages/mrack/run.py", line 225, in run
    mrackcli(obj={})  # pylint: disable=no-value-for-parameter,unexpected-keyword-arg
  File "/usr/lib/python3.9/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3.9/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3.9/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3.9/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/lib/python3.9/site-packages/mrack/run.py", line 62, in wrapper
    return loop.run_until_complete(func(*args, **kwargs))
  File "/usr/lib64/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/usr/lib/python3.9/site-packages/mrack/run.py", line 122, in up
    await up_action.provision()
  File "/usr/lib/python3.9/site-packages/mrack/actions/up.py", line 97, in provision
    failed_providers.append(results.args[PROVIDER_NAME_INDEX])
IndexError: tuple index out of range
Traceback (most recent call last):
  File "/usr/bin/mrack", line 23, in <module>
    run.run()
  File "/usr/lib/python3.9/site-packages/mrack/run.py", line 214, in handle
    raise exc
  File "/usr/lib/python3.9/site-packages/mrack/run.py", line 201, in handle
    ret_code = func(*args, **kwargs)
  File "/usr/lib/python3.9/site-packages/mrack/run.py", line 225, in run
    mrackcli(obj={})  # pylint: disable=no-value-for-parameter,unexpected-keyword-arg
  File "/usr/lib/python3.9/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3.9/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3.9/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3.9/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/lib/python3.9/site-packages/mrack/run.py", line 62, in wrapper
    return loop.run_until_complete(func(*args, **kwargs))
  File "/usr/lib64/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/usr/lib/python3.9/site-packages/mrack/run.py", line 122, in up
    await up_action.provision()
  File "/usr/lib/python3.9/site-packages/mrack/actions/up.py", line 97, in provision
    failed_providers.append(results.args[PROVIDER_NAME_INDEX])
IndexError: tuple index out of range

So the ProvisioningError thrown by the podman provider has args of ('Failed restarting sshd service in container 1ce9690bf8449cf3d4e78ec2b700f46297140b8b620c299c35134de02f9d374d',), i.e. only one element, while PROVIDER_NAME_INDEX is 1, i.e. the second element.

Almost all places that throw ProvisioningError pass a single argument with no provider name in it, so the code in actions/up.py is basically never going to work well. I guess this is because the podman provider wasn't really tested for negative behavior before.
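To illustrate the mismatch described above, here is a minimal sketch of a defensive lookup that does not assume the provider name is present in the exception arguments. The `ProvisioningError` class below is a stand-in and `provider_name_from` is a hypothetical helper, not mrack code; only `PROVIDER_NAME_INDEX = 1` comes from this report.

```python
PROVIDER_NAME_INDEX = 1  # value stated in this report


class ProvisioningError(Exception):
    """Stand-in for mrack's exception; the real class may differ."""


def provider_name_from(error, default="unknown"):
    """Return the provider name carried in error.args, or a default.

    Hypothetical helper: it checks the tuple length instead of indexing
    blindly, so a single-argument exception no longer raises IndexError.
    """
    args = getattr(error, "args", ())
    if len(args) > PROVIDER_NAME_INDEX:
        return args[PROVIDER_NAME_INDEX]
    return default


# One-argument form, as raised in the log above.
single = ProvisioningError("Failed restarting sshd service in container")
# Two-argument form, which is what the indexing code appears to expect.
paired = ProvisioningError("Failed restarting sshd service in container", "podman")

print(provider_name_from(single))
print(provider_name_from(paired))
```

An alternative would be to make every raise site pass the provider name, but a length check at the point of use also covers exceptions raised by code that is not under the project's control.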

abbra commented 3 years ago

It would be great to fix the code to report these kinds of internal errors properly, so that a user is aware of what is failing rather than getting a stack trace.
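As a rough sketch of what such reporting could look like: catch the provisioning failure at the top level, log its message, and show the traceback only when debugging is requested. Everything here is an assumption for illustration; `run_reporting_errors`, the logger name, and the stand-in `ProvisioningError` class are hypothetical and not taken from mrack.

```python
import logging
import traceback

logger = logging.getLogger("mrack-sketch")  # hypothetical logger name


class ProvisioningError(Exception):
    """Stand-in for mrack's exception; the real class may differ."""


def run_reporting_errors(func, debug=False):
    """Run func and return an exit code instead of leaking a traceback.

    Hypothetical wrapper: on ProvisioningError it logs the first argument
    as the user-facing message and returns 1; otherwise it returns 0.
    """
    try:
        func()
    except ProvisioningError as exc:
        message = str(exc.args[0]) if exc.args else exc.__class__.__name__
        logger.error("Provisioning failed: %s", message)
        if debug:
            # Keep the full traceback available, but only on request.
            logger.debug(traceback.format_exc())
        return 1
    return 0


def failing_provision():
    raise ProvisioningError("Failed restarting sshd service in container")


exit_code = run_reporting_errors(failing_provision)
```

With this shape the user sees one line naming the failure, and the stack trace stays behind the debug switch.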

pvoborni commented 3 years ago

Hi, thanks for the report.

Yes, this exception handling needs to be improved.

The issue is also about the connection check not working with containers without sshd. From your PoV, should we also prioritize this use case? The initial use cases were testing IPA and related projects, where a lot depends on being able to SSH into the host, so SSH was prioritized. But we could make some enhancements here as well, e.g. skipping the SSH check or using more container-native ways.
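A container-native check could, for example, run a trivial command through `podman exec` instead of connecting over SSH. The sketch below is only an illustration under assumptions: `container_is_reachable` is a hypothetical helper, not an existing mrack function, and it assumes the image ships a `true` binary. The `runner` parameter exists so the check can be exercised without podman installed.

```python
import subprocess


def container_is_reachable(container_id, runner=subprocess.run, timeout=10):
    """Return True when `podman exec <id> true` exits with status 0.

    Hypothetical replacement for an SSH-based connection check.
    """
    try:
        completed = runner(
            ["podman", "exec", container_id, "true"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # podman is not installed, or the command did not finish in time.
        return False
    return completed.returncode == 0
```

Usage would be along the lines of `container_is_reachable("1ce9690bf844")`, called once per provisioned container in place of the sshd restart and SSH probe.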

abbra commented 3 years ago

For containers, going inside with native podman tools would be preferred. I am trying to use podman as a backend because I have had quite a bad experience with vagrant in my setup and I simply don't want to use it. I can avoid the problem in this report by creating a specialized image with sshd in it, but this would make the configuration less portable. What am I using this for? I am trying to develop tests that run both Samba AD and FreeIPA in the same setup, as we have an issue upstream (https://bugzilla.samba.org/show_bug.cgi?id=14851) that is potentially preventing a merge of a couple of pull requests (one for Samba to split DCE RPC handlers, one for FreeIPA to add IPA domain controller code in Samba).

abbra commented 3 years ago

FYI, this is also needed for testing external IdP stuff.