projectatomic / atomic

Atomic Run Tool for installing/running/managing container images.
Other
524 stars 139 forks source link

traceback when trying to uninstall non-system container #1217

Closed miabbott closed 6 years ago

miabbott commented 6 years ago

Our automated tests caught this when trying to uninstall the cockpit container:

# atomic install registry.fedoraproject.org/f27/cockpit:latest
Pulling registry.fedoraproject.org/f27/cockpit:latest ...
Copying blob sha256:d445b8c354cc48e75ed621cb6783a80c29ac24135cdd98fd02ae70e1f18345bc
 80.81 MB / 80.81 MB [======================================================] 9s
Copying blob sha256:23c72130fe8153e9f6f0e1bbcc1b230bbf50d79d99a2790c9a65d7d1f01462c9
 149.96 MB / 149.96 MB [===================================================] 18s
Copying config sha256:5a08300420e7ce64065c97baeea7f7500535b16b1d0abb7609ef0e7a2ee811d1
 3.71 KB / 3.71 KB [========================================================] 0s
Writing manifest to image destination
Storing signatures
/usr/bin/docker run --rm --privileged -v /:/host registry.fedoraproject.org/f27/cockpit:latest /container/atomic-install
+ sed -e /pam_selinux/d -e /pam_sepermit/d /etc/pam.d/cockpit
+ mkdir -p /host/etc/cockpit/ws-certs.d /host/etc/cockpit/machines.d
+ chmod 755 /host/etc/cockpit/ws-certs.d /host/etc/cockpit/machines.d
+ chown root:root /host/etc/cockpit/ws-certs.d /host/etc/cockpit/machines.d
+ mkdir -p /host/var/lib/cockpit
+ chmod 775 /host/var/lib/cockpit
+ chown root:wheel /host/var/lib/cockpit
+ mkdir -p /etc/ssh
+ /bin/mount --bind /host/etc/cockpit /etc/cockpit
+ /usr/sbin/remotectl certificate --ensure

# atomic images list --all
   REPOSITORY                               TAG      IMAGE ID       CREATED            VIRTUAL SIZE   TYPE
   registry.fedoraproject.org/f27/cockpit   latest   5a08300420e7   2018-03-29 14:25   446.28 MB      docker

# atomic run registry.fedoraproject.org/f27/cockpit:latest
/usr/bin/docker run -d --privileged --pid=host -v /:/host registry.fedoraproject.org/f27/cockpit:latest /container/atomic-run --local-ssh

This container uses privileged security switches:

INFO: --privileged
      This container runs without separation and should be considered the same as root on your system.

INFO: --pid=host
      Processes in this container can see and interact with all processes on the host and disables SELinux within the container.

For more information on these switches and their security implications, consult the manpage for 'docker run'.

bb05d895090aa844136fc814f74eb03e76b7b4151dfb2d5f340b7c0c8b250a57

# atomic containers list --all
   CONTAINER ID IMAGE                NAME       COMMAND    CREATED          STATE      BACKEND    RUNTIME
   bb05d895090a registry.fedoraproje competent_ /container 2018-03-29 18:04 running    docker     docker
# atomic stop bb05d895090a
# atomic containers list --all
   CONTAINER ID IMAGE                NAME       COMMAND    CREATED          STATE      BACKEND    RUNTIME
   bb05d895090a registry.fedoraproje competent_ /container 2018-03-29 18:04 exited     docker     docker
# atomic containers delete bb05d895090a
Do you wish to delete the following images?

   ID           NAME                 IMAGE_NAME                STORAGE
   bb05d895090a competent_kilby      registry.fedoraproject.or docker

Confirm (y/N) y

# atomic --debug uninstall registry.fedoraproject.org/f27/cockpit:latest
Namespace(_class=<class 'Atomic.uninstall.Uninstall'>, args=[], assumeyes=False, debug=True, display=False, force=False, func='uninstall', ignore=False, image='registry.fedoraproject.org/f27/cockpit:latest', nam
e=None, opt1=None, opt2=None, opt3=None, profile=False, storage=None)
/usr/bin/docker run --rm --privileged -v /:/host registry.fedoraproject.org/f27/cockpit:latest /container/atomic-uninstall
+ rm -f /host/etc/pam.d/cockpit
Unable to find 5a08300420e7ce64065c97baeea7f7500535b16b1d0abb7609ef0e7a2ee811d1 in the installed containers
Traceback (most recent call last):
  File "/usr/bin/atomic", line 185, in <module>
    sys.exit(_func())
  File "/usr/lib/python3.6/site-packages/Atomic/uninstall.py", line 62, in uninstall
    be.uninstall(img_obj, name=self.args.name, atomic=self, ignore=self.args.ignore)
  File "/usr/lib/python3.6/site-packages/Atomic/backends/_docker.py", line 486, in uninstall
    util.InstallData.delete_by_id(iobject.id, name, ignore=ignore)
  File "/usr/lib/python3.6/site-packages/Atomic/util.py", line 956, in delete_by_id
    raise ValueError("Unable to find {} in the installed containers".format(iid))
ValueError: Unable to find 5a08300420e7ce64065c97baeea7f7500535b16b1d0abb7609ef0e7a2ee811d1 in the installed containers

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/atomic", line 198, in <module>
    sys.exit(1)
SystemExit: 1

It looks like the uninstall script of the container was successfully run, but it blows up afterwards.

This was found using Fedora Atomic Host Continuous

The affected version was atomic-1.22.20-6820f8f3b53a0a9fe89690b671624905ce3d0d30.665bce33cf99ea73047097d433cf652b7f2d9adc.fc27.x86_64

Previous good version was atomic-1.22.17-56c32aeca382f2c68e5078679b7585837cb13716.665bce33cf99ea73047097d433cf652b7f2d9adc.fc27

The diffs between 56c32aeca and 6820f8f, lead me to think it was ba23e7692423dc66446f3c01e745598673ec2c4a that was the guilty commit.

miabbott commented 6 years ago

@giuseppe I think this was you; could you have a look?

giuseppe commented 6 years ago

@miabbott yes, that is a regression.

Could you check if https://github.com/projectatomic/atomic/pull/1219 solves the problem you have seen?

This is a quick fix to not regress on this, but I think the install data part needs some more work. Unfortunately it is quite difficult to handle correctly as the install.json file gets out of sync very easily and in many cases I had to use -i when uninstalling.

miabbott commented 6 years ago

@giuseppe That PR worked for me. I did some simple tests with a single container installed and with a mix of non-system/system containers.

We should probably have some more rigorous testing of the various possible scenarios with containers installed, etc.

giuseppe commented 6 years ago

@miabbott I've added a test case for the reported issue