adnichols / kitchen-docker-api

Docker driver for test-kitchen using a ruby based docker client

Test fail can leave kitchen in bad state, requiring .kitchen to be hand-deleted. #5

Open NathanZook opened 10 years ago

NathanZook commented 10 years ago

I ran into this right after hitting the condition reported in #4.

Macintosh:wp_docker_deploy-cookbook nzook$ .bin/kitchen test
-----> Starting Kitchen (v1.2.1)
-----> Cleaning up any prior instances of <default-ubuntu-1404>
-----> Destroying <default-ubuntu-1404>...
>>>>>> ------Exception-------
>>>>>> Class: Kitchen::ActionFailed
>>>>>> Message: Failed to complete #destroy action: [Expected(200..204) <=> Actual(404 Not Found)]
>>>>>> ----------------------
>>>>>> Please see .kitchen/logs/kitchen.log for more details
>>>>>> Also try running `kitchen diagnose --all` for configuration

adnichols commented 10 years ago

I've seen this before but couldn't find a good way to prevent it. The problem, I believe, is that the container disappears outside of kitchen (it either exits for some reason or is destroyed), so the destroy fails. My only thought was to add a handler for the 404 case & try to clean up when we see it. Do you have a consistent failure case that makes this reproducible?
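
A minimal sketch of the 404 handler idea, not the driver's actual code. `ContainerNotFound` stands in for whatever exception the Docker client raises on a 404, and the `client.remove(id)` interface and the `:container_id` state key are hypothetical names chosen for illustration.

```ruby
# Hypothetical stand-in for the error a Docker client raises on a 404.
class ContainerNotFound < StandardError; end

# Destroys the container recorded in the kitchen state hash.
# `client` is any object responding to remove(id) (assumed interface).
def destroy_container(client, state)
  id = state[:container_id]
  return state if id.nil? # nothing recorded, nothing to destroy

  begin
    client.remove(id)
  rescue ContainerNotFound
    # The container is already gone, so the goal of destroy is met.
  end

  # Clear the stale id whether the remove succeeded or hit a 404.
  state.delete(:container_id)
  state
end

# Fake client that always reports the container as missing.
class MissingContainerClient
  def remove(_id)
    raise ContainerNotFound, "404 Not Found"
  end
end

state = { container_id: "junk-id" }
destroy_container(MissingContainerClient.new, state)
puts state.inspect
```

With a handler along these lines, a stale id in the state would no longer abort the destroy action, so `.kitchen` would not need to be deleted by hand.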

NathanZook commented 10 years ago

Try interrupting a run, then stop & rm the docker image, then try another run. As I understand it, the problem is that the state file inside .kitchen says the image is there when it is not. You could just throw in a junk id.

I assert that if you attempt to delete an image or container that is no longer present, the error the daemon returns is in fact a success indicator. I know that docker's return objects have been...idiosyncratic. A 404 SHOULD be equivalent to the container not being there, but I would recommend doing an inspect to ensure that this is the case.

That is: 1) Attempt to stop the container. 2) If 1) succeeds, attempt to remove the container. 3) If docker errors in either of the first two steps, either check the error object for an explicit statement that the container is gone, or attempt to inspect the object. If it is gone, then we are fine. If it is present, a delay & retry is probably in order.
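
The three steps above could be sketched roughly as follows. This is an illustration under assumptions, not the driver's implementation: the error classes, the client methods (`stop`, `remove`, `inspect_container`), and the `attempts`/`delay` parameters are all hypothetical, and `FlakyClient` is a fake that exists only to exercise the flow.

```ruby
# Hypothetical error types: an explicit "gone" (404) and any other failure.
class ContainerNotFound < StandardError; end
class DockerFailure < StandardError; end

# Inspect the container; a 404 on inspect means it is no longer present.
def container_present?(client, id)
  client.inspect_container(id)
  true
rescue ContainerNotFound
  false
end

# Returns true once the container is confirmed gone, false if it is
# still present after all attempts.
def ensure_container_gone(client, id, attempts: 3, delay: 1)
  attempts.times do |i|
    begin
      client.stop(id)   # step 1
      client.remove(id) # step 2, reached only if the stop succeeded
      return true
    rescue ContainerNotFound
      return true       # step 3: the error itself says the container is gone
    rescue DockerFailure
      # Step 3: ambiguous error, so inspect to see whether it is still there.
      return true unless container_present?(client, id)
      sleep(delay) if i < attempts - 1 # still present: delay, then retry
    end
  end
  false
end

# Fake client whose stop fails a given number of times before succeeding.
class FlakyClient
  def initialize(failures)
    @failures = failures
    @present = true
  end

  def stop(_id)
    raise ContainerNotFound, "404 Not Found" unless @present
    return if @failures.zero?
    @failures -= 1
    raise DockerFailure, "daemon busy"
  end

  def remove(_id)
    @present = false
  end

  def inspect_container(_id)
    raise ContainerNotFound, "404 Not Found" unless @present
    { "State" => "running" }
  end
end

puts ensure_container_gone(FlakyClient.new(1), "abc123", delay: 0)
```

Treating the explicit not-found error as success keeps the destroy idempotent, while the inspect call covers the case where the daemon's error is ambiguous.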