cloudfoundry / guardian

containers4life
Apache License 2.0

Containers that are partially created can be fetched for cleanup #371

Closed mariash closed 1 year ago

mariash commented 1 year ago

If we fail to create a container (e.g. because the external networker is down) and also fail to destroy it immediately in the defer clause, we want to be able to find that container later and try to destroy it again. That way, once the external networker comes back up, we can clean up such containers instead of leaking them.

To achieve that, we modified the Containers() call so that it accepts the garden.state property. Previously garden.state was always set to created, so the call only returned successfully created containers. When Rep tried to get containers for cleanup, it did not get the partially created ones because they didn't have the created state. Now callers can pass the value all to tell garden that they want containers in every state; when all is passed in, we remove garden.state from the filter. For backwards compatibility, Containers() still returns only created containers by default.
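The filter handling described above can be sketched roughly as follows. This is a minimal illustration, not guardian's actual code: `buildFilter` and `matches` are hypothetical helper names, and it assumes properties are plain string-to-string maps and that a partially created container simply lacks a `garden.state` value of `created`.

```go
package main

// Property key and values taken from the description above; the helpers
// below are hypothetical and only illustrate the filtering rules.
const (
	stateKey     = "garden.state"
	stateCreated = "created"
	stateAll     = "all"
)

// buildFilter returns the property filter to apply to a container lookup.
// The caller's map is copied, never modified.
func buildFilter(requested map[string]string) map[string]string {
	filter := make(map[string]string, len(requested)+1)
	for k, v := range requested {
		filter[k] = v
	}
	switch filter[stateKey] {
	case "":
		// No state requested: keep the old behaviour and return only
		// successfully created containers.
		filter[stateKey] = stateCreated
	case stateAll:
		// "all" means "do not filter on state at all".
		delete(filter, stateKey)
	}
	return filter
}

// matches reports whether a container's properties satisfy every filter entry.
func matches(props, filter map[string]string) bool {
	for k, v := range filter {
		if props[k] != v {
			return false
		}
	}
	return true
}
```

With this sketch, a lookup that passes only an owner property skips a partially created container, while the same lookup with garden.state set to all returns it.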

We also now save the container properties before we create the container and its dependencies (volumes, network, bind mounts), so that after a failure we can later look the container up by its properties and retry cleanup. When Rep creates containers it sets the owner property to executor, and when it cleans up it only fetches containers with that property. This matters when multiple services use garden to create containers, since each of them might set a different owner.
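The ordering described above (save properties first, then create dependencies) could look roughly like this. It is a sketch under stated assumptions rather than guardian's implementation: the in-memory `store` type, `create`, `handlesOwnedBy`, and `errNetworkDown` are all hypothetical, and marking the container with garden.state = created only after setup succeeds is an assumption about how partially created containers are told apart.

```go
package main

import (
	"errors"
	"sort"
)

// errNetworkDown stands in for a dependency failure such as the external
// networker being unavailable (hypothetical error value).
var errNetworkDown = errors.New("external networker is down")

// store is a hypothetical in-memory property store keyed by container handle.
type store struct {
	props map[string]map[string]string
}

func newStore() *store {
	return &store{props: make(map[string]map[string]string)}
}

// create saves the properties BEFORE running setup (volumes, network, bind
// mounts). If setup fails, the properties stay behind so the container can
// still be found and destroyed later.
func (s *store) create(handle string, props map[string]string, setup func() error) error {
	saved := make(map[string]string, len(props)+1)
	for k, v := range props {
		saved[k] = v
	}
	s.props[handle] = saved

	if err := setup(); err != nil {
		return err // partially created: never marked as created
	}
	saved["garden.state"] = "created" // assumption: set only on success
	return nil
}

// handlesOwnedBy returns, in sorted order, every handle whose owner property
// matches, regardless of state. This mirrors a cleanup lookup for all states.
func (s *store) handlesOwnedBy(owner string) []string {
	var handles []string
	for handle, props := range s.props {
		if props["owner"] == owner {
			handles = append(handles, handle)
		}
	}
	sort.Strings(handles)
	return handles
}
```

In this sketch a cleanup pass for owner executor sees both fully and partially created containers that it owns, and never touches containers created by another service.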

@reneighbor

cf-gitbot commented 1 year ago

We have created an issue in Pivotal Tracker to manage this. Unfortunately, the Pivotal Tracker project is private so you may be unable to view the contents of the story.

The labels on this github issue will be updated when the story is started.