hashicorp / terraform-provider-docker

As part of our introduction to self-service publishing in the Terraform Registry, this copy of the provider has been archived, and ownership has been transferred to active maintainers in the community. Please see the new location on the Terraform Registry: https://registry.terraform.io/providers/kreuzwerker/docker/latest
Mozilla Public License 2.0

docker_image can't pull new images if it's in use #226

Open xanderflood opened 5 years ago

xanderflood commented 5 years ago

The Docker provider includes a docker_image resource that, in principle, can be used to pull an image from a Docker repository for use in a container or service. However, when a new image is pushed to the tag, the docker_image resource forces a delete and re-create. In most cases, the image cannot be deleted because it is in use by containers or services, so the apply will fail.

One way to remedy this would be to modify docker_image, or introduce a different resource, so that it represents a tag instead of an individual image and can maintain more than one image simultaneously. It might look something like this:

resource "docker_tag" "postgres_96" {
  name = "postgres:9.6"

  // computed fields
  // latest = 3a9b8eca
  // previous = 8ff57c4f
}

When a new image appears, the one pointed to by previous is deleted, previous is set to latest, and latest is set to the new digest. This way, an in-use image won't be deleted until at least two applies have elapsed, the first one guaranteeing that the image is no longer in use.
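That rotation can be sketched as a tiny Go helper (function and variable names here are illustrative, not part of the provider):

```go
package main

import "fmt"

// rotate models the proposed digest rotation: when a new digest appears,
// the image pointed to by previous becomes deletable, previous takes the
// old latest, and latest takes the new digest.
func rotate(latest, previous, newDigest string) (newLatest, newPrevious, toDelete string) {
	if newDigest == latest {
		// Nothing changed upstream; keep state as-is, delete nothing.
		return latest, previous, ""
	}
	return newDigest, latest, previous
}

func main() {
	latest, previous, toDelete := rotate("3a9b8eca", "8ff57c4f", "c0ffee12")
	fmt.Println(latest, previous, toDelete) // c0ffee12 3a9b8eca 8ff57c4f
}
```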

xanderflood commented 5 years ago

This is clearly intended behavior so I'm viewing this as a feature request, but I feel it could just as easily be considered a bug since it's hard for me to imagine a non-trivial use case for docker_image as currently implemented.

Related https://github.com/hashicorp/terraform/issues/3609

xanderflood commented 5 years ago

An alternative approach could be to add a pull_triggers field to docker_service and docker_container and let those resources manage their own images. Generally I prefer the separate docker_tag resource because it avoids a situation where many services/containers each believe they're in charge of managing the lifecycle of the same image.
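In HCL, that alternative might look something like this (pull_triggers is the hypothetical attribute; the other arguments follow docker_container's existing schema):

```hcl
resource "docker_container" "api" {
  name  = "api"
  image = "myorg/mymicroservice:v1.2"

  // hypothetical: re-pull the image whenever any of these values changes,
  // analogous to keepers on random_* resources
  pull_triggers = [
    var.release_sha,
  ]
}
```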

xanderflood commented 5 years ago

Here's a proposed schema for comment:

func resourceDockerTag() *schema.Resource {
    return &schema.Resource{
        Create: resourceDockerTagCreate,
        Read:   resourceDockerTagRead,
        Update: resourceDockerTagUpdate,
        Delete: resourceDockerTagDelete,

        Schema: map[string]*schema.Schema{
            "name": {
                Type:     schema.TypeString,
                Required: true,
            },

            "latest": {
                Type:        schema.TypeString,
                Computed:    true,
                Description: "The sha256 digest of the latest image in this tag.",
            },
            "previous": {
                Type:        schema.TypeString,
                Computed:    true,
                Description: "If set, the sha256 digest of the second-to-latest image in this tag. This attribute is intended for internal resource management and should generally not be referenced externally, as it may not always be set.",
            },

            "keep_locally": {
                Type:     schema.TypeBool,
                Optional: true,
            },

            // TODO
            // Should this resource always pull automatically?
            // Is there a use case for manual triggers?
            "pull_triggers": {
                Type:     schema.TypeSet,
                Optional: true,
                ForceNew: true,
                Elem:     &schema.Schema{Type: schema.TypeString},
                Set:      schema.HashString,
            },
        },
    }
}

My sense is that pull_triggers may not be necessary here - I'm imagining that the resource would always check for new versions and that there's no use case for manual triggers, but I'm not sure if I'm missing a situation where that could be useful.

xanderflood commented 5 years ago

Maybe keep_locally is unnecessary also? If you don't need images to be cleaned up then docker_image will be suitable, so I'm not sure if there's a need to offer that functionality on docker_tag

xanderflood commented 5 years ago

One problem here is that the Update callback will only have access to the desired state, which in practice will consist of the new latest value (pulled by Read) and the old previous value. The old latest value (which we want to store in previous after updating, so the corresponding image can be deleted later) will not be available. The prior value of previous (ugh) would still be there, since the Read operation has nowhere else to read it from and should just write back the old value.

Potentially this can be remedied by something like:

            "latest": {
                Type:        schema.TypeString,
                Computed:    true,
            },
            "all": {
                Type:        schema.TypeList,
                Computed:    true,
                Elem:        &schema.Schema{Type: schema.TypeString},
            },

The all field would be deliberately redundant in that its first element duplicates latest, so that all the needed information is available on update.

In summary:

  1. Create will be basically identical to docker_image's create hook, except that it also sets all = [latest]
  2. Read will check for new images. If one is available, it sets latest. It always re-sets all back to its existing value to avoid a spurious diff.
  3. Update uses latest to pull a new image. If successful, it deletes all images in all except the first one (which is the old value of latest) - in general there should only be two elements in all, so only one image gets deleted. If it succeeds with that, it sets all = [latest, all[0]].
  4. Delete deletes all the images listed in all (again, there should always be two).
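The rotation in step 3 can be sketched as a small Go helper (illustrative only; the real Update hook would work against *schema.ResourceData and the Docker client):

```go
package main

import "fmt"

// updateAll models step 3 above: after pulling the image named by latest,
// every entry in all except the first (the old latest value) is slated for
// deletion, and all is reset to [latest, all[0]]. In the common case all
// has two elements, so exactly one image gets deleted.
func updateAll(latest string, all []string) (newAll, toDelete []string) {
	if len(all) > 1 {
		toDelete = all[1:]
	}
	if len(all) > 0 {
		newAll = []string{latest, all[0]}
	} else {
		// Degenerate case: nothing tracked yet, behave like Create.
		newAll = []string{latest}
	}
	return newAll, toDelete
}

func main() {
	newAll, toDelete := updateAll("c0ffee12", []string{"3a9b8eca", "8ff57c4f"})
	fmt.Println(newAll, toDelete) // [c0ffee12 3a9b8eca] [8ff57c4f]
}
```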

xanderflood commented 5 years ago

Two notes:

First of all, it seems like pull_triggers or something analogous is necessary after all. I didn't realize that changes to a computed field during Read won't trigger an update, so if the goal is an automatically-updated image, Terraform requires a user-provided signal to trigger the update.

Secondly, there's another slightly different use case that could be covered with a slight change. Suppose a user wants to point to successively versioned images (myorg/mymicroservice:v1.1, then later myorg/mymicroservice:v1.2). Since name forces new, this runs into the same issue as using docker_image. The user would have to create a separate docker_tag resource, migrate all containers and services, then delete the old resource in a separate apply.

Suppose we make two changes to the above design: (1) name no longer has ForceNew, and (2) all contains a list of fully-qualified names (repo:tag@sha256:digest) instead of just digests. This way the user can seamlessly migrate from one tag to another and the resource will still manage cleaning up old images and without doing it too soon.
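With fully-qualified entries in all, cleanup would need to separate the name from the digest. A hypothetical helper, assuming references of the form repo:tag@sha256:digest:

```go
package main

import (
	"fmt"
	"strings"
)

// splitRef separates an assumed fully-qualified reference of the form
// repo:tag@sha256:digest into its name and digest parts, so migration can
// compare names while cleanup deletes by digest. Hypothetical helper, not
// part of the provider.
func splitRef(ref string) (name, digest string) {
	if i := strings.Index(ref, "@"); i >= 0 {
		return ref[:i], ref[i+1:]
	}
	return ref, "" // bare name with no digest
}

func main() {
	name, digest := splitRef("myorg/mymicroservice:v1.2@sha256:8ff57c4f")
	fmt.Println(name, digest) // myorg/mymicroservice:v1.2 sha256:8ff57c4f
}
```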

Any thoughts on this? In this case I'd propose choosing a different name from docker_tag since it no longer represents a single tag during its lifecycle - maybe something like docker_image_manager? Also note that this won't work very well if you're trying to maintain several containers running different versions of the same image.

xanderflood commented 4 years ago

This PR contains a suggested implementation (it's based on the edge branch that I'm using, but the diff will show an accurate reflection of the changes I made). I refactored the docker_image resource and docker_registry_image data source a little to reuse code, but I shouldn't have changed any functionality there. I've been using it personally for a bit and feel that it fits my needs - i.e., pulling new images from a "name" and cleaning up after itself, but giving you the ability not to delete an image until all the resources using it have migrated off.