cnabio / duffle

CNAB installer
https://duffle.sh
MIT License
376 stars 54 forks source link

Add `duffle rewrite` command #217

Closed technosophos closed 5 years ago

technosophos commented 6 years ago

This command is the first step toward an export/import workflow. I think we should implement it as a stand-alone because we don't necessarily always want to export when we rewrite.

Here's an example of what I imagine the command would look like:

$ duffle rewrite -f path/to/bundle.json technosophos.azurecr.io/foo

Here's an example source bundle.json:

{
    "name": "hellohelm",
    "version": "0.1.0",
    "invocationImage": {
        "imageType": "docker",
        "image": "cnab/lakebase:latest"
    },
    "images": [
        {
            "name": "alpine",
            "uri": "technosophos/demo2alpine:0.1.0",
            "refs": [
                {
                    "path": "cnab/app/charts/alpine/values.yaml",
                    "field": "image.repository"
                }
            ]
        },
               {
            "name": "alpine",
            "uri": "quay.io/technosophos/something:0.2.0",
            "refs": [
                {
                    "path": "cnab/app/charts/alpine/values.yaml",
                    "field": "image.deployment"
                }
            ]
        }
    ],
    "parameters": {},
    "credentials": {}
}

Okay, my interpretation of the original proposal is that running hte above command on the above bundle.json should do the following:

So the resulting bundle.json should be:

{
    "name": "hellohelm",
    "version": "0.1.0",
    "invocationImage": {
        "imageType": "docker",
        "image": "technosophos.azurecr.io/foo/lakebase:latest"
    },
    "images": [
        {
            "name": "alpine",
            "uri": "technosophos.azurecr.io/foo/demo2alpine:0.1.0",
            "refs": [
                {
                    "path": "cnab/app/charts/alpine/values.yaml",
                    "field": "image.repository"
                }
            ]
        },
               {
            "name": "alpine",
            "uri": "technosophos.azurecr.io/foo/something:0.2.0",
            "refs": [
                {
                    "path": "cnab/app/charts/alpine/values.yaml",
                    "field": "image.deployment"
                }
            ]
        }
    ],
    "parameters": {},
    "credentials": {}
}

And the resulting cnab/app/charts/alpine/values.yaml should be:

# Original stuff...

images:
  repository: technosophos.azurecr.io/foo/demo2alpine:0.1.0
  deployment: technosophos.azurecr.io/foo/something:0.2.0

# Other stuff from original values.yaml

We will need to decide in the future whose job it is to push the old images to the new location.

radu-matei commented 6 years ago

This looks like a great start - how about the tag name for this command? It follows the Docker analogy, plus it is a bit easier to write.

jeremyrickard commented 6 years ago

@technosophos when I read through this, I made the assumption that someone would have done a duffle pull before running this so they would have the CNAB and bundle.json locally. Is that valid?

technosophos commented 6 years ago

I like duffle tag instead of rewrite.

@jeremyrickard Yeah. You should be able to safely assume that:

If those are not true, it is fair to error out.

jeremyrickard commented 6 years ago

+1 to duffle tag

itowlson commented 6 years ago

duffle tag to me implies adding or changing a tag on the bundle, not modifying the bundle to point to a different repo. Subjective though I know...!

bacongobbler commented 6 years ago

To me, if I intend to re-tag a "thin" bundle via duffle tag, I expect the name and tag of the bundle to change, but not the parameters or underlying image tags. In other words, I only expect the name and version fields of the bundle.json to be changed and nothing else.

I envisioned duffle tag to be incredibly similar to docker tag:

docker build -t foo:latest
docker tag foo quay.io/bar:canary
docker push quay.io/bar:canary

The implication of the above use case is that I have not modified the underlying image layers (or the bundle, in our case), only the name and tag of the image (bundle).

That being said, we've been tossing around the concept of a "thick" bundle. A "thick" bundle is a bundle.json that defines the bundle and references the invocation images not by image name (technosophos/helloworld), but by the image layers that actually make up technosophos/helloworld.

Once a "thick" bundle has been re-tagged, duffle push can (and will) push the bundle along with all of its image layers in one go to the targeted registry, capturing about 90% of the same use case @jeremyrickard showed in standup. Once we actually go and implement the darned thing, we may want to consider making "thick bundles" the default output of duffle build to handle this use case. :)

The last bit @jeremyrickard demonstrated in standup today (re-tagging the refs field) would also likely go away if we switch to thick bundles by default.

We don't have a concrete example of a "thick" bundle to show yet, but I will update #215 with an example on how the bundle.json would look. Would love to pair with you on this one @jeremyrickard!

jeremyrickard commented 6 years ago

I envisioned duffle tag to be incredibly similar to docker tag:

@carolynvs and I were discussing this today and she had the same immediate thought I believe.

This morning I also tried to do an end-to-end sort of test with this and I think that how this issue was written (and how I implemented the command) will probably lead to some confusion for people.

Using hellohelm as an example:

It's bundle.json is currently:

{
    "name": "hellohelm",
    "version": "0.1.0",
    "invocationImage": {
        "imageType": "docker",
        "image": "cnab/hellohelm:latest"
    },
    "images": [
        {
            "name": "alpine",
            "uri": "technosophos/demo2alpine:0.1.0",
            "refs": [
                {
                    "path": "cnab/app/charts/alpine/values.yaml",
                    "field": "image.repository"
                }
            ]
        }
    ]
    ....some stuff omitted...

If we use the command spec for this issue and runs something like duffle tag -f hellohelm/bundle.json jeremyrickard that would get re-written to:

{
    "name": "hellohelm",
    "version": "0.1.0",
    "invocationImage": {
        "imageType": "docker",
        "image": "jeremyrickard/hellohelm:latest"
    },
    "images": [
        {
            "name": "alpine",
            "uri": "jeremyrickard/demo2alpine:0.1.0",
            "refs": [
                {
                    "path": "cnab/app/charts/alpine/values.yaml",
                    "field": "image.repository"
                }
            ]
        }
    ]
...stuff omitted

The values.yaml, however, is structured like:

image:
  pullPolicy: IfNotPresent
  repository: jeremyrickard/demo2alpine:0.1.0
  tag: 3.3
restartPolicy: Never

The difference here from the example given above is that we have the repository and tag attributes broken out, so running the tag command on this actually results in a broken end result. Doing the replace like that in the referenced things implies some structure that I'm not sure we've constrained. Rewriting the repository for the invocation image makes more sense, I think, but updating the references for something like this case seems not the best?

Maybe we can discuss at the standup or have a separate chat about this?

carolynvs commented 6 years ago

My immediate confusion was around the word "tag" because with docker, tag refers to the IMAGE:TAG format. I thought that I was going to be retag ponysparkles:abc123 to ponysparkles:v1.1, not just change the repository.

I may be missing the underlying use case for when someone would want to use this functionality. Naively, if I am only changing where it "lives", but not the underlying content, it almost sounds like a fork. Not that we need to introduce anymore confusing analogs though! 😀

jeremyrickard commented 6 years ago

OK! @bacongobbler and I had a little chat on this and I'll summarize below.

Point the first: There are probably two conceptual use cases here:

1.) "Rewriting" the image references in the bundle (i.e. invocation image and images array). We should probably not use "tag" here because semantically it isn't the same as the docker command.

2.) "Tag" a bundle, which would change it's name (note, i'm not suggesting it should work like below, just a notional example).

$ duffle search
NAME                REPOSITORY                  VERSION
hellohelm           github.com/deis/bundles.git 0.1.0
$ duffle tag hellohelm:0.1.0 jeremyhelm:1.0

This probably just updates the name and version tags in the bundle.

Point the 2nd: We think that rewrite makes sense for thin bundles, but probably not once we talk about thick bundles. While discussing this, we wondered if possibly thick bundles should be the default, with the related question: for the import/export case, should a thin bundle get converted to a thick bundle?

Point the 3rd: The rewrite command as a standalone command has two issues that we should consider:

So in summary I think the action items are:

1.) We should probably change this back to rewrite instead of tag 2.) We should create a tag command that just changes the name and version in the bundle 3.) We should determine if the rewrite command should push images after they've been rewritten and if this should only apply to thin bundles, and if so, maybe it doesn't need to exist at all after thick bundles are a thing?

@bacongobbler feel free to add any more around this!

cc: @technosophos @itowlson @carolynvs @radu-matei

bacongobbler commented 6 years ago

No, this summarized our discussion perfectly. Thanks @jeremyrickard for the thorough write-up!

To expand a little bit on one point:

for the import/export case, should a thin bundle get converted to a thick bundle?

From early discussions with @jeremyrickard, this definitely seemed like something we should consider. Imagine the following use case:

I, as a developer, want to export my thin bundle from my desktop, throw it on a USB stick, and import it on my laptop/cloud/IoT-enabled cat.

Now, imagine the process to export/import a bundle onto your IoT-enabled cat takes several days. When we export a thin bundle, we only have the image references, so we pull down the image layers and export it into the bundle. When we re-import it, we probably want to reference the layers placed on the USB stick rather than what's currently sitting in technosophos/helloworld, right? In the process it took to export/import, technosophos/helloworld may have been updated, which means that we'd miss out on the basic use case of export/import if we re-convert back to a thin bundle. In that case, we'd need to convert the manifest to a thick bundle to reference the local layers.

That also brings up another question: should we consider making thick bundles (once they're actually a thing and not some concept in our heads :smile:) the default output of duffle build? duffle rewrite may no longer be required if references to image layers stick with the bundle throughout its lifecycle. The rest of the infrastructure isn't ready to support thick bundles yet, but it does pose an interesting question in regards to the feasibility of this feature. It'd make duffle pull and duffle push much slower (because it needs to fetch all the layers), but duffle install et al would be significantly faster since all the layers would be on-disk, making for a speedy runtime experience.

Probably starting to veer a little off-topic from the current proposal (duffle rewrite/duffle tag), so please let me know if I should take this discussion elsewhere.

technosophos commented 6 years ago

Okay... wow... there is a ton to cover on this thread. Alright, I will try to articulate how this plays out in my head. then we can keep discussing.

In summary:

Phew... I need a donut and a coffee now (in that order).

jeremyrickard commented 6 years ago

@technosophos, wanted to follow up on this on hopefully a last point of clarification.

From your clarification above:

The replacement logic should be sophisticated enough that it can read an image field and parse it the way Docker does (e.g. can handle image, image:tag, org/image, org/image:tag, server/org/image, server/org/image:tag...)

And from the spec:

TODO: How do we specify multiple replacements within a single file?

If I have an input, like the hellohelm example:

image:
  pullPolicy: IfNotPresent
  repository: alpine
  tag: 3.3
restartPolicy: Never

What would that file look like if I run rewrite like:

duffle rewrite -f ~/cnab-bundles/hellohelm/bundle.json jeremyrickard

Currently, the bundle.json in that example is:

"images": [
        {
            "name": "alpine",
            "uri": "technosophos/demo2alpine:0.1.0",
            "refs": [
                {
                    "path": "cnab/app/charts/alpine/values.yaml",
                    "field": "image.repository"
                }
            ]
        }
    ],

With the bundle.json as-is, I'd expect to only replace image.repostory with something like jeremyrickard/demo2alpine, but the image.tag would be unchanged (and incorrect in this case).

If I add I add another ref to the same file, is that valid? i.e..

{
      "path": "cnab/app/charts/alpine/values.yaml",
      "field": "image.tag"
}

Assuming that's valid, it would get updated as well, but the value there should really be 0.1.0. How do we identify that correctly? AFAICT from reading the Docker image spec, 3.3 would be valid image name as well (experimentation shows it is).

docker build -t 3.3 .
Sending build context to Docker daemon   12.8kB
Step 1/3 : FROM cnab/k8sbase:latest
 ---> 0c40883b7dd7
Step 2/3 : COPY app/Makefile /cnab/app/Makefile
 ---> Using cache
 ---> 211940a28786
Step 3/3 : COPY app/charts /cnab/app/charts
 ---> Using cache
 ---> 425c9860d53c
Successfully built 425c9860d53c
Successfully tagged 3.3:latest

Or is that example not valid and the image.tag should reflect the value already in the original bundle.json (which was "uri": "technosophos/demo2alpine:0.1.0") and be set to 0.1.0?

Thanks!

technosophos commented 6 years ago

That is an excellent question. And this is going to be a huge problem if we don't do a good job of it.

What if we did something like allow an optional template field that had support for {{ .Repo }}, {{ .Image }}, and {{ .Tag }}? Do you think that would be flexible enough?

glyn commented 5 years ago

This issue corresponds to what we in the riff project call image relocation. For background, please see the design document Private Docker Registries and riff, the riff issues labelled with private-registry, and the corresponding PRs.

In the context of CNAB, we see two possible approaches to image relocation:

  1. Make image relocation the responsibility of an invocation image (probably packaged in a base invocation image and augmented with tooling). A parameter would specify the target registry and the invocation image would be responsible for tagging and pushing images to the target registry and rewriting kubernetes configuration files in the bundle to refer to the relocated image names. This approach puts the least burden on the CNAB spec and CNAB runtimes.

  2. Make image relocation the responsibility of the CNAB runtime and pass a map from original image name to relocated image name to the invocation image so that it can rewrite kubernetes configuration files in the bundle to refer to the relocated image names. This approach puts the least burden on invocation images and tooling.

Image relocation could apply to thick bundles (probably most suited to an air gapped installation) or thin bundles (where the user has a network zone with read access to the internet and read-write access to the target registry).

[The above information refers to open source versions of riff-distro and riff image commands, but these have been moved to closed source. See the PFS documentation for usage information.]

simonferquel commented 5 years ago

Please note that we already implemented that in cnab-to-oci (see https://github.com/docker/cnab-to-oci/blob/master/remotes/fixup.go#L18).

glyn commented 5 years ago

@simonferquel Thanks for that. I couldn't find the code to update refs to images during fixup. Is that supported?

glyn commented 5 years ago

https://github.com/deislabs/duffle/issues/668 aims to add image relocation support.

simonferquel commented 5 years ago

@glyn https://github.com/docker/cnab-to-oci/blob/master/remotes/fixup.go < this is where it happens (it does multiple things:

At the end of the process, the push logic converts the bundle into an OCI index representation and pushed it to the registry

The pull logic parses this OCI index into a duffle bundle structure with all image references being digested and pointing to the same repository

jeremyrickard commented 5 years ago

I believe this should be closed in support of image relocation should it not?