containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0
23.22k stars 2.37k forks source link

podman save: (multi) tag handling #649

Closed vrothberg closed 6 years ago

vrothberg commented 6 years ago

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

Description

This issue describes two issues related to how podman-save deals with tags. I did not want to split it into two issues as it may cause a discussion about how libpod should behave with respect to tags and also if it should behave as Docker does.

(1) Using image ID adds a tag to the manifest.json

The manifest.json file in example.tar will include a tag of the image. Docker behaves differently as it will set "RepoTags":null, which is something I would expect from libpod/Podman as well.

(2) Using multiple images fails

Example:

# podman save -o foo.tar docker.io/library/alpine:latest docker.io/library/foo:bar docker.io/library/foo:123
Getting image source signatures
Copying blob sha256:cd7100a72410606589a54b932cabd804a17f9ae5b42a1882bd56d263e02b6215
 4.20 MB / 4.20 MB [========================================================] 0s
Copying config sha256:3fd9065eaf02feaf94d68376da52541925650b81698c53c6824d92ff63f98353
 1.48 KB / 1.48 KB [========================================================] 0s
Writing manifest to image destination
Storing signatures
unable to save "docker.io/library/foo:bar": Error copying image to the remote destination: Error initializing destination docker-archive:foo.tar:docker.io/library/foo:bar: docker-archive doesn't support modifying existing images

Notice that foo:bar and foo:123 are just tags for alpine:latest, so they all refer to the same image. In this case, Docker would create an archive with all three tags set in manifest.json.

Output of podman version:

Version:       0.4.2
Go Version:    go1.9.4
OS/Arch:       linux/amd64
rhatdan commented 6 years ago

@umohnani8 WDYT

rhatdan commented 6 years ago

@mtrmac PTAL

mtrmac commented 6 years ago

(Related: https://github.com/containers/image/issues/447 )

Using image ID adds a tag to the manifest.json

Do you mean a hexadecimal ID? Looking at https://github.com/projectatomic/libpod/blob/master/cmd/podman/save.go#L113, the argument should be passed unmodified to archiveTransport.ParseReference, which should reject such input completely. Am I overlooking something?

If you mean a name, it seems to me rather useful to include a name:tag in the archive, otherwise after docker load the image is inaccessible unless the user somehow out-of-band passes the image ID. (OTOH, yes, it is annoying that one can’t docker load an archive without allowing the archive to override all tags, IIRC). I don’t feel strongly about this.

(2) Using multiple images fails

Yes, that’s awkward to do in the c/image design, which works one image at a time and does not really have a concept of “repositories” (apart from PolicyConfigurationNamespaces), let alone open/WIP/finished repositories.

If you are willing to pay the performance cost, it would of course be possible to unpack the archive, add more images into it, and repack it, or something like that. At O(avg_image_size^2) I/O operations that’s fairly costly—but as long as only people who really want this need to pay that cost, why not.

mtrmac commented 6 years ago

If you are willing to pay the performance cost, it would of course be possible to unpack the archive, add more images into it, and repack it

Or, well, make the functionality available only to a knowledgeable caller willing to go beyond the “image at a time” abstraction (which could be podman save): like we already have the “unpacked directory structure” oci: transport, provide a similar transport for a docker save-formatted-unpacked archive; the caller would have to knowingly initialize the directory structure, and knowingly convert it into a tarball.

vrothberg commented 6 years ago

Do you mean a hexadecimal ID? Looking at https://github.com/projectatomic/libpod/blob/master/cmd/podman/save.go#L113, the argument should be passed unmodified to archiveTransport.ParseReference, which should reject such input completely. Am I overlooking something?

@mtrmac Yes, I used the image ID provided by podman-images. I had a look a the code and also ended up in ParseReference, which is not rejecting IDs (see idOrDigest) but it seems to always compute a name for the returned image. In this special case, I think we could skip the name computation. However, I am not familiar with potential side-effects.

mtrmac commented 6 years ago

I had a look a the code and also ended up in ParseReference, which is not rejecting IDs (see idOrDigest)

That’s the source side, in storageTransport.ParseReference? I am looking at archiveTransport.ParseReferencereference.ParseNormalizedNamedanchoredIdentifierRegexp.MatchString. Also

$ skopeo --insecure-policy copy docker://busybox docker-archive:foo:4de1899892c07c1e16b0f67dc228b6fa151e799c2529fa17ce69a0997635be77
FATA[0000] Invalid destination name docker-archive:foo:4de1899892c07c1e16b0f67dc228b6fa151e799c2529fa17ce69a0997635be77: docker-archive parsing reference: invalid repository name (4de1899892c07c1e16b0f67dc228b6fa151e799c2529fa17ce69a0997635be77), cannot specify 64-byte hexadecimal strings 

if using a hexadecimal ID as a tag works for you, I must be overlooking something basic in the podman code.

mtrmac commented 6 years ago

Apart from the above, me not seeing how any of this works right now — I can’t immediately see why a name:tag would have to be mandatory in the created archive; the code requires it now, but it seems easy to make it optional, if there really are valuable uses for that.

The only rationale for making it mandatory that I can find now is https://github.com/containers/image/blob/e51d350816912bbd09a91f690cd394b6225c043d/docker/daemon/daemon_transport.go#L46 .

vrothberg commented 6 years ago

if using a hexadecimal ID as a tag works for you, I must be overlooking something basic in the podman code.

In case of podman-save, we jump pretty soon to storageTransport.ParseStoreReference, and I don't see the one from archiveTransport (which would fail parsing as you've mentioned). We're calling it here: https://github.com/projectatomic/libpod/blob/master/libpod/image/image.go#L318

Apart from the above, me not seeing how any of this works right now — I can’t immediately see why a name:tag would have to be mandatory in the created archive; the code requires it now, but it seems easy to make it optional, if there really are valuable uses for that.

I agree. Given there are no unexpected side-effects, we could special case if the specified reference string is an ID (see https://github.com/containers/image/blob/e51d350816912bbd09a91f690cd394b6225c043d/storage/storage_transport.go#L164) and skip the tagging.

mtrmac commented 6 years ago

In case of podman-save, we jump pretty soon to storageTransport.ParseStoreReference, and I don't see the one from archiveTransport (which would fail parsing as you've mentioned). We're calling it here: https://github.com/projectatomic/libpod/blob/master/libpod/image/image.go#L318

Sure, that’s from https://github.com/projectatomic/libpod/blob/master/cmd/podman/save.go#L116 .

But I’m looking at https://github.com/projectatomic/libpod/blob/master/cmd/podman/save.go#L114https://github.com/projectatomic/libpod/blob/master/libpod/image/image.go#L418 , which should AFAICS end up as archiveTransport.ParseReference. (After all, some input must be passed to docker/archive.ParseReference, because that’s the only way to get a docker-archive: destination to write to.)

Given there are no unexpected side-effects, we could special case if the specified reference string is an ID (see https://github.com/containers/image/blob/e51d350816912bbd09a91f690cd394b6225c043d/storage/storage_transport.go#L164) and skip the tagging.

I can’t see how the tagging, or not tagging, in containers/storage, affects the reference used when writing to docker-archive:. https://github.com/projectatomic/libpod/blob/master/cmd/podman/save.go#L110 decides to base both the source and destination on the same string input, but the destination is not computed from the source.

Maybe it’s time for me to actually install podman :)

umohnani8 commented 6 years ago

From what I can see, podman save fails if a hexadecimal is used as a tag:

➜  libpod git:(master) ✗ sudo podman save -o alp.tar alpine:3fd9065eaf02feaf94d68376da52541925650b81698c53c6824d92ff63f98353
unable to find 'alpine:3fd9065eaf02feaf94d68376da52541925650b81698c53c6824d92ff63f98353' in local storage: unable to find a name and tag match for alpine in repotags

But just giving the image ID instead works fine:

➜  libpod git:(master) sudo podman save -o alp.tar 3fd9065eaf02f                                                   
Getting image source signatures
Copying blob sha256:cd7100a72410606589a54b932cabd804a17f9ae5b42a1882bd56d263e02b6215
 4.20 MB / 4.20 MB [========================================================] 0s
Copying config sha256:3fd9065eaf02feaf94d68376da52541925650b81698c53c6824d92ff63f98353
 1.48 KB / 1.48 KB [========================================================] 0s
Writing manifest to image destination
Storing signatures

However the id is saved as the name of the image in the repo tags, which I agree is a bug.

I agree to change it such that repoTags is empty if an image ID is used for save. When loading the image back, we can generate a hexadecimal for it and save it as "none". @mtrmac WDYT?

mtrmac commented 6 years ago

➜ libpod git:(master) sudo podman save -o alp.tar 3fd9065eaf02f

Ah, so that works because the supplied ID is not full 64 characters. That’s what I’ve been missing. Thanks!

However the id is saved as the name of the image in the repo tags, which I agree is a bug.

Yes… and awkward, when abcde can be both an ID prefix and a name.


When loading the image back, we can generate a hexadecimal for it and save it as "none".

There is already code to handle unnamed sources in Runtime.getPullListFromRef. Is that sufficient? (If you mean using the literal name "none" for the image, that seems surprising to me. Can we keep the image unnamed, reachable only using the ID, instead?)

umohnani8 commented 6 years ago

@mtrmac I meant leave the image unnamed. When an image is unnamed it shows up as "none" when you do podman images (sorry should have phrased that better earlier). Yeah, I know we already handle the pulling part for unnamed sources as I came across images when doing load that did not have a Repo Tag. Just need to modify the save part to not save the id as the name. As for the save multiple tags to one archive, is that something we really want to do even though it is not great performance wise as you pointed out?

mtrmac commented 6 years ago

As for the save multiple tags to one archive, is that something we really want to do even though it is not great performance wise as you pointed out?

Per https://github.com/containers/image/issues/447#issuecomment-383074842 , it seems not necessary right now. Maybe podman could expose the capability in https://github.com/containers/image/issues/447 to use multiple tags for a single image.

umohnani8 commented 6 years ago

Fixed in #819

Asgoret commented 5 years ago

Still here. I tried save 43 images at once. Save only one image and stopped. $podman save > okd.tar {REGISTRY}/okd/coreos/configmap-reload:v0.0.1 {REGISTRY}/okd/openshift/origin-node:v3.11.0 ..... Podman info

host:
  BuildahVersion: 1.5-dev
  Conmon:
    package: podman-0.11.1.1-3.git594495d.el7.centos.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 1.12.0-dev, commit: d61f1d176b2b01a5e46efbf236427ece745b92f2-dirty'
  Distribution:
    distribution: '"centos"'
    version: "7"
  MemFree: 145330176
  MemTotal: 8201146368
  OCIRuntime:
    package: containerd.io-1.2.2-3.el7.x86_64
    path: /usr/sbin/runc
    version: |-
      runc version 1.0.0-rc6+dev
      commit: 96ec2177ae841256168fcf76954f7177af9446eb
      spec: 1.0.1-dev
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 4
  hostname: okd-master1
  kernel: 3.10.0-957.1.3.el7.x86_64
  os: linux
  rootless: false
  uptime: 4h 52m 2.73s (Approximately 0.17 days)
insecure registries:
  registries: []
registries:
  registries:
  - docker.io
store:
  ContainerStore:
    number: 212
  GraphDriverName: overlay
  GraphOptions:
  - overlay.override_kernel_check=true
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
  ImageStore:
    number: 43
  RunRoot: /var/run/containers/storage
mheon commented 5 years ago

We added a test case in #819 which should have caught any regressions here, so this is strange @umohnani8 PTAL

umohnani8 commented 5 years ago

@Asgoret I just tried it locally with about 10 tags and it worked fine for me. What is your podman version podman version?

Asgoret commented 5 years ago

@umohnani8 It's dev stend and destroyed every time when tests end, i can't get any information from there( I can say, that we didn't use any specific repos, only base for centos7 and epel.

mtrmac commented 5 years ago

Still here. I tried save 43 images at once. Save only one image and stopped. $podman save > okd.tar {REGISTRY}/okd/coreos/configmap-reload:v0.0.1 {REGISTRY}/okd/openshift/origin-node:v3.11.0 .....

@Asgoret Are those 43 different tags for the same image, or 43 different images? All of this issue, and #819, only supports different tags for a single image.

Asgoret commented 5 years ago

@mtrmac different images, tags and repo. For example: image #1: docker.io/openshift/node:v3.11.0 image #2: quau.io/coreos/prom:wed2dea image: #3: redhat.registry.io/etcd:1.25

umohnani8 commented 5 years ago

@Asgoret that is not going to work, we only have support for the same image but different tags. for example: docker.io/library/alpine:3.1, docker.io/library/alpine:3.4 etc.

mtrmac commented 5 years ago

To be explicit, it has to be the very same image; not a same repo with different tags pointing to different images (as would be common with different versions of the same product).

Asgoret commented 5 years ago

@umohnani8 @mtrmac ok...i understand my mistake. Thanks for helping!