carvel-dev / imgpkg

Store application configuration files in Docker/OCI registries
https://carvel.dev/imgpkg
Apache License 2.0
Create a describe command for a bundle #124

Open jorgemoralespou opened 3 years ago

jorgemoralespou commented 3 years ago

Describe the problem/challenge you have

As a user, I want to know the contents of a bundle before pulling it down completely. What matters most is knowing the images in the bundle (and in dependent bundles), because when I copy (relocate) a whole bundle (and its images) I want to know what happened and why.

Describe the solution you'd like

I would want a command like:

imgpkg describe -b my-bundle

that will print the following details:

This way a user could run some diagnostics to determine the provenance of a bundle, or to work out what they need to plan (in terms of a CopyConfig file) to copy the images properly.

jorgemoralespou commented 3 years ago

Also related: https://github.com/vmware-tanzu/carvel-community/tree/develop/proposals/imgpkg/002-recursive-bundles#list-images-in-bundle

cari-lynn commented 3 years ago

Hey @jorgemoralespou, to get more clarification on the why here: is this command needed because, after relocating a bundle, you want to verify before downloading that 1) all the images in the bundle now live in the destination registry, and 2) if the images were renamed (following a strategy from the renaming proposal), each image is named correctly and shows its most recent provenance?

Does this capture the why you need?

jorgemoralespou commented 3 years ago

Correct. That mostly summarizes the ask. Also, not only after a relocation: I would want to get detailed information on a bundle's contents at any point. A bundle might have been transported/relocated anywhere from zero to multiple times, and I want to get details on its contents.

cari-lynn commented 3 years ago

Do you want to see details on the bundle's contents in order to verify that all the files were copied successfully? Or is there another reason that this information is helpful? I'm trying to drill down to exactly what use case we are addressing here.

Is seeing the file names and directory structure the kind of information that you need? For example something like this (copied from here):

.
├── .imgpkg
│   ├── bundles
│   │   ├── sha256-{SHA Of the First Nested Bundle}
│   │   │   ├── .imgpkg
│   │   │   │   ├── bundle.yml
│   │   │   │   └── images.yml
│   │   │   └── config2.yml
│   │   └── sha256-{SHA Of the Second Nested Bundle}
│   │       ├── .imgpkg
│   │       │   ├── bundle.yml
│   │       │   └── images.yml
│   │       └── config1.yml
│   ├── bundle.yml
│   └── images.yml
└── config.yml

jorgemoralespou commented 3 years ago

It's not the files (we assume the files are always copied correctly; why wouldn't they be?). It's the content of the files (e.g., the content of images.yml), which can change in every copy operation.

joaopapereira commented 3 years ago

Just a clarification: the files inside the bundles never change, even during the copy process; otherwise we would have a different SHA between copies.

@jorgemoralespou let me know if this makes sense in terms of UX

$ imgpkg describe -b new.registry.io/simple-app-install-package

Bundle SHA: aaaaad700949154e429d28661d01c99d53a38af0d5275842ccbf0bf6dbef8ca4
Tags: latest, v1.0.0

Authors:
  Carvel Team <carvel@vmware.com>
Websites:
  carvel.dev/imgpkg
Metadata:
  - Some Version: 1.0.0
  - Other Information: Some text here

Copy Strategy: SingleRepository

Images:
  - new.registry.io/simple-app-install-package@sha256:d211dd700949154e429d28661d01c99d53a38af0d5275842ccbf0bf6dbef8ca4 (Bundle)
    Origin: my.registry.io/bundle1@sha256:d211dd700949154e429d28661d01c99d53a38af0d5275842ccbf0bf6dbef8ca4
    Images:
      - new.registry.io/simple-app-install-package@sha256:4c8b96d4fffdfae29258d94a22ae4ad1fe36139d47288b8960d9958d1e63a9d0
        Origin: registry.io/img1@sha256:4c8b96d4fffdfae29258d94a22ae4ad1fe36139d47288b8960d9958d1e63a9d0
        Annotations:
          kbld.carvel.dev/id: my.registry.io/simple-application

      - new.registry.io/simple-app-install-package@sha256:47ae428a887c41ba0aedf87d560eb305a8aa522ffb80ac1c96a37b16df038e0f
        Origin: registry.io/img2@sha256:47ae428a887c41ba0aedf87d560eb305a8aa522ffb80ac1c96a37b16df038e0f
  - new.registry.io/simple-app-install-package@sha256:47ae428a887c41ba0aedf87d560eb305a8aa522ffb80ac1c96a37b16df038e0f
    Origin: registry.io/img2@sha256:47ae428a887c41ba0aedf87d560eb305a8aa522ffb80ac1c96a37b16df038e0f

jorgemoralespou commented 3 years ago

@joaopapereira I think that even if the files don't change, @cari-lynn might have a point: showing the file structure might be useful in a describe command, specifically since it shows nested bundles, and it can show other things, such as whether there are overlays in the bundle (e.g., a package bundle).

Maybe there should be two different outputs based on flags, so that the default output doesn't show too much information but users can still see what they want to see.

Does this make sense?

joaopapereira commented 3 years ago

I think it makes sense to have this information in some way flag-dependent, something like --show-images, --show-fs-layout, --show-all.

Nevertheless, I would like to first try to understand what, in your opinion, is the Most Valuable information that we should surface from the start. After that, we can evolve this command to have more information as we have more user feedback for it.
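To make the flag idea concrete, here is a minimal Python sketch of how flag-dependent output could be selected. It is only an illustration under assumptions: the flag names are the ones floated above as "something like", they are not existing imgpkg flags, and the `describe-sketch` program name, the section names, and the `sections_to_print` helper are hypothetical.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags proposed in this thread (hypothetical)."""
    parser = argparse.ArgumentParser(prog="describe-sketch")
    parser.add_argument("-b", "--bundle", required=True)
    parser.add_argument("--show-images", action="store_true")
    parser.add_argument("--show-fs-layout", action="store_true")
    parser.add_argument("--show-all", action="store_true")
    return parser


def sections_to_print(argv: List[str]) -> List[str]:
    """Return the output sections selected by the given command-line flags."""
    args = build_parser().parse_args(argv)
    sections = ["summary"]  # assumed default: a short summary is always shown
    if args.show_images or args.show_all:
        sections.append("images")
    if args.show_fs_layout or args.show_all:
        sections.append("fs-layout")
    return sections


print(sections_to_print(["-b", "my-bundle"]))
print(sections_to_print(["-b", "my-bundle", "--show-all"]))
```

Keeping the default output small and treating `--show-all` as the union of the other flags matches the goal of not printing too much by default while still letting users opt in to more detail.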

jorgemoralespou commented 3 years ago

I think that:

That really doesn't bring down the list much :-D

pivotaljohn commented 3 years ago

@jorgemoralespou — it can really help to drill down to the why, here. Often a solid clue comes from how the information gets used.

"As a user I want to know the contents of a bundle ... before pulling it completely down."

We haven't yet explicitly stated it, here, but I assume this is because it's expensive (in time, in rate limit, in bandwidth) to just pull down the bundle. Are these important sources of waste that this feature would mitigate?

"Specifically what's really important is to know the images in the bundle (and dependent bundles) as when I copy (relocate) a whole bundle (and images) I would want to know what happened and why."

As we've teased out in this issue, so far, there are two primary pieces of information the user wants to get at:

  1. the contents of the bundle, and
  2. the purported origins of the bundle

Listing of the Contents of a Bundle

I think I can name some use-cases that show "what's next" with the contents:

@jorgemoralespou can you edit these and/or add other situations where having the listing of contents of the bundle satisfies some task?

Origins of a Bundle

The second item has me a little concerned. What question am I trying to answer by looking at a set of copy receipts?

If I'm trying to prove that a Bundle (and/or its contents) originated from a particular source, then this is a dangerous way to go about it.

There are only two things I know of that can help establish this fact:

  1. a signature verifiable to an authority
  2. a digest that matches the one published by an authority

It's my understanding that everything else can be spoofed... and if someone is actually asking the question, I suspect they want confidence in the answer.
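As a minimal illustration of the second point, the Python sketch below checks content against a trusted digest. The `matches_digest` function and the sample bytes are hypothetical, and this is not how imgpkg itself verifies anything; it only shows that a `sha256:` digest can be recomputed from the content, whereas a recorded origin string cannot.

```python
import hashlib
import hmac


def matches_digest(content: bytes, trusted_digest: str) -> bool:
    """Return True if content hashes to the trusted 'sha256:<hex>' digest."""
    algorithm, _, expected_hex = trusted_digest.partition(":")
    if algorithm != "sha256" or not expected_hex:
        raise ValueError(f"unsupported digest format: {trusted_digest!r}")
    actual_hex = hashlib.sha256(content).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual_hex, expected_hex.lower())


# Hypothetical stand-in for a manifest fetched from a registry.
manifest = b'{"hypothetical": "manifest bytes"}'
# Assumption: in practice this digest would come from the authority, not be computed locally.
trusted = "sha256:" + hashlib.sha256(manifest).hexdigest()

print(matches_digest(manifest, trusted))         # unchanged content
print(matches_digest(manifest + b" ", trusted))  # altered content
```

The check is only as strong as the channel through which the trusted digest was obtained, which is the concern raised above about copy receipts.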

I feel like I'm likely missing something here.

@jorgemoralespou can you help name the task(s) where having copy receipts in hand is needed? And, for those tasks, are the receipts valid enough to be useful?

jorgemoralespou commented 3 years ago

"I want to check to see if the bundle I'm eyeing is the right bundle... so I want to see what's in it to verify that for myself... without having to download the whole thing."

This one is perfect as this was the use case I had in mind.

"I've got a bunch of bundles and I want to select the one that contains at least some set of images I know I want... so I'm poking around the contents of various bundles to find the one that has that set."

I'm not sure how valuable this one is. I don't know yet whether I would need to poke through various bundles to find one with a set of images I want.

"What question am I trying to answer by looking at a set of copy receipts?"

For this one, I don't mean proving provenance in the sense of a fact that needs to be signed. When an image is relocated, and depending on the strategy selected (based on the imgpkg copy spec in progress), I want to know the original fully qualified names (FQNs) of the images in case I need to open a support ticket or consult the docs. The main reason is that the FQNs of the images in the deployment descriptors might be totally different from those in the docs or original descriptors, which makes it difficult to triage and report problems.
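A minimal Python sketch of that lookup, assuming the describe output were available as nested data shaped like the mock-up earlier in this thread. The field names `image`, `origin`, and `images` and the `walk` helper are hypothetical; the references and digests are copied from that mock-up, not from a real registry.

```python
from typing import Dict, Iterator, List, Tuple

REPO = "new.registry.io/simple-app-install-package"
BUNDLE = "sha256:d211dd700949154e429d28661d01c99d53a38af0d5275842ccbf0bf6dbef8ca4"
IMG1 = "sha256:4c8b96d4fffdfae29258d94a22ae4ad1fe36139d47288b8960d9958d1e63a9d0"
IMG2 = "sha256:47ae428a887c41ba0aedf87d560eb305a8aa522ffb80ac1c96a37b16df038e0f"

# Hypothetical structure modelled on the mock-up output; field names are assumptions.
described: List[dict] = [
    {
        "image": f"{REPO}@{BUNDLE}",
        "origin": f"my.registry.io/bundle1@{BUNDLE}",
        "images": [
            {"image": f"{REPO}@{IMG1}", "origin": f"registry.io/img1@{IMG1}"},
            {"image": f"{REPO}@{IMG2}", "origin": f"registry.io/img2@{IMG2}"},
        ],
    },
    {"image": f"{REPO}@{IMG2}", "origin": f"registry.io/img2@{IMG2}"},
]


def walk(entries: List[dict]) -> Iterator[Tuple[str, str]]:
    """Yield (relocated reference, origin reference) pairs, including nested bundles."""
    for entry in entries:
        yield entry["image"], entry["origin"]
        yield from walk(entry.get("images", []))


# Map each relocated reference back to the name used in the original docs.
origin_of: Dict[str, str] = dict(walk(described))
print(origin_of[f"{REPO}@{IMG1}"])
```

With a mapping like this, the name that appears in a deployment descriptor can be translated back to the original FQN before filing a support ticket.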

etirta commented 7 months ago

Hi,

Correct me if I'm wrong, but I thought the describe command should display the metadata as written in .imgpkg/bundle.yml?

I put some data:

$ cat .imgpkg/bundle.yml 
metadata:
  oslManifestURL: ...

before I do imgpkg push -b ..., but it's not appearing when I do imgpkg describe -b ...:

$ imgpkg describe -b ... --output-type yaml
sha: sha256:bb59ca320f34063174bc9c00bc6f5b795d223c3b9b965851038ce85ebe4eb1d2
content:
...
image: ...
metadata: {}
origin: ...

Succeeded

The code seems to never populate this; it just puts an empty map in as a placeholder.

Can this please be rectified, so I can retrieve this information via imgpkg describe -b ...?

If I do imgpkg pull -b ..., the retrieved .imgpkg/bundle.yml has the information, but I don't want to have to pull the bundle, as it can be huge. Pulling the whole bundle just to get .imgpkg/bundle.yml is a waste of time and local storage.

Thx.

praveenrewar commented 6 months ago

@etirta I think you are right, do you mind creating a new issue (bug) for this?