carvel-dev / vendir

Easy way to vendor portions of git repos, github releases, helm charts, docker image contents, etc. declaratively
https://carvel.dev/vendir
Apache License 2.0
287 stars 52 forks source link

Add option to unpack archive downloaded when using "http". #150

Open GrahamDumpleton opened 2 years ago

GrahamDumpleton commented 2 years ago

Describe the problem/challenge you have

When downloading archive file using http I need the archive file to be automatically unpacked.

Describe the solution you'd like

The githubRelease option has unpackArchive for something similar although in that case you need to designate the file to unpack. In the case of http, there is only the one asset file so could get away with a single boolean flag to indicate whether download should be unpacked.

apiVersion: vendir.k14s.io/v1alpha1
kind: Config
directories:
- contents:
  path: src/wrapt
  - http:
      url: https://github.com/GrahamDumpleton/wrapt/archive/refs/heads/develop.tar.gz
      unpackArchive: true
    path: .

Anything else you would like to add:

I am assuming that once unpacked you can still use newRootPath so as to indicate you want the final result to only include files which existed in a subdirectory of the archive.

For example, in the archive above the top level directory it unpacks into is wrapt-develop. I am hoping one can do:

apiVersion: vendir.k14s.io/v1alpha1
kind: Config
directories:
- contents:
  path: src/wrapt
  - http:
      url: https://github.com/GrahamDumpleton/wrapt/archive/refs/heads/develop.tar.gz
      unpackArchive: true
    path: .
  newRootPath: wrapt-develop

so that top level wrapt-develop directory in the archive is eliminated and you get just the files in it.


Vote on this request

This is an invitation to the community to vote on issues, to help us prioritize our backlog. Use the "smiley face" up to the right of this comment to vote.

👍 "I would like to see this addressed as soon as possible" 👎 "There are other more important things to focus on right now"

We are also happy to receive and review Pull Requests if you want to help working on this issue.

GrahamDumpleton commented 2 years ago

Note, it would be required that when unpacking the archive that the option exists to preserve execute bits.

Of note, when unpacking an archive from a githubRelease this doesn't happen, although there is a similar requirement for it as well to allow execute bits to be preserved.

joe-kimmel-vmw commented 2 years ago

Thanks @GrahamDumpleton ! it sounds like preserving the execute mode bits could present some stumbling blocks but we'll leave that to the initial implementation exploration.

knechtionscoding commented 1 year ago

@GrahamDumpleton Am I missing something with http? I think this is already supported?

    # fetches asset over HTTP (optional)
    http:
      # asset URL (required)
      url: 
      # verification checksum (optional)
      sha256: ""
      # specifies name of a secret with basic auth details;
      # secret may include 'username', 'password' keys (optional)
      secretRef:
        # (required)
        name: my-http-auth
      # skip unpacking tar, tgz, and zip files; by default files are unpacked (optional)
      disableUnpack: false

from: https://carvel.dev/vendir/docs/v0.32.0/vendir-spec/

Specifically:

      # skip unpacking tar, tgz, and zip files; by default files are unpacked (optional)
      disableUnpack: false
knechtionscoding commented 1 year ago

It works with zip files. I think the problem is it doesn't work with .tar.gz files?

GrahamDumpleton commented 1 year ago

I suspect unpacking for http didn't exist back then and was since added but this issue was never updated. Or perhaps disableUnpack didn't exist and so wasn't obvious that things were unpacked at all. Either way, it does need to handle *.tar.gz if it doesn't already.

knechtionscoding commented 1 year ago

Understandable, just wanted to make you aware if you still needed it. I'll look and see if I can figure out why it doesn't support .tar.gz files. And maybe submit a PR for it.

reegnz commented 1 year ago

If it helps anyone, this is where the logic is: https://github.com/carvel-dev/vendir/blob/bc7a968800ebec21b2dd0d3782e359fdbed0e0ca/pkg/vendir/fetch/archive.go#L147-L160

And I get this error for my 'tar.gz' file:

2023/08/08 17:13:16 Unknown file 'pax_global_header' (103)

That is tar.TypeXGlobalHeader, which I guess is safe to skip.

I guess the problem is that the custom logic looking into the tar.gz file is not aware there are multiple formats, like USTAR, PAX and GNU.

Looking at some golang internal code: https://pkg.go.dev/golang.org/x/build/internal/untar

reegnz commented 1 year ago

While looking into it I realized that the fix is so trivial that I opened a PR.

praveenrewar commented 1 year ago

Thank you so much for looking into this @reegnz and creating a PR. Do you always get the same error while unpacking tar.gz files or this happens only in certain cases? Also, do you mind creating a new issue with the details, as @knechtionscoding rightly pointed out, the feature to unpack archive files has been there for a long time and it works by default, so we can close this issue.

GrahamDumpleton commented 1 year ago

If this issue is going to be closed, then a new issue also needs to be created for the fact that execute permissions are not preserved when archives are unpacked. This was also mentioned above in this issue.

This preservation of execute permissions is also lacking with archives unpacked from GitHub releases. To round things out, there should also be a way to set execute permissions when provided files using inline. Preservation of execute permissions was added for OCI images, but it needs to be done for the other cases as well to be consistent and avoid hacks to work around the limitation. Vendir is used for more than just YAML files these days.

praveenrewar commented 1 year ago

Thank you for pointing this out @GrahamDumpleton 🙇🏻 it had almost skipped my mind 😅 Would you like to create a separate issue for preserving execute permissions? (Or let me know and I can create one based on the info you have provided)

reegnz commented 1 year ago

@praveenrewar opened a separate issue for the fix and linked the PR to that one. I'm strictly focused on the extraction issue, the permissions issue I'll let someone else pick up instead.

praveenrewar commented 1 year ago

Thank you so much @reegnz!