hashicorp / go-getter

Package for downloading things from a string URL using a variety of protocols.
Mozilla Public License 2.0
1.66k stars 238 forks source link

Support for symlink when untarring an archive #60

Open jcua opened 7 years ago

jcua commented 7 years ago

Operating System: Ubuntu 16.04

Issue: When a tarball has symlink inside it, go-getter turns the symlink into zero-length file. One benefit of having go-getter support this would be with respect to Nomad specifically with the artifact stanza https://www.nomadproject.io/docs/job-specification/artifact.html, since some tarball could contain symlink in it.

Repro:

  1. Create a test_dir.tar.gz with the following contents (note: test_ls.symlink)

    vagrant@nomad-server01:/tmp$ tree test_dir/
    test_dir/
    ├── a.txt
    └── test_ls.symlink -> /bin/ls
  2. On the same directory, execute: python -m SimpleHTTPServer 8000

  3. Run go-getter http://127.0.0.1:8000/test_dir.tar.gz ~/

  4. Check what go-getter untarred (note: test_ls.symlink is 0 bytes)

    vagrant@nomad-server01:~$ ls -l ~/test_dir/
    total 4
    -rw-r--r-- 1 vagrant vagrant 4 May  2 01:19 a.txt
    -rwxr-xr-x 1 vagrant vagrant 0 May  2 01:19 test_ls.symlink
tonyarkles commented 6 years ago

Was just burned by this through Nomad.

brmzkw commented 6 years ago

@tonyarkles same here. Since I'm using the raw_exec driver, I used tar as a workaround to extract the archive. I also extract the archive outside nomad's directory.

    task "mytask" {
      driver = "raw_exec"

      artifact {
        # https://github.com/hashicorp/go-getter/issues/60
        source = "https://url/to/archive/superapi-2.0.0-76-x86_64-xenial.tar.gz"
        options {
          archive = "false"
        }
      }

      config {
        command = "/bin/sh"
        args = [
          "-c",
          "
            mkdir -p /opt/subdir;
            tar -C /opt/subdir/ -xvf local/superapi-2.0.0-76-x86_64-xenial.tar.gz;
            /opt/ocs/superapi/bin/superapi
          "
        ]
      }
   }
prologic commented 5 years ago

This is breaking us in production :/ Is there an ETA for a fix?

johnzhanghua commented 5 years ago

https://github.com/hashicorp/go-getter/pull/37

Looks there is a PR for that, haven't got in.

prologic commented 5 years ago

👍 nice what’s the blocker on merging?

On Fri, 22 Feb 2019 at 18:15, johnzhanghua notifications@github.com wrote:

37 https://github.com/hashicorp/go-getter/pull/37

Looks there is a PR for that, haven't got in.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/hashicorp/go-getter/issues/60#issuecomment-466312124, or mute the thread https://github.com/notifications/unsubscribe-auth/ABOv-gWqDIYxPH1oRBDUe-Y2w1LQsldhks5vP6cggaJpZM4NOOn7 .

--

James Mills / prologic

E: prologic@shortcircuit.net.au W: prologic.shortcircuit.net.au

prologic commented 5 years ago

As a temporary fix we may be able to build our own Nomad until this PR gets merged

On Fri, 22 Feb 2019 at 18:34, James Mills prologic@shortcircuit.net.au wrote:

👍 nice what’s the blocker on merging?

On Fri, 22 Feb 2019 at 18:15, johnzhanghua notifications@github.com wrote:

37 https://github.com/hashicorp/go-getter/pull/37

Looks there is a PR for that, haven't got in.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/hashicorp/go-getter/issues/60#issuecomment-466312124, or mute the thread https://github.com/notifications/unsubscribe-auth/ABOv-gWqDIYxPH1oRBDUe-Y2w1LQsldhks5vP6cggaJpZM4NOOn7 .

--

James Mills / prologic

E: prologic@shortcircuit.net.au W: prologic.shortcircuit.net.au

--

James Mills / prologic

E: prologic@shortcircuit.net.au W: prologic.shortcircuit.net.au

prologic commented 5 years ago

I might also take this PR and re-factor into a new one because it’s over two years old and the original comments were not addressed

On Fri, 22 Feb 2019 at 18:39, johnzhanghua notifications@github.com wrote:

Yes. Looks we need to do that. Can't wait. I will have a go for that tomorrow. With the symbol link fix. You do the packaging extraction.

Thanks, John

On Fri, 22 Feb 2019 at 6:36 pm, James Mills notifications@github.com wrote:

As a temporary fix we may be able to build our own Nomad until this PR gets merged

On Fri, 22 Feb 2019 at 18:34, James Mills prologic@shortcircuit.net.au wrote:

👍 nice what’s the blocker on merging?

On Fri, 22 Feb 2019 at 18:15, johnzhanghua notifications@github.com wrote:

37 https://github.com/hashicorp/go-getter/pull/37

Looks there is a PR for that, haven't got in.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub < https://github.com/hashicorp/go-getter/issues/60#issuecomment-466312124 , or mute the thread <

https://github.com/notifications/unsubscribe-auth/ABOv-gWqDIYxPH1oRBDUe-Y2w1LQsldhks5vP6cggaJpZM4NOOn7

.

--

James Mills / prologic

E: prologic@shortcircuit.net.au W: prologic.shortcircuit.net.au

--

James Mills / prologic

E: prologic@shortcircuit.net.au W: prologic.shortcircuit.net.au

— You are receiving this because you commented. Reply to this email directly, view it on GitHub <https://github.com/hashicorp/go-getter/issues/60#issuecomment-466317704 , or mute the thread < https://github.com/notifications/unsubscribe-auth/AQxeL-zXh6kM0d6Mnflsm2Zt1P3bG1dRks5vP6wKgaJpZM4NOOn7

.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/hashicorp/go-getter/issues/60#issuecomment-466318609, or mute the thread https://github.com/notifications/unsubscribe-auth/ABOv-uzj-0pg4CKqmVkAv06jzZtyPhKMks5vP6zMgaJpZM4NOOn7 .

--

James Mills / prologic

E: prologic@shortcircuit.net.au W: prologic.shortcircuit.net.au

prologic commented 5 years ago

Before I do; I'd love it if someone from Hashicorp commented on this with some more recent context/update.

azr commented 5 years ago

Hey @prologic ! I would love to review such a PR, I don't have much context to add here and also nothing to add on top of the original code review. We just have to make sure it works on all OSes and unit (ci) tests are green on all platforms 🙂.

prologic commented 5 years ago

👍

On Mon, 25 Feb 2019 at 18:08, Adrien Delorme notifications@github.com wrote:

Hey @prologic https://github.com/prologic ! I would love to review such a PR, I don't have much context to add here and also nothing to add on to of the original code review. We just have to make sure it works on all OSes and unit tests are ran on all platforms 🙂.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/hashicorp/go-getter/issues/60#issuecomment-466909981, or mute the thread https://github.com/notifications/unsubscribe-auth/ABOv-sXz2rAF27FDTYB9atLkvRpTnwv5ks5vQ5nogaJpZM4NOOn7 .

--

James Mills / prologic

E: prologic@shortcircuit.net.au W: prologic.shortcircuit.net.au

schmichael commented 5 years ago

Reopening due to #174 reverting #171. go-getter should implement the same path traversal protections as GNU tar for symlinks. See #175 for details.

prologic commented 5 years ago

I needed to support extracting symlinks in Tar archives myself internally at our company with the package manager we're writing to support Nomad. I thought I may as well (at the same time) rethink/resolve what was attempted here and came across the securejoin library. It appears ti on inspection of the code and testing solve exactly what we need to address symlink security issues that were present in #171

I'd love it if someone could also confirm the validity of using this library which itself is slated to go into the Go standard library. (not point reinventing the wheel on this one!) If happy I can resubmit a slightly improved version of #171 with added tests for verifications that we don't escape the chroot directory being extracted to in any way.

schmichael commented 5 years ago

While it's difficult to determine whether or not it's safe to trust a 3rd party library, the list of projects using SecureJoin gives me quite a bit of confidence (helm, opencontainers, jessfraz, etc). The reasons given in #171 to reject it from the stdlib don't seem to apply to go-getter's use case. If I can attempt to summarize them:

tl;dr - Yes! This should work! I think as long as you include some symlink root path escape tests we can accept tar symlink implementations using this library.

prologic commented 5 years ago

I actually tried to look for the C source to GNU Tar (I found the source) but I wasn't able to find the code that protects against chroot escapeing :/ (I only looked in obvious `.c` files) -- For me re-implementing this logic is hard and tricky* to get right; so using this library from someone that clearly knows what they're doing here sounds a lot better than re-inventing the wheel :)

prologic commented 5 years ago

@schmichael PR #129 is a new implementation as we discussed in comments above and ensures we don't escape the path we're extracting in to.