dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.29k stars 4.74k forks source link

Tar: symbolic link targets outside extract dir #74140

Closed tmds closed 2 years ago

tmds commented 2 years ago

.NET doesn't allow to extract symbolic links that have a target outside the extraction folder.

This should not be disallowed.

Instead it should be disallowed to extract in a location that is a link.

An archive like this should extract successfully:

link: SymbolicLink to /tmp

And an archive like this should fail when trying to extract link/file:

link: SymbolicLink to /tmp
link/file: RegularFile

cc @carlossanlop @dotnet/area-system-io

dotnet-issue-labeler[bot] commented 2 years ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

ghost commented 2 years ago

Tagging subscribers to this area: @dotnet/area-system-io See info in area-owners.md if you want to be subscribed.

Issue Details
.NET doesn't allow to extract symbolic links that have a target outside the extraction folder. This should not be disallowed. Instead it should be disallowed to extract in a location that is a link. An archive like this should extract successfully: ``` link: SymbolicLink to /tmp ``` And an archive like this should fail when trying to extract `link/file`: ``` link: SymbolicLink to /tmp link/file: RegularFile ``` cc @carlossanlop @dotnet/area-system-io
Author: tmds
Assignees: -
Labels: `area-System.IO`, `untriaged`
Milestone: -
tmds commented 2 years ago

@carlossanlop I can implement this if you want.

carlossanlop commented 2 years ago

@GrabYourPitchforks is this a scenario we would allow? Extracting a symlink entry that points to an absolute path that is outside of the extract directory, and whose middle segments are not symlinks.

carlossanlop commented 2 years ago

Talked to @GrabYourPitchforks offline: This is a scenario we want to explicitly avoid for security reasons. We don't want the archive extraction to unexpectedly add a symlink where it could cause trouble. This also has a workaround: The user has to iterate the archive, then read the Name and explicitly choose the destination for extraction. We don't want the TarFile helper methods to do it.

ghost commented 2 years ago

This issue has been marked needs-author-action and may be missing some important information.

tmds commented 2 years ago

@GrabYourPitchforks @carlossanlop because tar allows, I think .NET should too. Without this you can't backup and restore a directory which links with absolute paths.

jozkee commented 2 years ago

I agree with @tmds, there may be scenarios where you want to have symlinks pointing outside of the destination. Plus, symlinks are allowed to have anything on their target; they won't be resolved until required.

@GrabYourPitchforks @blowdart, could you please reconsider?

cc @bartonjs

blowdart commented 2 years ago

What you're describing is the zipslip vulnerability (except, well, in tar files). The workaround of iterating through the directly structure is there should you want it, but if we enable it by default lots of windows users won't understand the problem (because symlinks in Windows are just odd), and we'd have path escapes everywhere. So, I'm still thinking no on this.

bartonjs commented 2 years ago

I agree with @blowdart. If you want the insecure thing you have to write it yourself.

jozkee commented 2 years ago

In that case, we can close this as won't fix. cc @am11

bartonjs commented 2 years ago

In longer form:

While creating just the symlink doesn't sound like a threat, consider a system that accepts a tar payload of some sort, extracts it into a place, and runs some sort of validation on the contents. A symlink could now point outside the (semi-)sandbox causing the validator to do one of two unsurprising things:

1) Validate the payload, when it shouldn't have (e.g. by pointing validation_bypass to /var/whatever/current_validation_bypass_key) 2) Cause the validator to be a source of exfiltration.

Exfiltration isn't terribly unlikely, after all, the system "knows" that it's reading user data, so it can send back contents without concern.

If one of "service developers" and "backup/restore software developers" have to be categorically smarter than the other, I'd say it's the backup/restore developers (there are fewer of them, and they're specialized). So if one of them needs to do more work than the other to get the more correct set of behaviors for their need, it should be the backup/restore developers.

tmds commented 2 years ago

If this were a known vulnerability, or a major source of security issues, it would not be the default behavior of tar.

bartonjs commented 2 years ago

That's not really true. The tar CLI tool is designed for being operated by humans who have the capability to use a giant inference engine to decide something isn't copacetic using dynamic sources of information (applying context like "where did this file come from?"). The TarFile API is designed for being operated by a computer, which only takes into context the things we thought of ahead of time; and we're designing it to assume that the input is untrusted.

A human might use tar to extract a thing, run ls, see that manifest is colorized differently than other things, and move into an exception flow. The API doesn't get that possibility of running ls to see that the manifest is "wrong" unless we build that in ahead of time.

A CLI tool and an API don't have the same basic set of assumptions.

Could we add a knob here? Perhaps. Can it be a default behavior? No, unless we can change File.Open to not read through symlinks (which would be disastrous).

tmds commented 2 years ago

ok, I see your point: once the files are extracted, the user may be unaware he's using links because they get de-referenced automatically.

I agree it makes sense to disallow links to point outside the extract dir by default.

tmds commented 2 years ago

Also, I think a knob is desired. I think it's good the default behavior assumes the tar file may be evil, but if you have a trusted tar file, like a backup, you don't want to have to re-write the extraction logic.

jozkee commented 2 years ago

@tmds would you like to submit the API proposal issue?