Open neild opened 1 year ago
Bikeshed: instead of NewReaderWithOptions
, call the function Open
. This corresponds to the function os.Open
which opens a file for read-only access.
What happened with the errors being introduced in Go 1.20? I thought the plan was to turn them on by default in some newer release?
This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group
We stated that we might turn on name security checks, but the pervasiveness of Docker tar files with absolute paths makes me dubious that we will ever be able to do so without causing widespread breakage.
Adding a new safe-by-default function could give us a path to making that change: First we add the safe function under a different name, then we deprecate the unsafe functions, and when a sufficient portion of the ecosystem has moved off of the unsafe functions we might be able to change them to be safe by default and even undeprecate them.
Open
is an interesting choice of name. Unfortunately, archive/zip
has an OpenReader
function which opens a file rather than an io.ReaderAt
; an Open
function would be confusing there. I overlooked OpenReader
when writing this proposal; we should also have a safe variant of it as well, such as OpenReaderWithOptions
.
This still feels like it's getting very complicated. One way it can be hard to write secure code is when there are too many ways to do it, and it feels like we're headed down that road. Can we simplify this?
How many different programs or libraries read "Docker tar files"? If those update to recognize the new errors, that problem is fixed, right?
I would expect NewReader
and NewReaderWithOptions(ReaderOptions{})
to be the same, but that's not the case since the former is insecure, while the latter is secure.
Adding a new safe-by-default function...
Safe-by-default, but also one that's just much harder to call than NewReader
. Even if we rename NewReaderWithOptions
to just Open
, it's still harder to call without #12854. That's quite unfortunate. I'm personally in favor of changing NewReader
and directing callers depending on absolute paths to ignore ErrInsecurePath
.
Also, I personally find Open
a confusing name for this since it doesn't automatically open a file from the FS.
This still feels like it's getting very complicated. One way it can be hard to write secure code is when there are too many ways to do it, and it feels like we're headed down that road. Can we simplify this?
How many different programs or libraries read "Docker tar files"? If those update to recognize the new errors, that problem is fixed, right?
We could simplify this by adding a safe-by-default API with no options, or with fewer options. For example:
package tar
func SafeReader(r io.Reader) *Reader
The alternative is to change the existing readers as proposed in #55356. I don't know how many programs would need to be updated to handle this change. Docker is pretty popular, though. I'll try running Google-internal tests (which include a fair bit of third-party package coverage) with GODEBUG=tarinsecurepath=0,zipinsecurepath=0 and report back on how many things break. (We did this in the runup to 1.20, but I don't have the failure list to hand any more.)
hi hi
We could add NewSafeReader, but I don't think we've exhausted trying #55356 (default on) yet. We should probably do that first or else understand why we can't ever do that. The Google test failures looked like they had a small number of root causes. Docker may be popular but I don't think direct reading of Docker tar files is nearly as popular.
This is a noble goal but it feels as though the proposal is trying to fix the issue in the wrong place. A path in an archive isn't inherently secure or insecure, so why is the reader concerning itself with making that decision and returning an error? It feels a little like trying to prevent SQL injection by making http.Request.PostFormValue() aware of 'dangerous' SQL characters.
Would it make more sense to add API to handle extracting an archive to a filesystem directly? That would remove a lot of the "boilerplate" you need to write at present, and provide a sensible place for the "secure or not" determination to live where it can make OS-specific decisions.
subtopic#1: Wrt docker tarballs, another option could be sanitizing the filename in the header (turning absolute into relative), when on GODEBUG env var or a ReaderOptions option indicate it. When entries are extracted to disk, integrations are supposed to call path.Join anyway, so this should be transparent. (Btw, docker calls chroot before extracting anything)
subtopic#2: Another attack vector currently not covered is via symlinks. Consider a crafted tar archive with the following two entries:
$ tar tvf crafted.tar
lrwxrwxrwx imrer/primarygroup 0 2023-02-23 08:17 something -> /
-rw-r----- imrer/primarygroup 10 2023-02-23 08:18 something/root/.bashrc
I think legitimate archives with symlink entries pointing out of the archive (but without a path traversal follow up entry) are more frequent out there than archives with absolute path entries. Doing this securely would require one of these (all of them have some drawbacks):
subtopic#3: I tried finding info about when support for variadic functions was added into golang, but didn't find anything authoritative. If it was supported from the very beginnings, the existing entry points could be expanded with something like this:
type ArchiveOptions int
const (
RejectRelativePathComponents ArchiveOptions = 0
AllowRelativePathComponents ArchiveOptions = 1
...
)
type Reader struct{}
func NewReader(r io.ReaderAt, size int64, options ...ArchiveOptions) (*Reader, error) {
if len(options) == 0 {
// adjust the default (e.g. activate the Reject options by default eventually)
}
...
}
This would keep the number of entrypoints the same and may make things a bit less complicated (while still offering full control).
This is a noble goal but it feels as though the proposal is trying to fix the issue in the wrong place. A path in an archive isn't inherently secure or insecure, so why is the reader concerning itself with making that decision and returning an error?
As a practical concern, validating that archive paths don't escape the current directory would have avoided at least five distinct CVEs that I know of, and probably others that I don't. (See #55356.) Addressing the issue in the archive reader will reduce the chances of vulnerabilities in all code that reads archives; adding a new API to extract an archive to a filesystem risks satisfying only a subset of possible use cases and leaving other users vulnerable.
func NewReader(r io.ReaderAt, size int64, options ...ArchiveOptions) (*Reader, error) {
This is unfortunately not a backwards compatible change, since existing code may rely on the current function signature. For example, code might assign NewReader to a variable.
The Google test failures looked like they had a small number of root causes.
I'll report back when I manage to get a test run to go through. (The number of failures caused timeouts in the failure-deflaking tool. Might still be a small number of root causes, though.)
Waiting for more information about whether we can enable #55356 by default.
Putting on hold until we have that more information. Please move back to active when we do.
Placed on hold. — rsc for the proposal review group
This is another attempt at hardening the
archive/tar
andarchive/zip
packages against file traversal attacks, fixing #25849.Naive handling of archives containing filenames such as
../../../etc/passwd
leads to security vulnerabilities. (See #55356 for a bunch of reference CVEs.) In #55356, I proposed changingarchive/tar
andarchive/zip
to return anErrInsecurePath
error when encountering an unsafe filename. Unfortunately, we subsequently discovered that Docker images frequently include filenames with absolute paths, making this change infeasible.I propose adding new safe-by-default functions to
archive/tar
andarchive/zip
.The
NewReaderWithOptions
functions will provide the same functionality as proposed in #55356, but in a new API to avoid breaking existing users. In addition, these functions provide an easy way for users who want to accept some forms of unsafe path to do so while still defending against unexpected cases: For example, permitting absolute filenames while rejecting ones containing unexpected backslash characters.I further propose that we update the
NewReader
documentation to encourage all users to migrate toNewReaderWithOptions
. Once all supported Go versions includeNewReaderWithOptions
, we can go further and deprecateNewReader
. We might also consider changingNewReader
to be safe by default at some point, although that decision is out of scope for this proposal.