Open neild opened 6 months ago
is this essentially https://pkg.go.dev/github.com/google/safeopen with Beneath -> In ?
that also has a ReadFile / WriteFile variant which I'd use more then the create version.
The design of this proposal is influenced by github.com/google/safeopen, but differs in a few areas. (Sorry, I really should have mentioned safeopen as prior art.)
Of the three parts of this proposal:
os.OpenIn
is essentially safeopen.OpenBeneath
.File.Open
is a slightly more limited but safer version of openat
, and has no equivalent in safeopen
.O_NOFOLLOW_ANY
has no equivalent in safeopen
.ReadFileIn
and WriteFileIn
seem like a useful and logical extension of this proposal.
Yes, please. When I was working on safe file operations and it turned out to be hard to do correctly without OS support.
Without O_NOFOLLOW
, you have to slowly check every segment for symlinks before traversing into it. For the naive implementation, how do you protect against TOCTOU bugs? At the moment that you check some path segment and verify that it's not a symlink (or a safe one) and then proceed to descend into it, some other process (or goroutine) could have asynchronously changed the target.
What, if any, changes would be made to "io/fs"? Ideally, there is a mirror of these APIs in that package.
If we wanted to extend this proposal to io.FS
, I believe the one addition would be:
package fs
// An OpenFile is a directory file whose entries may be opened with the Open method.
type OpenFile interface {
File
// Open opens the named file in the directory.
//
// When Open returns an error, it should be of type *PathError
// with the Op field set to "openat", the Path field set to name,
// and the Err field describing the problem.
//
// Open should reject attempts to open names that do not
// satisfy ValidPath(name), returning a *PathError with Err set to
// ErrInvalid or ErrNotExist.
Open(name string) (File, error)
}
A more interesting question is os.DirFS
. Currently, DirFS
has two documented limitations: It follows symlinks out of the directory tree, and if the FS root is a relative path then it will be affected by later Chdir
calls.
I don't think we can change DirFS
's symlink-following behavior: It's documented, and it's a behavior that a user could reasonably depend on.
The interaction between DirFS
and Chdir
seems less likely to be something a user would depend on, but it is documented. I'm not sure if we can change it at this point, but perhaps.
Perhaps we should add a version of DirFS
that opens the directory root at creation time (retaining a handle to it even if the current working directory changes or the root is renamed), and refuses to follow symlinks out of the root. I'm not sure if that should be part of this proposal or a separate one.
Perhaps the new names should include an At suffix to make clear to casual readers of the API that these are not the usual open system calls. Either way, the three new methods should probably reference some shared section of documentation on the concept of the at
-suffixed operations.
I presume you mean the new method names? The functions have an "In" suffix.
We could also include the In suffix on the methods; I waffled on whether it belongs there or not:
func (f *File) OpenFileIn(name string, flag int, perm FileMode) (*File, error)
func (f *File) CreateIn(name string) (*File, error)
func (f *File) OpenIn(name string) (*File, error)
I'm trying to avoid the suffix "At" to make it clear that none of these calls are precisely openat
. openat(2)
permits escaping from the root directory via absolute or relative paths, and doesn't do anything about symlink traversal. (Linux has openat2(2)
, which is quite configurable. The proposed *In functions are essentially openat2
with the RESOLVE_BENEATH
flag.)
Fair enough. Should the fs.OpenFile.Open
method also be named OpenIn
?
Should the fs.OpenFile.Open method also be named OpenIn?
Probably, for consistency.
Are we missing RemoveIn?
We should probably have RemoveIn
as well:
// RemoveIn removes the named file or (empty) directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Remove].
func RemoveIn(parent, name string) error
// Remove removes the named file or (empty) directory
// in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) RemoveIn(name string) error
Perhaps also RemoveAllIn
?
This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. ā rsc for the proposal review group
Maybe it would be better if the parent was an fs.FS? Seems more widely applicable, if somewhat more complex.
func RemoveIn(parent, name string) error
Note that Windows does not provide (AFAIK) an unlinkat
counterpart. It will have to be emulated doing something like:
func RemoveIn(parent, name string) error
f, err := os.OpenIn(parent, name)
if err != nil {
return err
}
return syscall.SetFileInformationByHandle(f.Fd(), syscall.FileDispositionInfo, ...)
}
As a user, when should I use os.Open vs os.OpenIn? Should I continue to default to os.Open, and only use OpenIn when I am actively avoiding a security issue, or should my default be OpenIn now?
Maybe it would be better if the parent was an fs.FS?
The os
package file functions operate on the local filesystem. fs.FS
is an abstraction over a filesystem; it sits atop the os
package functions, not under them.
If we want to add support for OpenIn
on fs.FS
filesystems, we would want something like https://github.com/golang/go/issues/67002#issuecomment-2075223022. We could add that to this proposal if we want, but for now I'm keeping this proposal focused on the os
package.
Note that Windows does not provide (AFAIK) an unlinkat counterpart.
I think that's fine. This proposal requires varying degrees of implementation depending on platform already. (Linux has the very nice openat2 with RESOLVE_BENEATH, platforms without an equivalent are going to require us to do more work to produce equivalent behavior.)
If it's not possible to emulate unlinkat on Windows, that might be a problem, but it sounds like it should be possible.
As a user, when should I use os.Open vs os.OpenIn?
You should use OpenIn
when you want to open a file within a directory.
I don't know how to give comprehensive guidance on when to use one vs. the other; the two functions behave differently and you should use the one that suits your specific purposes. If you're writing a command-line tool that accepts an input filename from the user, you probably want to use os.Open
. If you're writing a tool that decompresses an archive, you probably want to use os.OpenIn
to ensure that the output doesn't escape from the destination directory.
The FS OpenFile looks good, yes. Somehow I skipped that comment, sorry.
A property of openat is that it follows a file across renames: If you open a directory, rename the directory, and use openat on the still-open FD, access is relative to the directory's new location. We cannot support this behavior on platforms which don't have openat or an equivalent (plan9 and js). We could fall back to operating purely on filenames, such that f.OpenIn(x) is equivalent to os.OpenIn(f.Name(), x). However, this seems potentially hazardous. I propose, therefore, that File.CreateIn, File.OpenIn, and File.OpenFileIn return an errors.ErrUnsupported error on these platforms.
I don't really understand this. What is a program supposed to do if File.Open
returns ErrUnsupported? Either it can give up and report an error to the user, meaning that the program simply doesn't work on plan9 or js, Or it can implement the fallback manually,
f, err := parent.Open(filename)
if err == os.ErrUnsupported {
f, err = os.OpenIn(parent.Name(), filename)
//or even: os.Open(path.Join(parent.Name(), filename))
}
which is exactly the "hazardous" behaviour you say you're trying to avoid. If the underlying platform truly has no equivalent to openat
, though, then there's no other reasonable fallback. Returning ErrUnsupported is just creating more work for developers for no tangible benefit.
I think this could be addressed perfectly well in the docs by saying that some platforms (linux, windows, etc) provide extra guarantees around renamed files, and that others (plan9 and js) do not.
If the underlying platform truly has no equivalent to openat, though, then there's no other reasonable fallback.
The question is whether this is a reasonable fallback or not.
In the case of os.OpenIn
, I think it's reasonable to fall back to a less-secure implementation. Lacking openat
, we can statically validate the untrusted filename component for unintended traversal (os.OpenIn(dir, "../escapes")
), and we can test for symlinks on the path, but we remain vulnerable to TOCTOU attacks. TOCTOU symlink attacks, where an attacker creates a symlink on the path while we're in the process of validating it, are an edge case and I think it's okay for us to support os.OpenIn
on platforms where we can't defend against them (plan9 and js).
In the case of os.File.OpenIn
, however, there are valid operations that we simply can't support without openat
or the equivalent. With openat
, you can open a directory, rename or even delete it, and then continue to access files in that directory. There's no way to simulate this with operations on the directory's filename.
Perhaps it's okay to say that os.File.OpenIn
behaves differently on plan9 and js, and that users who need the ability to follow a directory across renames/deletes are responsible for not trying to do so on those platforms. Returning an error is the more conservative choice.
I note also that if you don't need the openat
behavior of following a directory across renames, you don't need to use os.File.OpenIn
at all--you can just always use os.OpenIn
.
I don't know how to give comprehensive guidance on when to use one vs. the other; the two functions behave differently and you should use the one that suits your specific purposes. If you're writing a command-line tool that accepts an input filename from the user, you probably want to use os.Open. If you're writing a tool that decompresses an archive, you probably want to use os.OpenIn to ensure that the output doesn't escape from the destination directory.
I would recommend adding guidance in the documentation of the not *In
variants calling out that the *In
variants exist and recommended for cases when directory escape is not desirable.
Updated proposal, with comments on various changes arising from above discussion and working on implementation.
The OpenFileIn
, CreateIn
, and OpenIn
functions are unchanged from the original proposal:
package os
// OpenFileIn opens the named file in the named directory.
//
// If the file contains relative path components (..), no component may
// refer to a location outside the parent directory. The file may not be
// "", an absolute path, or (on Windows) a reserved device name such as "NUL".
// The file may refer to the directory itself (.).
//
// If any component of the named file references a symbolic link
// referencing a location out of the parent directory,
// OpenFileIn returns an error.
//
// OpenFileIn otherwise behaves like OpenFile.
func OpenFileIn(parent, name string, flag int, perm FileMode) (*File, error)
// CreateIn creates or truncates the named file in the named parent directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Create].
func CreateIn(parent, name string) (*File, error)
// Open opens the named file in the named parent directory for reading.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Open].
func OpenIn(parent, name string) (*File, error)
The File
methods now all have an In
suffix: File.OpenFileIn
, File.CreateIn
, File.OpenIn
. This is clearer overall: For example, f.CreateIn
creates a file in the directory f
, it doesn't create f
. This also resolves an ambiguity between File.Stat
and File.StatIn
(see below).
package os
// OpenFileIn opens the named file in the directory associated with the file f.
//
// If the file contains relative path components (..), no component may
// refer to a location outside the parent directory. The file may not be
// "", an absolute path, or (on Windows) a reserved device name such as "NUL".
//
// If any component of the named file references a symbolic link
// referencing a location out of the parent directory,
// OpenFileIn returns an error.
func (f *File) OpenFileIn(name string, flag int, perm FileMode) (*File, error)
// CreateIn creates or truncates the named file in
// the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) CreateIn(name string) (*File, error)
// OpenIn opens the named file in the directory associated with the file f for reading.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) OpenIn(name string) (*File, error)
To the above, we add MkdirIn
, RemoveIn
, and StatIn
functions and methods. Creating directories and removing files are fundamental operations, and there's no reason to leave them out. DirFSIn
(see below) provides a traversal-resistant Stat
, so StatIn
is included here as well.
Open question: Should we add LstatIn
as well? How about SymlinkIn
? RenameIn
? ReadFileIn
and WriteFileIn
? On one hand, I don't want to let this proposal get out of hand with an endless array of new functions; on the other hand, some of these do seem useful. I'd appreciate proposal committee's thoughts on where we should draw the line with this proposal.
package os
// MkdirIn creates a new directory in the named parent directory
// with the specified name and permission bits (before umask).
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Mkdir].
func MkdirIn(parent, name string, perm FileMode) error
// MkdirIn creates a new directory in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) MkdirIn(name string, perm FileMode) error
// RemoveIn removes the named file or (empty) directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Remove].
func RemoveIn(parent, name string) error
// RemoveIn removes the named file or (empty) directory
// in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) RemoveIn(name string) error
// StatIn returns a FileInfo describing the named file in the named parent directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Stat].
func StatIn(parent, name string) (FileInfo, error)
// StatIn returns a FileInfo describing the named file in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) StatIn(name string) (FileInfo, error)
We add os.DirFSIn
, a traversal-safe version of os.DirFS
.
Open question: DirFSIn
or DirInFS
? I prefer DirFSIn
--"a directory filesystem in (root)", but internal discussion suggested DirInFS
might be better.
For the moment, we do not add any new optional interfaces to io/fs
, such as fs.OpenFile
(see https://github.com/golang/go/issues/67002#issuecomment-2075223022).
There are many existing APIs, both in and out of the standard library, that operate on an io/fs.FS
. Providing a traversal-resistant FS
implementation is a simpler and more effective approach to hardening programs than requiring every API which operates on an FS
to check for and use an OpenFile
method.
Open question: It seems likely to me that we're going to want more variations on DirFS
in the future. For example, it seems reasonable to want an FS
that disallows symlink traversal entirely (essentially passing O_NOFOLLOW_ANY
to every file open). Therefore, I think DirFSIn
should either accept an options struct to allow for future customization, or should return a concrete type with customization methods. ( For example, fs := os.DirFSIn("root"); fs.SetFollowSymlinks(false)
). The following returns a concrete type.
package os
// DirFSIn returns a filesystem for the tree of files rooted at the directory dir.
// The directory dir must not be "".
//
// Open calls will resolve symbolic links, but return an error if any link points outside the directory dir.
//
// The returned filesystem implements [io/fs.FS], [io/fs.StatFS], [io/fs.ReadFileFS], and [io/fs.ReadDirFS].
func DirFSIn(dir string) *FS
type FS struct{}
func (fs *FS) Open(name string) (File, error)
func (fs *FS) Stat(name string) (FileInfo, error)
func (fs *FS) ReadFile(name string) ([]byte, error)
func (fs *FS) ReadDir(name string) ([]fs.DirEntry, error)
The O_NOFOLLOW_ANY
open flag remains unchanged.
Open question: Should we add os.O_NOFOLLOW
? I only realized while implementing this proposal that it doesn't exist already. (Existing code which uses the flag uses syscall.O_NOFOLLOW
.) On one hand, if we're supporting a portable O_NOFOLLOW_ANY
, perhaps we should support a portable O_NOFOLLOW
as well. On the other hand, O_NOFOLLOW
can be dangerously surprising, since it only prevents symlink resolution in the final filename component, so perhaps we should stick to the more robust form.
const (
// O_NOFOLLOW_ANY, when included in the flags passed to [OpenFile], [OpenFileIn],
// or [File.OpenFile], disallows resolution of symbolic links anywhere in the
// named file.
//
// O_NOFOLLOW_ANY affects the handling of symbolic links in all components
// of the filename. (In contrast, the O_NOFOLLOW flag supported by many
// platforms only affects resolution of the last path component.)
//
// O_NOFOLLOW_ANY does not disallow symbolic links in the parent directory name
// parameter of [OpenFileIn].
//
// O_NOFOLLOW_ANY does not affect traversal of hard links, Windows junctions,
// or Plan 9 bind mounts.
//
// On platforms which support symbolic links but do not provide a way to
// disable symbolic link traversal (GOOS=js), open functions return an error
// if O_NOFOLLOW_ANY is provided.
O_NOFOLLOW_ANY int = (some value)
)
Open question: How should we handle ..
relative path components in filenames?
Consider the following directory tree:
a/b
is a directory.s
is a symlink to a/b
.f
, a/f
, and a/b/f
are files.On the Unix command line, if we cat s/../f
, we print the contents of the file a/f
.
If we open the current directory and openat(curfd, "s/../f")
, we also open a/f
.
The safeopen
package cleans filenames prior to opening a file, so safeopen.OpenBeneath(".", "s/../f")
opens the file f
. The safeopen
package also forbids symlink traversal entirely, so safeopen.OpenBeneath(".", "s/f")
returns an error rather than opening a/b/f
.
On Windows, things are confusing (and I'm still trying to understand what's going on under the hood): Using NtCreateFile
to open a file in "." (the rough equivalent of Unix's openat
):
s/../f
opens a/f
.a/b/../f
is an error.It appears that NtCreateFile
will resolve ..
path components only if a symlink appears somewhere in the path. This is weird enough that I feel like I must be be missing something.
The question is: What should os.OpenIn(".", "s/../f")
do in this case? Options I see include:
s
and the relative path component ..
, and open a/f
. This matches Unix openat
behavior...
components, and open f
. This matches the safeopen
package's behavior. I don't like this option, as it defines a new set of nonstandard filesystem semantics (pathnames are lexically resolved prior to opening).s
.My current inclination is the first option above: Permit both symlinks and ..
path components, and resolve each step of the path in sequence. (So s/../f
opens a/f
in the above example.) This may be a bit tricky to implement on Windows, but it should be possible.
I can, however, see a good argument for disallowing .
and ..
relative path components. This simplifies the implementation, there are few if any real-world cases where resolving paths like s/../f
is necessary, and users can lexically clean paths with filepath.Clean
if desired.
Open question: How should we handle platforms without openat
or an equivalent, namely GOOS=plan9 and GOOS=js?
GOOS=js does not permit implementing OpenIn
in a fashion free of TOCTOU races (swapping a directory component with a symlink elsewhere on the filesystem). I believe Plan 9 doesn't have symlinks; if that's the case, TOCTOU races are not a concern on it.
GOOS=js and GOOS=plan9 do not permit implementing File.OpenIn
correctly. Opening a directory as f
, renaming or deleting that directory, and then using f.OpenIn
should act on the original directory. Without openat
or an equivalent, we have no way to follow the directory handle and the best we can do is act on the original directory path.
I've argued above for supporting OpenIn
on these platforms and not supporting File.OpenIn
. I think that I've been convinced by arguments above that it's better to support as much of the API as possible, even if platform limitations prevent supporting all of it. I therefore propose that on js and plan9, f.OpenIn("path")
behaves equivalently to os.OpenIn(f.Name(), "path")
.
This is rather extensive API. Perhaps a separate package from os would be better? Maybe os/in?
On a very minor note, Plan 9 can be considered to implement O_NOFOLLOW_ANY because there are no symlinks on Plan 9 at all.
More generally, I understand the motivation here, but the amount of new API is a bit daunting. I think we need to keep thinking about reducing the total amount of API. It seems like there needs to be some type representing the constrained file system. For this message, let's call it a Dir. It would be defined like:
// A Dir represents a root directory in the file system.
// Methods on a Dir can only access files and directories inside that root directory.
// Methods on Dir are safe to be used from multiple goroutines simultaneously.
// After Close is called, methods on Dir return errors.
type Dir struct {
...
}
func OpenDir(name string) (*Dir, error)
func (*Dir) FS() fs.FS
func (*Dir) OpenFile
func (*Dir) Create
func (*Dir) Open
func (*Dir) OpenDir
func (*Dir) Mkdir
func (*Dir) Remove
func (*Dir) MkdirAll
func (*Dir) RemoveAll
func (*Dir) Close
All the top-level convenience things like os.OpenIn can be left out. Code can use OpenDir followed by the operation it wants.
That at least feels like a more manageable amount of API.
I have been thinking for a while and have not come up with a name like more than Dir. It's certainly not perfect, and OpenDir would need a doc comment explaining that it's not opendir(3), but it's not bad.
@rsc With this approach it would be also nice to define some globals, like CWD
.
Zig also has a simmilar abstraction in the std, also called Dir
https://ziglang.org/documentation/master/std/#std.fs.Dir.
Also
// Methods on Dir are safe to be used from multiple goroutines simultaneously.
Is this true for Close
?
@mateusz834 Sure, Close can be called from multiple goroutines simultaneously. Same thing is true of os.File.
Re CWD, I disagree with having a global whose meaning changes each time the current directory changes. What is supposed to happen if I do this?
dir := os.CWD
os.Chdir("/tmp")
dir.Open("foo")
Neither answer is good.
If code wants the current current directory [sic], then it can use os.OpenDir("."). Then at least it is clearer what happens with Chdir.
I vote for the name Root
rather than Dir
.
For the record, other possible methods:
Chmod
Chown
Chtimes
Lchown
Lstat
Readlink
Rename
Symlink
Truncate
Those are the top-level os package functions that take file names that can't otherwise be handled with complete safety. (There is also os.Stat
and os.Readfile
but those can be done with File
methods.)
@rsc I think that there is a special AT_FDCWD
fd for that.
man openat(2):
If pathname is relative and dirfd is the special value AT_FDCWD, then pathname is interpreted relative to the current working directory of the calling process (like open(2)).
If pathname is absolute, then dirfd is ignored.
I'm fine with dropping the top-level functions, although I note that "open this file within this directory, where the directory name is trusted and the file name is not" is a very common operation. Perhaps we could limit the top-level functions to CreateIn
and OpenIn
, for the most common operations?
What's the motivation for a new Dir
type as opposed to adding additional methods to File
? The number of methods is the same either way, and we've already got methods on File
which only apply to directories (ReadDir
, Readdir
, Readdirnames
). Plus io/fs.ReadDirFile
defines a directory as a specialization of a file.
If we did want a new Dir
(or Root
) type, would we want a way to convert a File
into a Dir
? (f.Dir()
, maybe?) And would we want a Dir.ReadDir
method for listing the contents of a directory?
The motivation is to put all the "safety" stuff on one type instead of intermixing it with "non-safety" stuff. The names end up cleaner and the code using the APIs should be easier to vet. I don't think we will need a way to turn a File into a Dir; start with the Dir and d.Open(".") if you really need a File. They should be separate concepts.
Put another way, although a Dir is an fd underneath and a File is an fd underneath, that doesn't mean the two are conceptually the same. The Dir type is a safety abstraction, and the fact that it is very like a File in its representation is an implementation detail.
I am sure that other helpers will be wanted on Dir too, including ReadDir, ReadFile, WriteFile. All the more reason for keeping Dir separate from File.
@ianlancetaylor I thought about Root as well but Root has the problem that you'd expect os.Root to be /. Or maybe uid 0. It's already overloaded. And RootedDir seems too long. It's interesting that Zig ended up on a type of the same name, although clicking around on https://ziglang.org/documentation/master/std/#std.fs.Dir I couldn't find anything like a doc comment explaining what concept Dir represents, if any.
How about os.Rooted
?
"Rooted" has another common meaning too. š
I'm pretty convinced about the API choice of separating all these routines into methods on one type, but I think we need to keep thinking about the name of that type.
I can't think of a better name than Dir
. Everything else is too overloaded. (FS
is nice, but then that conflicts with io/fs.FS
.)
I'm not entirely convinced that it's better to have two types rather than adding more methods to File
, given that we already have extensive precedent for treating a File
as a directory handle. I think there's going to be user confusion about when you want a File
and when you want a Dir
.
But perhaps it's fine.
openat(2)
allows the absolute path to "escape" the Dir. Do we want to always use the RESOLVE_BENEATH
, or maybe this should be configurable? This way the os.Dir
might handle more use-cases (simple way to change the relative path location, but not restricting the use of absolute paths).
RE CWD Zig has a CWD https://ziglang.org/documentation/master/std/#std.fs.cwd, in Zig you have to have a Dir to open a file. openat with AT_FDCWD works the same way as open does, so:
dir := os.CWD
dir.Open("foo")
os.Chdir("/tmp")
dir.Open("foo")
has the same behavior as:
os.Open("foo")
os.Chdir("/tmp")
os.Open("foo")
openat(2) allows the absolute path to "escape" the Dir. Do we want to always use the RESOLVE_BENEATH, or maybe this should be configurable? This way the os.Dir might handle more use-cases (simple way to change the relative path location, but not restricting the use of absolute paths).
I don't want to complicate this proposal with too many options. RESOLVE_BENEATH is the right semantics for almost all use cases. We can always add more options in the future if they turn out to be necessary.
I don't want to complicate this proposal with too many options. RESOLVE_BENEATH is the right semantics for almost all use cases. We can always add more options in the future if they turn out to be necessary.
Yes, of course, but this might influence the API, at this point we have two choices:
OpenFile(resolve ResolveOptions, file string)
os.Dir
struct, so you have to create a new Dir
to change the resolve options.If we don't consider that from the start, then we are only left with the second one.
Options should be in the Dir struct. You configure it, and then you use it with whatever methods you want. Want two different kinds of restrictions? Configure two Dir structs.
Otherwise you need all the kinds of options on all kinds of different methods. The whole point of the Dir struct is to hold all that complexity and leave you with the same functions (methods) and signatures as the top-level os funcs.
openat2 was added in kernel 5.6, is is possible to support lower kernel versions? #67001
Linux is the only platform with openat2, so we need an openat-based implementation anyway. Linux versions without openat2 can use it, or we can just use openat everywhere if detecting support for openat2 is difficult.
I don't think we will need a way to turn a File into a Dir; start with the Dir and d.Open(".") if you really need a File.
This only works if you know ahead-of-time if the file you're opening is a directory. I have code that opens a path that may or may not be a directory, and calls File.Stat
to decide whether to Read
or ReadDir
it. If it's a directory, I'd like to be able to convert the File
to a Dir
and use the new, safer, functions to open its children.
(Unless Dir
were allowed to represent non-directories and Open(".")
were special-cased, which I think would be quite unintuitive.)
The Dir type assumes you're rooted at a file descriptor, but the original design called for rooting at a string:
func OpenIn(parent, name string) (*File, error)
Do Windows, Mac, and the BSDs have something comparable to Linux's openat, which uses file descriptors? If not, then perhaps we shouldn't constrain the API design based solely on how it would be implemented on Linux. The idea of rooting based on a string path seemed appealing to me.
Never mind, OpenDir(string).Open(string) is basically the same thing. :)
Do Windows, Mac, and the BSDs have something comparable to Linux's openat, which uses file descriptors?
Every Unix-like platform we support has openat, including macOS. Some of them have extensions which make implementation simpler (but then we still need to support the lowest common denominator, so it's not actually any simpler), but basic openat is enough.
Plan 9 doesn't have openat, but Plan 9 doesn't have symlinks so lexical path analysis suffices.
Windows has NtCreateFile, which can act quite a bit like openat. I'm going to write up a longer description of Windows path behavior at some point, but the quick version is that it's possible to build these APIs (in any of the versions proposed so far) on Windows as well.
GOOS=js does not have openat or any equivalent (as best I can ascertain), and we can therefore only support a degraded version of these APIs on that platform.
As you've noted, OpenIn is essentially a convenience version of OpenDir(parent).Open(file). I think it's okay for us to leave it out to keep the API surface down, but most of the real-world path traversal vulnerabilities I've observed would have been fixed by changing an os.Open(filepath.Join(parent, file)) call to os.OpenIn(parent, file). Making the safe path as simple as possible seems worthwhile to me.
Let's try Root. Then we can have:
func OpenRoot(dir string) (*Root, error)
func (*Root) FS() fs.FS
func (*Root) OpenFile
func (*Root) Create
func (*Root) Open
func (*Root) OpenDir
func (*Root) Mkdir
func (*Root) Remove
func (*Root) MkdirAll
func (*Root) RemoveAll
func (*Root) Close
func (*Root) Chmod
func (*Root) Chown
func (*Root) Chtimes
func (*Root) Lchown
func (*Root) Lstat
func (*Root) Readlink
func (*Root) Rename
func (*Root) Symlink
func (*Root) Truncate
And then there's an obvious name for the convenience function:
func OpenRooted(dir, name string) (*File, error) {
r, err := OpenRoot(dir)
if err != nil { return nil }
return r.Open(name)
}
Does that look right?
We may want to drop Root.Symlink or else we have to say what it rejects (like absolute paths). (See the awful cases in #49580 if we go down that path.)
LGTM. I'm partway through implementing (using the name Dir), and haven't run into any significant concerns yet.
Root.Symlink seems fine; it creates a symlink in the root with the given target. The contents of the target are up to the caller. We can be clear in the documentation that Root.Symlink isn't trying to impose any restrictions on the link target.
I do not have a strong opinion on Dir vs. Root as a name. Dir is less overloaded, Root raises fewer questions about why Files can also be directories.
Please see the updated proposal in https://github.com/golang/go/issues/67002#issuecomment-2368976062
Directory traversal vulnerabilities are a common class of vulnerability, in which an attacker tricks a program into opening a file that it did not intend. These attacks often take the form of providing a relative pathname such as
"../../../etc/passwd"
, which results in access outside an intended location. CVE-2024-3400 is a recent, real-world example of directory traversal leading to an actively exploited remote code execution vulnerability.A related, but less commonly exploited, class of vulnerability involves unintended symlink traversal, in which an attacker creates a symbolic link in the filesystem and manipulates the target into following it.
I propose adding several new functions to the os package to aid in safely opening files with untrusted filename components and defending against symlink traversal.
It is very common for programs to open a file in a known location using an untrusted filename. Programs can avoid directory traversal attacks by first validating the filename with a function like
filepath.IsLocal
. Defending against symlink traversal is harder.I propose adding functions to open a file in a location:
The
OpenFileIn
,OpenIn
, andCreateIn
family of functions safely open a file within a given location, defending against directory traversal, symlinks to unexpected locations, and unexpected access to Windows device files.All modern Unix systems that I know of provide an
openat
call, to open a file relative to an existing directory handle (FD). Windows provides an equivalent (NtCreateFile
withObjectAttributes
including aRootDirectory
). Of the supported Go ports, I believe only js and plan9 do not supportopenat
or an equivalent.I propose adding support for
openat
-like behavior toos.File
:Like the top-level
CreateIn
,OpenIn
, andOpenFileIn
, the methods defend against accessing files outside the given directory. This is unlike the default behavior ofopenat
, which permits absolute paths, relative paths outside the root, and symlink traversal outside the root. (It corresponds to Linux'sopenat2
with theRESOLVE_BENEATH
flag.)A property of
openat
is that it follows a file across renames: If you open a directory, rename the directory, and useopenat
on the still-open FD, access is relative to the directory's new location. We cannot support this behavior on platforms which don't haveopenat
or an equivalent (plan9 and js). We could fall back to operating purely on filenames, such thatf.OpenIn(x)
is equivalent toos.OpenIn(f.Name(), x)
. However, this seems potentially hazardous. I propose, therefore, thatFile.CreateIn
,File.OpenIn
, andFile.OpenFileIn
return anerrors.ErrUnsupported
error on these platforms.The above functions defend against symlink traversal that leads outside of the designated root directory. Some users may wish to defend against symlink traversal entirely. Many modern operating systems provide an easy way to disable symlink following: Linux has
RESOLVE_NO_SYMLINKS
, Darwin hasO_NOFOLLOW_ANY
, and some other platforms have equivalents.I propose adding support for disabling symlink traversal to the
os
package:O_NOFOLLOW_ANY
may be passed toOpenFile
,OpenFIleIn
, orFile.OpenFIle
to disable symlink traversal in any component of the file name. ForOpenFileIn
, symlinks would still be permitted in the directory component.On platforms which do not support the equivalent of
O_NOFOLLOW_ANY
/RESOLVE_NO_SYMLINKS
natively, theos
package will use successiveopenat
calls withO_NOFOLLOW
to emulate it. On platforms with noopenat
(plan9 and js), open operations will return an error whenO_NOFOLLOW_ANY
is specified.