golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.01k stars 17.67k forks source link

os: safer file open functions #67002

Open neild opened 6 months ago

neild commented 6 months ago

Please see the updated proposal in https://github.com/golang/go/issues/67002#issuecomment-2368976062


Directory traversal vulnerabilities are a common class of vulnerability, in which an attacker tricks a program into opening a file that it did not intend. These attacks often take the form of providing a relative pathname such as "../../../etc/passwd", which results in access outside an intended location. CVE-2024-3400 is a recent, real-world example of directory traversal leading to an actively exploited remote code execution vulnerability.

A related, but less commonly exploited, class of vulnerability involves unintended symlink traversal, in which an attacker creates a symbolic link in the filesystem and manipulates the target into following it.

I propose adding several new functions to the os package to aid in safely opening files with untrusted filename components and defending against symlink traversal.


It is very common for programs to open a file in a known location using an untrusted filename. Programs can avoid directory traversal attacks by first validating the filename with a function like filepath.IsLocal. Defending against symlink traversal is harder.

I propose adding functions to open a file in a location:

package os

// OpenFileIn opens the named file in the named directory.
//
// If the file contains relative path components (..), no component may
// refer to a location outside the parent directory. The file may not be
// "", an absolute path, or (on Windows) a reserved device name such as "NUL".
// The file may refer to the directory itself (.).
//
// If any component of the named file references a symbolic link
// referencing a location out of the parent directory,
// OpenFileIn returns an error.
//
// OpenFileIn otherwise behaves like OpenFile.
func OpenFileIn(parent, name string, flag int, perm FileMode) (*File, error)

// CreateIn creates or truncates the named file in the named parent directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Create].
func CreateIn(parent, name string) (*File, error)

// Open opens the named file in the named parent directory for reading.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Open].
func OpenIn(parent, name string) (*File, error)

The OpenFileIn, OpenIn, and CreateIn family of functions safely open a file within a given location, defending against directory traversal, symlinks to unexpected locations, and unexpected access to Windows device files.


All modern Unix systems that I know of provide an openat call, to open a file relative to an existing directory handle (FD). Windows provides an equivalent (NtCreateFile with ObjectAttributes including a RootDirectory). Of the supported Go ports, I believe only js and plan9 do not support openat or an equivalent.

I propose adding support for openat-like behavior to os.File:

package os

// OpenFile opens the named file in the directory associated with the file f.
//
// If the file contains relative path components (..), no component may
// refer to a location outside the parent directory. The file may not be
// "", an absolute path, or (on Windows) a reserved device name such as "NUL".
//
// If any component of the named file references a symbolic link
// referencing a location out of the parent directory,
// OpenFile returns an error.
func (f *File) OpenFile(name string, flag int, perm FileMode) (*File, error)

// Create creates or truncates the named file in
// the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) Create(name string) (*File, error)

// Open opens the named file in the directory associated with the file f for reading.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) OpenIn(name string) (*File, error)

Like the top-level CreateIn, OpenIn, and OpenFileIn, the methods defend against accessing files outside the given directory. This is unlike the default behavior of openat, which permits absolute paths, relative paths outside the root, and symlink traversal outside the root. (It corresponds to Linux's openat2 with the RESOLVE_BENEATH flag.)

A property of openat is that it follows a file across renames: If you open a directory, rename the directory, and use openat on the still-open FD, access is relative to the directory's new location. We cannot support this behavior on platforms which don't have openat or an equivalent (plan9 and js). We could fall back to operating purely on filenames, such that f.OpenIn(x) is equivalent to os.OpenIn(f.Name(), x). However, this seems potentially hazardous. I propose, therefore, that File.CreateIn, File.OpenIn, and File.OpenFileIn return an errors.ErrUnsupported error on these platforms.


The above functions defend against symlink traversal that leads outside of the designated root directory. Some users may wish to defend against symlink traversal entirely. Many modern operating systems provide an easy way to disable symlink following: Linux has RESOLVE_NO_SYMLINKS, Darwin has O_NOFOLLOW_ANY, and some other platforms have equivalents.

I propose adding support for disabling symlink traversal to the os package:

const (
    // O_NOFOLLOW_ANY, when included in the flags passed to [OpenFile], [OpenFileIn],
    // or [File.OpenFile], disallows resolution of symbolic links anywhere in the
    // named file.
    //
    // O_NOFOLLOW_ANY affects the handling of symbolic links in all components
    // of the filename. (In contrast, the O_NOFOLLOW flag supported by many
    // platforms only affects resolution of the last path component.) 
    //
    // O_NOFOLLOW_ANY does not disallow symbolic links in the parent directory name
    // parameter of [OpenFileIn].
    //
    // O_NOFOLLOW_ANY does not affect traversal of hard links, Windows junctions,
    // or Plan 9 bind mounts.
    //
    // On platforms which support symbolic links but do not provide a way to
    // disable symbolic link traversal (GOOS=js), open functions return an error
    // if O_NOFOLLOW_ANY is provided.
    O_NOFOLLOW_ANY int = (some value)
)

O_NOFOLLOW_ANY may be passed to OpenFile, OpenFIleIn, or File.OpenFIle to disable symlink traversal in any component of the file name. For OpenFileIn, symlinks would still be permitted in the directory component.

On platforms which do not support the equivalent of O_NOFOLLOW_ANY/RESOLVE_NO_SYMLINKS natively, the os package will use successive openat calls with O_NOFOLLOW to emulate it. On platforms with no openat (plan9 and js), open operations will return an error when O_NOFOLLOW_ANY is specified.

seankhliao commented 6 months ago

is this essentially https://pkg.go.dev/github.com/google/safeopen with Beneath -> In ?

that also has a ReadFile / WriteFile variant which I'd use more then the create version.

neild commented 6 months ago

The design of this proposal is influenced by github.com/google/safeopen, but differs in a few areas. (Sorry, I really should have mentioned safeopen as prior art.)

Of the three parts of this proposal:

ReadFileIn and WriteFileIn seem like a useful and logical extension of this proposal.

dsnet commented 6 months ago

Yes, please. When I was working on safe file operations and it turned out to be hard to do correctly without OS support.

Without O_NOFOLLOW, you have to slowly check every segment for symlinks before traversing into it. For the naive implementation, how do you protect against TOCTOU bugs? At the moment that you check some path segment and verify that it's not a symlink (or a safe one) and then proceed to descend into it, some other process (or goroutine) could have asynchronously changed the target.

dsnet commented 6 months ago

What, if any, changes would be made to "io/fs"? Ideally, there is a mirror of these APIs in that package.

neild commented 6 months ago

If we wanted to extend this proposal to io.FS, I believe the one addition would be:

package fs

// An OpenFile is a directory file whose entries may be opened with the Open method.
type OpenFile interface {
  File

  // Open opens the named file in the directory.
  //
  // When Open returns an error, it should be of type *PathError
  // with the Op field set to "openat", the Path field set to name,
  // and the Err field describing the problem.
  //
  // Open should reject attempts to open names that do not
  // satisfy ValidPath(name), returning a *PathError with Err set to
  // ErrInvalid or ErrNotExist.
  Open(name string) (File, error)
}

A more interesting question is os.DirFS. Currently, DirFS has two documented limitations: It follows symlinks out of the directory tree, and if the FS root is a relative path then it will be affected by later Chdir calls.

I don't think we can change DirFS's symlink-following behavior: It's documented, and it's a behavior that a user could reasonably depend on.

The interaction between DirFS and Chdir seems less likely to be something a user would depend on, but it is documented. I'm not sure if we can change it at this point, but perhaps.

Perhaps we should add a version of DirFS that opens the directory root at creation time (retaining a handle to it even if the current working directory changes or the root is renamed), and refuses to follow symlinks out of the root. I'm not sure if that should be part of this proposal or a separate one.

adonovan commented 6 months ago

Perhaps the new names should include an At suffix to make clear to casual readers of the API that these are not the usual open system calls. Either way, the three new methods should probably reference some shared section of documentation on the concept of the at-suffixed operations.

neild commented 6 months ago

I presume you mean the new method names? The functions have an "In" suffix.

We could also include the In suffix on the methods; I waffled on whether it belongs there or not:

func (f *File) OpenFileIn(name string, flag int, perm FileMode) (*File, error)
func (f *File) CreateIn(name string) (*File, error)
func (f *File) OpenIn(name string) (*File, error)

I'm trying to avoid the suffix "At" to make it clear that none of these calls are precisely openat. openat(2) permits escaping from the root directory via absolute or relative paths, and doesn't do anything about symlink traversal. (Linux has openat2(2), which is quite configurable. The proposed *In functions are essentially openat2 with the RESOLVE_BENEATH flag.)

adonovan commented 6 months ago

Fair enough. Should the fs.OpenFile.Open method also be named OpenIn?

neild commented 6 months ago

Should the fs.OpenFile.Open method also be named OpenIn?

Probably, for consistency.

rsc commented 5 months ago

Are we missing RemoveIn?

neild commented 5 months ago

We should probably have RemoveIn as well:

// RemoveIn removes the named file or (empty) directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Remove].
func RemoveIn(parent, name string) error

// Remove removes the named file or (empty) directory
// in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) RemoveIn(name string) error

Perhaps also RemoveAllIn?

rsc commented 5 months ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. ā€” rsc for the proposal review group

bjorndm commented 5 months ago

Maybe it would be better if the parent was an fs.FS? Seems more widely applicable, if somewhat more complex.

qmuntal commented 5 months ago

func RemoveIn(parent, name string) error

Note that Windows does not provide (AFAIK) an unlinkat counterpart. It will have to be emulated doing something like:

func RemoveIn(parent, name string) error
  f, err := os.OpenIn(parent, name)
  if err != nil {
    return err
  }
  return syscall.SetFileInformationByHandle(f.Fd(), syscall.FileDispositionInfo, ...)
}
hherman1 commented 5 months ago

As a user, when should I use os.Open vs os.OpenIn? Should I continue to default to os.Open, and only use OpenIn when I am actively avoiding a security issue, or should my default be OpenIn now?

neild commented 5 months ago

Maybe it would be better if the parent was an fs.FS?

The os package file functions operate on the local filesystem. fs.FS is an abstraction over a filesystem; it sits atop the os package functions, not under them.

If we want to add support for OpenIn on fs.FS filesystems, we would want something like https://github.com/golang/go/issues/67002#issuecomment-2075223022. We could add that to this proposal if we want, but for now I'm keeping this proposal focused on the os package.

Note that Windows does not provide (AFAIK) an unlinkat counterpart.

I think that's fine. This proposal requires varying degrees of implementation depending on platform already. (Linux has the very nice openat2 with RESOLVE_BENEATH, platforms without an equivalent are going to require us to do more work to produce equivalent behavior.)

If it's not possible to emulate unlinkat on Windows, that might be a problem, but it sounds like it should be possible.

As a user, when should I use os.Open vs os.OpenIn?

You should use OpenIn when you want to open a file within a directory.

I don't know how to give comprehensive guidance on when to use one vs. the other; the two functions behave differently and you should use the one that suits your specific purposes. If you're writing a command-line tool that accepts an input filename from the user, you probably want to use os.Open. If you're writing a tool that decompresses an archive, you probably want to use os.OpenIn to ensure that the output doesn't escape from the destination directory.

bjorndm commented 5 months ago

The FS OpenFile looks good, yes. Somehow I skipped that comment, sorry.

magical commented 5 months ago

A property of openat is that it follows a file across renames: If you open a directory, rename the directory, and use openat on the still-open FD, access is relative to the directory's new location. We cannot support this behavior on platforms which don't have openat or an equivalent (plan9 and js). We could fall back to operating purely on filenames, such that f.OpenIn(x) is equivalent to os.OpenIn(f.Name(), x). However, this seems potentially hazardous. I propose, therefore, that File.CreateIn, File.OpenIn, and File.OpenFileIn return an errors.ErrUnsupported error on these platforms.

I don't really understand this. What is a program supposed to do if File.Open returns ErrUnsupported? Either it can give up and report an error to the user, meaning that the program simply doesn't work on plan9 or js, Or it can implement the fallback manually,

f, err := parent.Open(filename) 
if err == os.ErrUnsupported {
   f, err = os.OpenIn(parent.Name(), filename)
   //or even: os.Open(path.Join(parent.Name(), filename))
}

which is exactly the "hazardous" behaviour you say you're trying to avoid. If the underlying platform truly has no equivalent to openat, though, then there's no other reasonable fallback. Returning ErrUnsupported is just creating more work for developers for no tangible benefit.

I think this could be addressed perfectly well in the docs by saying that some platforms (linux, windows, etc) provide extra guarantees around renamed files, and that others (plan9 and js) do not.

neild commented 5 months ago

If the underlying platform truly has no equivalent to openat, though, then there's no other reasonable fallback.

The question is whether this is a reasonable fallback or not.

In the case of os.OpenIn, I think it's reasonable to fall back to a less-secure implementation. Lacking openat, we can statically validate the untrusted filename component for unintended traversal (os.OpenIn(dir, "../escapes")), and we can test for symlinks on the path, but we remain vulnerable to TOCTOU attacks. TOCTOU symlink attacks, where an attacker creates a symlink on the path while we're in the process of validating it, are an edge case and I think it's okay for us to support os.OpenIn on platforms where we can't defend against them (plan9 and js).

In the case of os.File.OpenIn, however, there are valid operations that we simply can't support without openat or the equivalent. With openat, you can open a directory, rename or even delete it, and then continue to access files in that directory. There's no way to simulate this with operations on the directory's filename.

Perhaps it's okay to say that os.File.OpenIn behaves differently on plan9 and js, and that users who need the ability to follow a directory across renames/deletes are responsible for not trying to do so on those platforms. Returning an error is the more conservative choice.

I note also that if you don't need the openat behavior of following a directory across renames, you don't need to use os.File.OpenIn at all--you can just always use os.OpenIn.

CAFxX commented 4 months ago

I don't know how to give comprehensive guidance on when to use one vs. the other; the two functions behave differently and you should use the one that suits your specific purposes. If you're writing a command-line tool that accepts an input filename from the user, you probably want to use os.Open. If you're writing a tool that decompresses an archive, you probably want to use os.OpenIn to ensure that the output doesn't escape from the destination directory.

I would recommend adding guidance in the documentation of the not *In variants calling out that the *In variants exist and recommended for cases when directory escape is not desirable.

neild commented 3 months ago

Updated proposal, with comments on various changes arising from above discussion and working on implementation.

The OpenFileIn, CreateIn, and OpenIn functions are unchanged from the original proposal:

package os

// OpenFileIn opens the named file in the named directory.
//
// If the file contains relative path components (..), no component may
// refer to a location outside the parent directory. The file may not be
// "", an absolute path, or (on Windows) a reserved device name such as "NUL".
// The file may refer to the directory itself (.).
//
// If any component of the named file references a symbolic link
// referencing a location out of the parent directory,
// OpenFileIn returns an error.
//
// OpenFileIn otherwise behaves like OpenFile.
func OpenFileIn(parent, name string, flag int, perm FileMode) (*File, error)

// CreateIn creates or truncates the named file in the named parent directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Create].
func CreateIn(parent, name string) (*File, error)

// Open opens the named file in the named parent directory for reading.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Open].
func OpenIn(parent, name string) (*File, error)

The File methods now all have an In suffix: File.OpenFileIn, File.CreateIn, File.OpenIn. This is clearer overall: For example, f.CreateIn creates a file in the directory f, it doesn't create f. This also resolves an ambiguity between File.Stat and File.StatIn (see below).

package os

// OpenFileIn opens the named file in the directory associated with the file f.
//
// If the file contains relative path components (..), no component may
// refer to a location outside the parent directory. The file may not be
// "", an absolute path, or (on Windows) a reserved device name such as "NUL".
//
// If any component of the named file references a symbolic link
// referencing a location out of the parent directory,
// OpenFileIn returns an error.
func (f *File) OpenFileIn(name string, flag int, perm FileMode) (*File, error)

// CreateIn creates or truncates the named file in
// the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) CreateIn(name string) (*File, error)

// OpenIn opens the named file in the directory associated with the file f for reading.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) OpenIn(name string) (*File, error)

To the above, we add MkdirIn, RemoveIn, and StatIn functions and methods. Creating directories and removing files are fundamental operations, and there's no reason to leave them out. DirFSIn (see below) provides a traversal-resistant Stat, so StatIn is included here as well.

Open question: Should we add LstatIn as well? How about SymlinkIn? RenameIn? ReadFileIn and WriteFileIn? On one hand, I don't want to let this proposal get out of hand with an endless array of new functions; on the other hand, some of these do seem useful. I'd appreciate proposal committee's thoughts on where we should draw the line with this proposal.

package os

// MkdirIn creates a new directory in the named parent directory
// with the specified name and permission bits (before umask).
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Mkdir].
func MkdirIn(parent, name string, perm FileMode) error

// MkdirIn creates a new directory in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) MkdirIn(name string, perm FileMode) error

// RemoveIn removes the named file or (empty) directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Remove].
func RemoveIn(parent, name string) error

// RemoveIn removes the named file or (empty) directory
// in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) RemoveIn(name string) error

// StatIn returns a FileInfo describing the named file in the named parent directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Stat].
func StatIn(parent, name string) (FileInfo, error)

// StatIn returns a FileInfo describing the named file in the directory associated with  the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) StatIn(name string) (FileInfo, error)

We add os.DirFSIn, a traversal-safe version of os.DirFS.

Open question: DirFSIn or DirInFS? I prefer DirFSIn--"a directory filesystem in (root)", but internal discussion suggested DirInFS might be better.

For the moment, we do not add any new optional interfaces to io/fs, such as fs.OpenFile (see https://github.com/golang/go/issues/67002#issuecomment-2075223022).

There are many existing APIs, both in and out of the standard library, that operate on an io/fs.FS. Providing a traversal-resistant FS implementation is a simpler and more effective approach to hardening programs than requiring every API which operates on an FS to check for and use an OpenFile method.

Open question: It seems likely to me that we're going to want more variations on DirFS in the future. For example, it seems reasonable to want an FS that disallows symlink traversal entirely (essentially passing O_NOFOLLOW_ANY to every file open). Therefore, I think DirFSIn should either accept an options struct to allow for future customization, or should return a concrete type with customization methods. ( For example, fs := os.DirFSIn("root"); fs.SetFollowSymlinks(false)). The following returns a concrete type.

package os

// DirFSIn returns a filesystem for the tree of files rooted at the directory dir.
// The directory dir must not be "".
//
// Open calls will resolve symbolic links, but return an error if any link points outside the directory dir.
//
// The returned filesystem implements [io/fs.FS], [io/fs.StatFS], [io/fs.ReadFileFS], and [io/fs.ReadDirFS].
func DirFSIn(dir string) *FS

type FS struct{}
func (fs *FS) Open(name string) (File, error)
func (fs *FS) Stat(name string) (FileInfo, error)
func (fs *FS) ReadFile(name string) ([]byte, error)
func (fs *FS) ReadDir(name string) ([]fs.DirEntry, error)

The O_NOFOLLOW_ANY open flag remains unchanged.

Open question: Should we add os.O_NOFOLLOW? I only realized while implementing this proposal that it doesn't exist already. (Existing code which uses the flag uses syscall.O_NOFOLLOW.) On one hand, if we're supporting a portable O_NOFOLLOW_ANY, perhaps we should support a portable O_NOFOLLOW as well. On the other hand, O_NOFOLLOW can be dangerously surprising, since it only prevents symlink resolution in the final filename component, so perhaps we should stick to the more robust form.

const (
    // O_NOFOLLOW_ANY, when included in the flags passed to [OpenFile], [OpenFileIn],
    // or [File.OpenFile], disallows resolution of symbolic links anywhere in the
    // named file.
    //
    // O_NOFOLLOW_ANY affects the handling of symbolic links in all components
    // of the filename. (In contrast, the O_NOFOLLOW flag supported by many
    // platforms only affects resolution of the last path component.) 
    //
    // O_NOFOLLOW_ANY does not disallow symbolic links in the parent directory name
    // parameter of [OpenFileIn].
    //
    // O_NOFOLLOW_ANY does not affect traversal of hard links, Windows junctions,
    // or Plan 9 bind mounts.
    //
    // On platforms which support symbolic links but do not provide a way to
    // disable symbolic link traversal (GOOS=js), open functions return an error
    // if O_NOFOLLOW_ANY is provided.
    O_NOFOLLOW_ANY int = (some value)
)

Open question: How should we handle .. relative path components in filenames?

Consider the following directory tree:

On the Unix command line, if we cat s/../f, we print the contents of the file a/f.

If we open the current directory and openat(curfd, "s/../f"), we also open a/f.

The safeopen package cleans filenames prior to opening a file, so safeopen.OpenBeneath(".", "s/../f") opens the file f. The safeopen package also forbids symlink traversal entirely, so safeopen.OpenBeneath(".", "s/f") returns an error rather than opening a/b/f.

On Windows, things are confusing (and I'm still trying to understand what's going on under the hood): Using NtCreateFile to open a file in "." (the rough equivalent of Unix's openat):

It appears that NtCreateFile will resolve .. path components only if a symlink appears somewhere in the path. This is weird enough that I feel like I must be be missing something.

The question is: What should os.OpenIn(".", "s/../f") do in this case? Options I see include:

My current inclination is the first option above: Permit both symlinks and .. path components, and resolve each step of the path in sequence. (So s/../f opens a/f in the above example.) This may be a bit tricky to implement on Windows, but it should be possible.

I can, however, see a good argument for disallowing . and .. relative path components. This simplifies the implementation, there are few if any real-world cases where resolving paths like s/../f is necessary, and users can lexically clean paths with filepath.Clean if desired.


Open question: How should we handle platforms without openat or an equivalent, namely GOOS=plan9 and GOOS=js?

GOOS=js does not permit implementing OpenIn in a fashion free of TOCTOU races (swapping a directory component with a symlink elsewhere on the filesystem). I believe Plan 9 doesn't have symlinks; if that's the case, TOCTOU races are not a concern on it.

GOOS=js and GOOS=plan9 do not permit implementing File.OpenIn correctly. Opening a directory as f, renaming or deleting that directory, and then using f.OpenIn should act on the original directory. Without openat or an equivalent, we have no way to follow the directory handle and the best we can do is act on the original directory path.

I've argued above for supporting OpenIn on these platforms and not supporting File.OpenIn. I think that I've been convinced by arguments above that it's better to support as much of the API as possible, even if platform limitations prevent supporting all of it. I therefore propose that on js and plan9, f.OpenIn("path") behaves equivalently to os.OpenIn(f.Name(), "path").

bjorndm commented 3 months ago

This is rather extensive API. Perhaps a separate package from os would be better? Maybe os/in?

rsc commented 3 months ago

On a very minor note, Plan 9 can be considered to implement O_NOFOLLOW_ANY because there are no symlinks on Plan 9 at all.

rsc commented 3 months ago

More generally, I understand the motivation here, but the amount of new API is a bit daunting. I think we need to keep thinking about reducing the total amount of API. It seems like there needs to be some type representing the constrained file system. For this message, let's call it a Dir. It would be defined like:

// A Dir represents a root directory in the file system.
// Methods on a Dir can only access files and directories inside that root directory.
// Methods on Dir are safe to be used from multiple goroutines simultaneously.
// After Close is called, methods on Dir return errors.
type Dir struct {
   ...
}

func OpenDir(name string) (*Dir, error)

func (*Dir) FS() fs.FS
func (*Dir) OpenFile
func (*Dir) Create
func (*Dir) Open
func (*Dir) OpenDir
func (*Dir) Mkdir
func (*Dir) Remove
func (*Dir) MkdirAll
func (*Dir) RemoveAll
func (*Dir) Close

All the top-level convenience things like os.OpenIn can be left out. Code can use OpenDir followed by the operation it wants.

That at least feels like a more manageable amount of API.

I have been thinking for a while and have not come up with a name like more than Dir. It's certainly not perfect, and OpenDir would need a doc comment explaining that it's not opendir(3), but it's not bad.

mateusz834 commented 3 months ago

@rsc With this approach it would be also nice to define some globals, like CWD. Zig also has a simmilar abstraction in the std, also called Dir https://ziglang.org/documentation/master/std/#std.fs.Dir.

mateusz834 commented 3 months ago

Also

// Methods on Dir are safe to be used from multiple goroutines simultaneously.

Is this true for Close?

rsc commented 3 months ago

@mateusz834 Sure, Close can be called from multiple goroutines simultaneously. Same thing is true of os.File.

rsc commented 3 months ago

Re CWD, I disagree with having a global whose meaning changes each time the current directory changes. What is supposed to happen if I do this?

dir := os.CWD
os.Chdir("/tmp")
dir.Open("foo")

Neither answer is good.

If code wants the current current directory [sic], then it can use os.OpenDir("."). Then at least it is clearer what happens with Chdir.

ianlancetaylor commented 3 months ago

I vote for the name Root rather than Dir.

For the record, other possible methods:

Those are the top-level os package functions that take file names that can't otherwise be handled with complete safety. (There is also os.Stat and os.Readfile but those can be done with File methods.)

mateusz834 commented 3 months ago

@rsc I think that there is a special AT_FDCWD fd for that.

man openat(2):

If pathname is relative and dirfd is the special value AT_FDCWD, then pathname is interpreted relative to the current working directory of the calling process (like open(2)).

If pathname is absolute, then dirfd is ignored.

neild commented 3 months ago

I'm fine with dropping the top-level functions, although I note that "open this file within this directory, where the directory name is trusted and the file name is not" is a very common operation. Perhaps we could limit the top-level functions to CreateIn and OpenIn, for the most common operations?

What's the motivation for a new Dir type as opposed to adding additional methods to File? The number of methods is the same either way, and we've already got methods on File which only apply to directories (ReadDir, Readdir, Readdirnames). Plus io/fs.ReadDirFile defines a directory as a specialization of a file.

If we did want a new Dir (or Root) type, would we want a way to convert a File into a Dir? (f.Dir(), maybe?) And would we want a Dir.ReadDir method for listing the contents of a directory?

rsc commented 3 months ago

The motivation is to put all the "safety" stuff on one type instead of intermixing it with "non-safety" stuff. The names end up cleaner and the code using the APIs should be easier to vet. I don't think we will need a way to turn a File into a Dir; start with the Dir and d.Open(".") if you really need a File. They should be separate concepts.

Put another way, although a Dir is an fd underneath and a File is an fd underneath, that doesn't mean the two are conceptually the same. The Dir type is a safety abstraction, and the fact that it is very like a File in its representation is an implementation detail.

I am sure that other helpers will be wanted on Dir too, including ReadDir, ReadFile, WriteFile. All the more reason for keeping Dir separate from File.

rsc commented 3 months ago

@ianlancetaylor I thought about Root as well but Root has the problem that you'd expect os.Root to be /. Or maybe uid 0. It's already overloaded. And RootedDir seems too long. It's interesting that Zig ended up on a type of the same name, although clicking around on https://ziglang.org/documentation/master/std/#std.fs.Dir I couldn't find anything like a doc comment explaining what concept Dir represents, if any.

ianlancetaylor commented 3 months ago

How about os.Rooted?

rsc commented 3 months ago

"Rooted" has another common meaning too. šŸ˜„

I'm pretty convinced about the API choice of separating all these routines into methods on one type, but I think we need to keep thinking about the name of that type.

neild commented 3 months ago

I can't think of a better name than Dir. Everything else is too overloaded. (FS is nice, but then that conflicts with io/fs.FS.)

I'm not entirely convinced that it's better to have two types rather than adding more methods to File, given that we already have extensive precedent for treating a File as a directory handle. I think there's going to be user confusion about when you want a File and when you want a Dir.

But perhaps it's fine.

mateusz834 commented 3 months ago

openat(2) allows the absolute path to "escape" the Dir. Do we want to always use the RESOLVE_BENEATH, or maybe this should be configurable? This way the os.Dir might handle more use-cases (simple way to change the relative path location, but not restricting the use of absolute paths).

mateusz834 commented 3 months ago

RE CWD Zig has a CWD https://ziglang.org/documentation/master/std/#std.fs.cwd, in Zig you have to have a Dir to open a file. openat with AT_FDCWD works the same way as open does, so:

dir := os.CWD
dir.Open("foo")
os.Chdir("/tmp")
dir.Open("foo")

has the same behavior as:

os.Open("foo")
os.Chdir("/tmp")
os.Open("foo")
neild commented 3 months ago

openat(2) allows the absolute path to "escape" the Dir. Do we want to always use the RESOLVE_BENEATH, or maybe this should be configurable? This way the os.Dir might handle more use-cases (simple way to change the relative path location, but not restricting the use of absolute paths).

I don't want to complicate this proposal with too many options. RESOLVE_BENEATH is the right semantics for almost all use cases. We can always add more options in the future if they turn out to be necessary.

mateusz834 commented 3 months ago

I don't want to complicate this proposal with too many options. RESOLVE_BENEATH is the right semantics for almost all use cases. We can always add more options in the future if they turn out to be necessary.

Yes, of course, but this might influence the API, at this point we have two choices:

If we don't consider that from the start, then we are only left with the second one.

rsc commented 3 months ago

Options should be in the Dir struct. You configure it, and then you use it with whatever methods you want. Want two different kinds of restrictions? Configure two Dir structs.

Otherwise you need all the kinds of options on all kinds of different methods. The whole point of the Dir struct is to hold all that complexity and leave you with the same functions (methods) and signatures as the top-level os funcs.

mateusz834 commented 3 months ago

openat2 was added in kernel 5.6, is is possible to support lower kernel versions? #67001

neild commented 3 months ago

Linux is the only platform with openat2, so we need an openat-based implementation anyway. Linux versions without openat2 can use it, or we can just use openat everywhere if detecting support for openat2 is difficult.

AGWA commented 3 months ago

I don't think we will need a way to turn a File into a Dir; start with the Dir and d.Open(".") if you really need a File.

This only works if you know ahead-of-time if the file you're opening is a directory. I have code that opens a path that may or may not be a directory, and calls File.Stat to decide whether to Read or ReadDir it. If it's a directory, I'd like to be able to convert the File to a Dir and use the new, safer, functions to open its children.

(Unless Dir were allowed to represent non-directories and Open(".") were special-cased, which I think would be quite unintuitive.)

willfaught commented 3 months ago

The Dir type assumes you're rooted at a file descriptor, but the original design called for rooting at a string:

func OpenIn(parent, name string) (*File, error)

Do Windows, Mac, and the BSDs have something comparable to Linux's openat, which uses file descriptors? If not, then perhaps we shouldn't constrain the API design based solely on how it would be implemented on Linux. The idea of rooting based on a string path seemed appealing to me.

willfaught commented 3 months ago

Never mind, OpenDir(string).Open(string) is basically the same thing. :)

neild commented 3 months ago

Do Windows, Mac, and the BSDs have something comparable to Linux's openat, which uses file descriptors?

Every Unix-like platform we support has openat, including macOS. Some of them have extensions which make implementation simpler (but then we still need to support the lowest common denominator, so it's not actually any simpler), but basic openat is enough.

Plan 9 doesn't have openat, but Plan 9 doesn't have symlinks so lexical path analysis suffices.

Windows has NtCreateFile, which can act quite a bit like openat. I'm going to write up a longer description of Windows path behavior at some point, but the quick version is that it's possible to build these APIs (in any of the versions proposed so far) on Windows as well.

GOOS=js does not have openat or any equivalent (as best I can ascertain), and we can therefore only support a degraded version of these APIs on that platform.

As you've noted, OpenIn is essentially a convenience version of OpenDir(parent).Open(file). I think it's okay for us to leave it out to keep the API surface down, but most of the real-world path traversal vulnerabilities I've observed would have been fixed by changing an os.Open(filepath.Join(parent, file)) call to os.OpenIn(parent, file). Making the safe path as simple as possible seems worthwhile to me.

rsc commented 3 months ago

Let's try Root. Then we can have:

func OpenRoot(dir string) (*Root, error) 
func (*Root) FS() fs.FS
func (*Root) OpenFile
func (*Root) Create
func (*Root) Open
func (*Root) OpenDir
func (*Root) Mkdir
func (*Root) Remove
func (*Root) MkdirAll
func (*Root) RemoveAll
func (*Root) Close
func (*Root) Chmod
func (*Root) Chown
func (*Root) Chtimes
func (*Root) Lchown
func (*Root) Lstat
func (*Root) Readlink
func (*Root) Rename
func (*Root) Symlink
func (*Root) Truncate

And then there's an obvious name for the convenience function:

func OpenRooted(dir, name string) (*File, error) {
   r, err := OpenRoot(dir)
   if err != nil { return nil }
   return r.Open(name)
}

Does that look right?

rsc commented 3 months ago

We may want to drop Root.Symlink or else we have to say what it rejects (like absolute paths). (See the awful cases in #49580 if we go down that path.)

neild commented 3 months ago

LGTM. I'm partway through implementing (using the name Dir), and haven't run into any significant concerns yet.

Root.Symlink seems fine; it creates a symlink in the root with the given target. The contents of the target are up to the caller. We can be clear in the documentation that Root.Symlink isn't trying to impose any restrictions on the link target.

I do not have a strong opinion on Dir vs. Root as a name. Dir is less overloaded, Root raises fewer questions about why Files can also be directories.