golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

io/fs: add file system interfaces #41190

Closed rsc closed 3 years ago

rsc commented 4 years ago

In July, @robpike and I posted a draft design for file system interfaces. That doc links to a video, prototype code, and a Reddit discussion.

The feedback on that design has been almost entirely positive.

A few people raised concerns about the use of optional interfaces, but those are an established pattern in Go that we understand how to use well (informed in part by some earlier mistakes, such as optional interfaces like http.Hijacker with methods that cannot return an error to signal failure/unavailability).

A few people suggested radical redesigns of the os.File interface itself, but for better or worse the installed base and the weight of history caution against such drastic changes.

I propose to adopt the file system interfaces draft design for Go 1.16.

rsc commented 4 years ago

Accepting this proposal would also let us land the embedded files draft design in Go 1.16, which I've proposed in #41191.

tooolbox commented 4 years ago

A few people raised concerns about the use of optional interfaces

This is my concern.

It's not that the alternative (attempting to define all FS methods in one huge interface which packages may then implement as no-op) is better. Rather, my perception is that the ergonomics of this approach are poor enough that it won't achieve broad community adoption and the level of composability and general success that interfaces like io.Reader and io.Writer have.

For example, it's clear that I will be able to pipe a zip file to text/template, and that's good, but I'm concerned about more general composability of filesystems and files. I can wrap a stack of io.Reader with confidence, but with io/fs it seems like some middle layer may not have the right optional interfaces and I will lose access to functionality.

In spite of my concerns, it seems like the best approach available to Go at this time, and I anticipate it will be accepted given that the very exciting #41191 depends upon it.

However, I have this inkling that the advent of generics may allow a more powerful/robust/safe abstraction. Has any thought been given to this, or to how io/fs could evolve in a backwards-compatible fashion if/when that occurs? Again, not to hold up this proposal, but I think I would be more excited if I knew what the future held.

networkimprov commented 4 years ago

The feedback page: https://www.reddit.com/r/golang/comments/hv976o/qa_iofs_draft_design/?sort=new

I think this API looks promising... and would benefit from a prototype phase.

A lot of feedback was posted, but there's been rather light discussion of the comments, presumably because you can't subscribe to a Reddit thread, and/or many in Go's github-centered community don't frequent Reddit. It would help to see a review and analysis of feedback here, and perhaps a roadmap to likely future features.

Problems were identified with the FileInfo interface, ~but not discussed~ and are in discussion #41188. Timeouts and/or interrupts bear consideration.

Landing a prototype in x/ seems like a logical step before stdlib. Go has long been deliberative and conservative about new features. Is this urgent somehow?

FWIW, my Go apps make heavy use of the filesystem, on Windows, MacOS, and Linux.

earthboundkid commented 4 years ago

I think optional interfaces can work if there is a way to indicate that even though a method exists on a wrapper, it hasn't been implemented by the underlying wrapped type. Something like ReadFile(name string) ([]byte, error) needs to be able to return ErrNotImplemented so that the function calling it can say "Oh, well, then let me fallback to Open() + Read()." The main sin of the existing optional interfaces is that there's no way to signal "I have this method just in case the thing I'm wrapping implements it." This shortcoming really needs to be addressed in the io/fs optional interfaces.

earthboundkid commented 4 years ago

So, for the ReadFile top level func, I am proposing this implementation:


func ReadFile(fsys FS, name string) ([]byte, error) {
    if fsys, ok := fsys.(ReadFileFS); ok {
        b, err := fsys.ReadFile(name)
        if err != ErrNotImplemented { // Or errors.Is?
            return b, err
        }
    }

    file, err := fsys.Open(name)
    if err != nil {
        return nil, err
    }
    defer file.Close()
    return io.ReadAll(file)
}

rsc commented 4 years ago

Discussion of ErrNotImplemented has moved to #41198. I've marked @carlmjohnson's two comments above this one as well as @randall77's comment below this one as "off-topic" to try to funnel discussion over there.

randall77 commented 4 years ago

On Thu, Sep 3, 2020 at 10:15 AM Carl Johnson notifications@github.com wrote:

I think optional interfaces can work if there is a way to indicate that even though a method exists on a wrapper, it hasn't been implemented by the underlying wrapped type. Something like ReadFile(name string) ([]byte, error) needs to be able to return ErrNotImplemented so that the function calling it can say "Oh, well, then let me fallback to Open() + Read()." The main sin of the existing optional interfaces is that there's no way to signal "I have this method just in case the thing I'm wrapping implements it." This shortcoming really needs to be addressed in the io/fs optional interfaces.

The other way to handle this is to have a factory for the wrapper that returns a type with the correct methods on it.

type I interface { Foo() }
type Optional interface { Bar() }

func NewWrapper(i I) I {
    if _, ok := i.(Optional); ok {
        return &wrapperWithBar{i: i}
    }
    return &wrapperWithoutBar{i: i}
}

type wrapperWithoutBar struct { i I }
type wrapperWithBar struct { i I }

func (w wrapperWithoutBar) Foo() { w.i.Foo() }
func (w wrapperWithBar) Foo() { w.i.Foo() }
func (w *wrapperWithBar) Bar() { w.i.(Optional).Bar() }


rsc commented 4 years ago

@networkimprov, see https://go.googlesource.com/proposal/+/master/design/draft-iofs.md#why-not-in-golang_org_x. This isn't worth much without the standard library integration. It also can't be used for embed.Files from x.

jimmyfrasche commented 4 years ago

I'd feel more comfortable with this if it included basic combinatory file systems either in io/fs or somewhere official like golang.org/x. They would still have the issue with not understanding nonstandard optional methods but they would at least be guaranteed to keep up with official optional methods.

The two file systems I'm thinking of are an overlay fs and one that can "mount" other file systems in subdirectories of its root. With those two you could stitch multiple fs together easily.

rsc commented 4 years ago

@jimmyfrasche I don't understand the difference between "an overlay fs" and "one that can mount other file systems in subdirectories of its root." I agree we should provide something like that, and we intend to. But those sound like the same thing to me. :-)

jimmyfrasche commented 4 years ago

I was thinking just:

func Overlay(fses ...FS) FS

for the former, and the latter would satisfy

interface {
  FS
  Mount(dirName string, fs FS) error
}

and not have any files other than those mounted.

rsc commented 4 years ago

Got it, thanks @jimmyfrasche: union vs replace.

networkimprov commented 4 years ago

The io/fs integration with stdlib, and embed.Files can all be prototyped in x/

I wasn't suggesting x/ as the permanent home.

EDIT: Also, Readdir() & FileInfo have performance problems and missing features. The replacement APIs need prototypes. A draft is in https://github.com/golang/go/issues/41188#issuecomment-686283661

muirdm commented 4 years ago

I have two comments. I found similar comments in the reddit thread, but didn't see a satisfying discussion/conclusion. Apologies if I missed previous conclusions.

Wrapping

I think we should consider an official mechanism to wrap fs.FS objects (and probably fs.File objects). For example, I want to wrap an fs.FS to track the total number of bytes read. I need to intercept calls to fsys.Open and calls to fsys.ReadFile, if implemented. I also don't want to lose any other optional interfaces such as fs.GlobFS. Based on my experience with http.ResponseWriter, this is commonly needed, but hard and tedious to do correctly.

For a concrete idea to discuss, something like this:

type FSWrapper struct {
  FS
  OpenFunc func(name string) (File, error)
  ReadFileFunc func(name string) ([]byte, error)
  // ... all other extensions ...
}

func (w *FSWrapper) ReadFile(name string) ([]byte, error) {
  rf, ok := w.FS.(ReadFileFS)
  if !ok {
    // errors.ErrNotImplemented is the sentinel under discussion in #41198.
    return nil, errors.ErrNotImplemented
  }

  if w.ReadFileFunc != nil {
    return w.ReadFileFunc(name)
  }
  return rf.ReadFile(name)
}

Granted there are cases where a generic wrapper would expose extensions you don't want to pass through. Anyway, I think at least the proposal would benefit from discussion or FAQ addressing wrapping.

Writing

It seems like we are starting with read-only because that is the simplest interface that enables the motivating embed feature. However, in some sense writing is more fundamental because you have to write before you can read (only half joking). I worry writing will be relegated to second class citizenship forever due to optional interfaces. For example, the hypothetical OpenFile extension:

func OpenFile(fsys FS, name string, flag int, perm os.FileMode) (File, error)

OpenFile returns an fs.File which has no Write method. It seems a bit strange to always have to type assert to get a writable file. I think the eternal friction between io.Writer and fs.File as proposed will be more painful than starting with a broader proposal.

In particular, I think we should consider:

  1. Make "OpenFile" be the core of fs.FS instead of "Open". OpenFile is more fundamental to file systems. We can add "func Open(fsys FS, name string) (File, error)" as a package function to emulate the simplicity of the proposed FS.Open method.
  2. Include "Write" in the fs.File interface. Write is as fundamental as Read for file systems.

Cyberax commented 4 years ago

Guys, PLEASE just add context everywhere! It costs nothing to ignore it or add context.TODO() for callers, but it will make life of network filesystem implementers and users much easier. In particular, it's needed for better contextual logging and cancellation.

You're all strictly opposed to thread-local variables, but then why are you designing APIs without a way to pass a context?!?

networkimprov commented 4 years ago

Deadlines and interrupts have been suggested as another way to solve the same problem, without affecting every function signature. It's unlikely that the os package will add dozens of new APIs with Context, see also #41054.

Deadlines comment Interrupts comment

Merovius commented 4 years ago

I think the simplest way to solve this is to pass a context on filesystem creation. So instead of having a type implementing fs.FS directly, it would have a method WithContext(context.Context) fs.FS, which returns a child-instance bound to a given context.

networkimprov commented 4 years ago

Cancelling all pending ops by the stdlib file API (which will implement fs.FS) is not desirable. It probably isn't useful for other FS types, as well. The common case, in my experience, is interrupting any pending ops trying paths within a tree rooted at a certain path. An API for that looks like one of:

(f *MyFs) SetDeadline(t time.Time, basepath string) error // if deadline past, interruption is immediate

(f *MyFs) InterruptPending(basepath string) error

Note that os.File already supports Read & Write deadlines.

I doubt that io/fs wants context as a dependency. Where needed, you could easily wire context into an fs.FS implementation to do one of the above.

Cyberax commented 4 years ago

Gah. The deadlines/interrupts design is just horrible. No, it's seriously horrible. The whole idea for not including thread IDs in Golang was to make sure APIs are forced to deal with clients potentially running in multiple goroutines.

Introducing the per-FS state will defeat this purpose, making the FS object behave more like a TCP connection rather than a dispatcher for an underlying FS. And only one goroutine at a time would be able to use it, otherwise they might step on each others' toes with deadlines. Never mind the badness of introducing a hidden state where it arguably shouldn't even be in the first place.

What are the actual downsides of simply adding context.Context to every method?

Cyberax commented 4 years ago

I think the simplest way to solve this is to pass a context on filesystem creation. So instead of having a type implementing fs.FS directly, it would have a method WithContext(context.Context) fs.FS, which returns a child-instance bound to a given context.

This will require the FS implementation to be a thin wrapper that supplies context to the underlying implementation. Certainly doable, but still ugly.

And it will still introduce dependency on context.Context in the FS code.

tv42 commented 4 years ago

@Cyberax All you need is f, err := fsys.WithContext(ctx).Open(p) to make that one open file obey that one context. Easy sharing.

This has been done before with x.IO(ctx) returning io.Reader or such, to keep the io.Reader interface. It's a pretty simple layer.

networkimprov commented 4 years ago

So @tv42 that's (f *MyFs) WithContext(context.Context) *MyFs ? That's reasonable.

Cyberax commented 4 years ago

@Cyberax All you need is f, err := fsys.WithContext(ctx).Open(p) to make that one open file obey that one context. Easy sharing.

Not unless you want to do FS wrapping, but that's already been mentioned here. To expand this a bit, currently FS is supposed to consist of multiple optional interfaces (such as ReadFileFS), and the WithContext method's signature will have to use the most basic interface (FS).

So your example will actually be: f, err := fsObject.WithContext(ctx).(ReadFileFS).ReadFile(name) - it's NOT typesafe at all.

This has been done before with x.IO(ctx) returning io.Reader or such, to keep the io.Reader interface. It's a pretty simple layer.

The whole TCP and the general IO layer in Go is a mess, so it's not at all a good example. Witness the number of questions on Google about cancelling IO operations on TCP connections (via SetDeadline).

Cyberax commented 4 years ago

And let me remind everybody about Go's own style guide: https://github.com/golang/go/wiki/CodeReviewComments#contexts

A function that is never request-specific may use context.Background(), but err on the side of 
passing a Context even if you think you don't need to. The default case is to pass a Context; 
only use context.Background() directly if you have a good reason why the alternative is a mistake.

Don't add a Context member to a struct type; instead add a ctx parameter to each method on 
that type that needs to pass it along. The one exception is for methods whose signature must 
match an interface in the standard library or in a third party library.

icholy commented 4 years ago

So your example will actually be: f, err := fsObject.WithContext(ctx).(ReadFileFS).ReadFile(name) - it's NOT typesafe at all.

I think that would be:

data, err := fs.ReadFile(fsys.WithContext(ctx), "name")

Also, what's preventing WithContext from returning a concrete type?

networkimprov commented 4 years ago

Re Context everywhere, the stdlib file API has dozens of functions, and even if you replicate them all to add a Context argument, every package that calls any of them would have to replicate its own API to add Context. It's just not viable.

I haven't heard a good argument for why it's wrong to ask pending file ops to return an InterruptError in whatever goroutines invoked them.

Cyberax commented 4 years ago

I think that would be:

data, err := fs.ReadFile(fsys.WithContext(ctx), "name")

Sure, putting the code inside a helper function will help in this one particular case. But it won't help with the Stat interface and other optional interfaces defined in the spec.

Also, what's preventing WithContext from returning a concrete type?

Because it's going to be defined in the interface Contexter (or something like it) and it can't have the knowledge of the concrete type.

tooolbox commented 4 years ago

Because it's going to be defined in the interface Contexter (or something like it) and it can't have the knowledge of the concrete type.

Perhaps this is a silly suggestion, and I know this doesn't help us now, but as I mentioned near the start of this thread, would generics help with this particular issue?

Cyberax commented 4 years ago

Re Context everywhere, the stdlib file API has dozens of functions, and even if you replicate them all to add a Context argument, every package that calls any of them would have to replicate its own API to add Context. It's just not viable.

This is a new API, so I don't see the problem with adding context to it. Existing stdlib code that needs to touch files can either scrounge up the context from somewhere (for example, the HTTP server code has it) or just put context.TODO().

I haven't heard a good argument for why it's wrong to ask pending file ops to return an InterruptError in whatever goroutines invoked them.

Your API is not thread-safe. It's perfectly possible to open the same file twice from two different goroutines, but your interruption code will affect both of them.

It's also completely incorrect on Unix, because name doesn't uniquely identify a file, so it's possible to do:

fl := fs.Open("file.a")
fs.Move("file.a", "file.b")
fl2 := fs.Open("file.a", O_CREATE)
...
fl.Read(...)
fl2.Read(...)

fs.Interrupt("file.a") // Should we interrupt both reads?

Cyberax commented 4 years ago

Perhaps this is a silly suggestion, and I know this doesn't help us now, but as I mentioned near the start of this thread, would generics help with this particular issue?

No, not in this particular case.

Cyberax commented 4 years ago

Re: the stdlib API argument. I've just searched all the uses of os.Open* in the stdlib. Here are my findings:

Basically, if the standard library is refactored to use the new FS API, it can be done in minutes just by adding context.Background() to most of the places without changing any important semantics. And some code (e.g. HTTP client for multipart uploads) will actually be improved by adding full cancellation support.

ianlancetaylor commented 4 years ago

I haven't heard a good argument for why it's wrong to ask pending file ops to return an InterruptError in whatever goroutines invoked them.

  1. Because explicit is better than implicit.
  2. Because if every file operation is interruptible, then every file operation has to pay the cost of preparing for and handling interrupts, even though even in the best case only a very small percentage could ever possibly be interrupted.
  3. Because if every file operation is interruptible, then we have to modify every file operation in the standard library anyhow, so it's actually not all that much additional work to add a new API with a context.Context argument.

networkimprov commented 4 years ago

If you want existing packages to use fs.File types, Context arguments aren't an option. But we still need to be able to interrupt ops on those files. It's fine if you don't like InterruptPending("/some/tree"), but can we please hear alternatives that let existing code benefit?

It's perfectly possible to open the same file twice from two different goroutines, but your interruption code will affect both of them.

As intended! The InterruptPending() API applies to a tree, not a file. You interrupt all file ops on a tree only when something's gone wrong, and the only good option is to stop them all -- because they won't return otherwise. A more flexible API could let you inspect a table of pending ops, and choose which to interrupt.

explicit is better than implicit

Ha, Go has a metric tonne of implicit behavior. That aside, apps calling packages that use today's file ops need a way to interrupt them implicitly, because explicit isn't an option.

every file operation has to pay the cost of preparing for and handling interrupts

I believe this is sufficient: if err != nil { return err }. Only a module that calls InterruptPending() must handle InterruptError.

then we have to modify every file operation in the standard library anyhow

Haven't you conflated changing internals with changing APIs? I'm not categorically opposed to new APIs, but many third party packages won't get around to using them.

Still waiting to hear a good argument :-)

networkimprov commented 4 years ago

@rsc are there ways to let an app change the default FS used by os.Open() for a specific file -- or all files?

It's common to pass filenames to package functions; we may also need to set the FS object where those filenames reside.

Perhaps a filename prefix starting with a character not allowed in filenames? Rather a hack, tho.

Merovius commented 4 years ago

@networkimprov

The common case, in my experience, is interrupting any pending ops trying paths within a tree rooted at a certain path.

Do you have any data to support that assertion? Because I have never seen that use case and I can't think of a case where this would be useful. And in RPC servers it's certainly counterproductive, because you would break other, concurrent RPCs.

Still waiting to hear a good argument :-)

FTR, this isn't a very productive way to discuss.

are there ways to let an app change the default FS used by os.Open() for a specific file -- or all files? It's common to pass filenames to package functions

The goal of this proposal is to make that less common - or to pair it up with passing an fs.FS if one isn't already known. Doing some implicit side-loading of fs.FS dependencies seems counter to Go's spirit (see also Ian's "explicit is better than implicit").

networkimprov commented 4 years ago

I have never seen that use case and I can't think of a case where this would be useful.

A file tree rooted on a network filesystem will cease responding if the network or server goes down, a pretty commonplace event. A single file op could stall due to congestion, so perhaps InterruptPending(path) is part of a broader API.

Still waiting to hear a good argument :-)

Apologies if that seemed pejorative. This debate has been going across three issues (see https://github.com/golang/go/issues/40846#issuecomment-676777624 and #41054) and no one's addressed the concerns I've raised re Context arguments, nor carefully considered the idea of terminate/interrupt APIs. I'm not sure that anyone on the Go team even agrees that file ops should be interruptible, even tho at least three of us have stated cases for it, and it is a feature of Linux CIFS & FUSE and Windows I/O.

implicit side-loading of fs.FS dependencies seems counter to Go's spirit

I asked because anything that expected *os.File could take fs.File by changing a single type. It would be nice if changing os.Open() to xfs.Open() were similarly trivial. In any case, we may need type FileName struct { f FS; p string }.

networkimprov commented 4 years ago

Floating another idea... the pending ops table:

type PendingFS interface {
   FS
   TrackPendingOps(bool)
   ListPendingOps() []PendingOp
   InterruptOp(op ...PendingOp)
}

type PendingFile interface {
   File
   ListPendingOps() []PendingOp
}

type PendingOp interface {
   Op() string
   Pathname() string
   Params() []interface{}
   String() string
}

ianlancetaylor commented 4 years ago

If you want existing packages to use fs.File types, Context arguments aren't an option. But we still need to be able to interrupt ops on those files. It's fine if you don't like InterruptPending("/some/tree"), but can we please hear alternatives that let existing code benefit?

Existing code doesn't expect file operations to be interrupted. So I don't think there is a strong argument for providing some mechanism that works with existing code.

I agree that if we add Context parameters to io/fs, then we will, over time, want to add them to the standard file system operations as well. Or we'll want to add some other mechanism to permit interruption. This is not necessarily infeasible. Whether it is the right thing, I don't know. But to me it seems more likely to be the right thing to do than adding a global hammer that affects all inflight file system operations.

explicit is better than implicit

Ha, Go has a metric tonne of implicit behavior.

That is not a counter-argument.

That aside, apps calling packages that use today's file ops need a way to interrupt them implicitly, because explicit isn't an option.

I think that it is an option. I don't know if it is the right option, but I see no reason to reject it out of hand.

every file operation has to pay the cost of preparing for and handling interrupts

I believe this is sufficient: if err != nil { return err }. Only a module that calls InterruptPending() must handle InterruptError.

What I mean is that if we need some mechanism to interrupt every inflight file operation, then every file operation needs to somehow register itself before starting and then deregister itself when done. Otherwise the runtime will have no way to interrupt it. That is the cost that must be paid by every file operation. But, as I said earlier, even in the best case only a very small percentage of file operations could ever possibly be interrupted.

I'm not categorically opposed to new APIs, but many third party packages won't get around to using them.

Those third party packages already do not expect their file operations to be interrupted. Given that file operations fail in predictable ways, there is no special reason to expect that packages are prepared to correctly handle a new kind of file operation failure. I don't see supporting existing third party packages with no change as a priority.

no one's addressed the concerns I've raised re Context arguments, nor carefully considered the idea of terminate/interrupt APIs

I really can't agree with this. People have addressed those concerns, and they have considered those ideas. You don't agree with what people have said, but that doesn't mean that your thoughts haven't been considered.

Cyberax commented 4 years ago

A file tree rooted on a network filesystem will cease responding if the network or server goes down, a pretty commonplace event. A single file op could stall due to congestion, so perhaps InterruptPending(path) is part of a broader API.

I actually have code right now that has something similar. It uses an NFS disk as a cache in front of a DynamoDB table. The code first tries to read the data from this disk (using an NFS client in Go) and if the data doesn't come back within 2ms, it cancels the request and makes (an expensive) request to DynamoDB. Cancelling all requests in a subtree would not help a bit.

Merovius commented 4 years ago

@networkimprov

no one's addressed the concerns I've raised re Context arguments

I'm not sure what they are. So far, I've seen a) it adds a dependency on context to os and b) *os.File doesn't allow cancellation yet, as syscalls can't be interrupted.

But can we please hear alternatives that let existing code benefit?

I haven't seen that comment before. I already mentioned an alternative when you wrote that. It would allow supporting cancellation on a per-filesystem basis - that is, without any need to add context.Context to existing os implementations. Note that this solves both the concerns I mention above. It allows cancellable and not-yet-cancellable implementations to provide the same API. If a not-yet-cancellable implementation implements cancellation, it can add that support by exposing func WithContext(context.Context) *FooFS.

networkimprov commented 4 years ago

I already mentioned an alternative [to Context in the FS API]

Axel, so you did, thanks for that. It seems that omitting Context from the FS API is more popular than including it.

Existing code doesn't expect file operations to be interrupted. So I don't think there is a strong argument for providing some mechanism that works with existing code. I don't see supporting existing third party packages with no change as a priority.

Ian, I'm sure some packages would need changes to accommodate interrupt errors initiated by a caller. In others, the std practice of returning the error is sufficient. In either case, a package should have to alter its APIs as a last resort! It's surprising and disappointing to hear that you disagree.

[Context] seems more likely to be the right thing to do than adding a global hammer that affects all inflight file system operations

I just offered a scalpel :-) https://github.com/golang/go/issues/41190#issuecomment-687632101. I'll file a proposal for that in a day or so.

explicit is better than implicit

Much of the rationale of Go is that implicit behavior saves complexity in user code, so that assertion is not generally true.

every file operation needs to somehow register itself before starting and then deregister itself when done

Oh, yes. Since a blocking op may entail a thread creation and/or context switch, I guessed that maintaining a table of ops would be relatively inexpensive. And you wouldn't do it unless the app requested the ability to interrupt things.

no one's addressed the concerns I've raised re Context arguments, nor ... considered the idea of terminate/interrupt APIs

People have addressed those concerns, and they have considered those ideas.

Your opening argument to me on this topic was that network filesystems are irrelevant for Go programs -- https://github.com/golang/go/issues/40846#issuecomment-678560324. That was startling, and even insulting; I don't know my end-users' environments? Have you changed your thinking? Is there a problem to be addressed here?

If so, can we try to support existing packages before banishing them from apps that depend on network filesystems?

ianlancetaylor commented 4 years ago

Your opening argument to me on this topic was that network filesystems are irrelevant for Go programs -- #40846 (comment). That was startling, and even insulting; I don't know my end-users' environments? Have you changed your thinking? Is there a problem to be addressed here?

I apologize if that seemed insulting. That was certainly not my intent.

My position was and remains that if you know beforehand that your program is going to have to run on a network file system, you will be better off using a client server protocol rather than relying on the network file system protocol.

The point of a network file system is to hide the network. Decades of experience have shown us that the network is hard to hide. Better to embrace it rather than hide it.

diamondburned commented 4 years ago

My position was and remains that if you know beforehand that your program is going to have to run on a network file system, you will be better off using a client server protocol rather than relying on the network file system protocol.

If this is a reason not to support the context API (or in general a sane way of deadline/cancellation), then what about slow disk IO? Wouldn't those still be blocking and thus should still have some sort of deadline as well?

networkimprov commented 4 years ago

Quoting @bcmills from https://github.com/golang/go/issues/40846#issuecomment-679195944:

"If a program does not know whether it may be running on a networked file system, then it must use ordinary file system calls (because it may be on an ordinary file system), but it must also provide for canceling stalled operations (because it may be on a networked filesystem)."

This describes any app whose authors don't control its deployment environments.

ianlancetaylor commented 4 years ago

@diamondburned I'm fine with adding a context.Context or a deadline to file operations.

networkimprov commented 4 years ago

I filed #41249. It offers a few ways to interrupt stalled file ops, including two that take context.Context.

networkimprov commented 4 years ago

Would Context arguments mean spawning a goroutine for any synchronous io/fs op, to wait on Context.Done()? If not, how does the op obtain a cancellation notification?

If so, isn't a goroutine per op rather significant overhead for use of a filesystem?

bcmills commented 4 years ago

@networkimprov, it's true that an external library accepting a context would potentially need to start a goroutine per operation, but that library could then either be used as the prototype for a standard-library API (which can hook into the context package internally to provide cancellation without a separate goroutine), or could provide solid evidence for a new exported hook in the context package (as originally proposed in #28728).

rsc commented 4 years ago

This issue is about generalizing the FS API we have. It is not about designing a whole new API with bells and whistles for cancellation, deadlines, hung network file systems, and so on.

Unpopular opinion, maybe, but @ianlancetaylor is right: If your network service is not reliable enough to maintain the illusion of being like an ordinary on-disk file system, the solution is not to duplicate all the network concepts into the file system layer. Instead, the solution in that case is to use explicit network RPCs that already have all those concepts.

There are plenty of possible file system implementations that are reliable enough to maintain the illusion of an ordinary on-disk file system, and the generalized API in this proposal is targeted at those.

rsc commented 4 years ago

@networkimprov:

The io/fs integration with stdlib, and embed.Files can all be prototyped in x/

It cannot, because stdlib cannot import x/.

@rsc are there ways to let an app change the default FS used by os.Open() for a specific file -- or all files?

Package os is about os-provided files only. If you have an *os.File, you need to be sure that's what it really is - an operating system file. Otherwise you can't reliably do things like pass f.Name() to another process, or start another process with that file as one of its file descriptors, and so on.

It's common to pass filenames to package functions; we may also need to set the FS object where those filenames reside.

Yes, indeed. Package functions that want to work on arbitrary FS implementations need to be extended to take an FS, path pair, just as the text/template and html/template packages do in the prototype code.

Perhaps a filename prefix starting with a character not allowed in filenames? Rather a hack, tho.

That would imply some kind of a global registration table, which turns out to be a good thing to avoid. Just because I'm using a zip file as if it were an FS doesn't mean I want other parts of the program (or evaluation of command-line arguments) to be able to make up just the right path name and get at those same files.