dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.19k stars 4.72k forks source link

File open should have non-throwing alternatives #27217

Open jaredpar opened 6 years ago

jaredpar commented 6 years ago

Opening a file today in. NET today requires developers to introduce exception handling into their code. That is often unwanted and adds unnecessary complexity to otherwise straight forward code. .NET should offer non-throwing alternatives to facilitate these cases.

Rationale and Usage

File operations in .NET today require developers to introduce exception handling for even the simplest of operations. This is due to a combination of the operations being inherently unpredictable in nature and the BCL only provides exceptions as a way to indicate failure.

This is problematic because it forces exception handling into every .NET app which use the file system. In many cases developers would prefer to use simple if checks as it fits into the flow of their code. This is particularly true of lower level code which tends to be written with local control flow vs. exceptions. Using .NET though there is no way to avoid exceptions when dealing with the file system.

Additionally this is particularly annoying when debugging code in Visual Studio. Even when code has proper exception handling VS will still break File.Open calls when first chance exceptions are enabled (quite common). Disabling notifications for these exceptions is often not an option because in addition to hiding benign case it could hide the real error that you're trying to debug.

Many developers attempt to work around this by using the following anti-pattern:

static bool TryOpen(string filePath, out FileStream fileStream) {
    if (!File.Exists(filePath)) {
        fileStream = null;
        return false;

    }
    fileStream = File.Open(filePath);
    return true;
}

This code reads great but is quite broken in at least the following ways:

  1. The developer mistakenly assumes that File.Exists is a side effect free operation. This is not always the case. Example is when filePath points to a named pipe. Calling Exists for a named pipe which is waiting for a connection will actually complete the connection under the hood (and then immediately break the pipe).
  2. The File.Open call can still throw exceptions hence even with the File.Exists predicate. The file could be deleted between the call, it could be open as part of an NTFS transactional operation, permissions could change, etc ... Hence correct code must still use exception handling here.

A more correct work around looks like the following. Even this sample I'm sure is subtle to cases that would have an observable difference from just calling File.Open):

static bool TryOpen(string filePath, out FileStream fileStream) {
    if (!filePath.StartsWith(@"\\.\pipe", StringComparison.OrdinalIgnoreCase) && !File.Exists(filePath)) {
        fileStream = null;
        return false;
    }

    try { 
        fileStream = File.Open(filePath);
        return true;
    }
    catch (Exception) { 
        fileStream = null;
        return false;
    }
}

To support these scenarios .NET should offer a set of file system helpers that use return values to indicate failure via the Try pattern: File.TryOpen.

Proposed API

public static class File {
    public static bool TryOpen(
        string path, 
        FileMode mode, 
        FileAccess access, 
        FileShare share, 
        out FileStream fileStream);

    public static bool TryOpen(
        string path, 
        FileMode mode, 
        out FileStream fileStream);

    public static bool TryOpen(
        string path, 
        FileMode mode, 
        FileAccess access,
        out FileStream fileStream);

Details

tannergooding commented 6 years ago

CC. @JeremyKuhne, @pjanotti

roy-t commented 6 years ago

I think this is a good idea. But what about combining this with using?

Since a stream is disposable, I think a new pattern for opening files should work well with using statements. Do you have any ideas for that? Something like the below won't compile. (Using variables are immutable so they cannot be passed by ref).

(Or am I overlooking something?)

using (Stream stream = null)
{
  if (File.TryOpen("foo.bar", out stream)) { }
}
benaadams commented 6 years ago

Do you have any ideas for that? Something like the below won't compile.

This should work

if (File.TryOpen("foo.bar", out Stream stream)) 
{ 
    using (stream)
    {
        // do something with Stream
    }
}
migueldeicaza commented 6 years ago

I love the proposal, but would:

Clockwork-Muse commented 6 years ago

...how does this interact with the fact that Read/Write (and all their variants) can still throw I/O errors? A handle having been valid at one point won't guarantee it stays that way, right?

benaadams commented 6 years ago

Consider a tuple return “(Status status, FileStream output)”

Are you feeling TryOpenAsync with that pattern?

(Status status, FileStream output) = await File.TryOpenAsync("foo.bar");

Cloudflare: How we scaled nginx and saved the world 54 years every day

So I modified NGINX to do most of open() inside the thread pool as well so it won't block the event loop. And the result (both non-blocking open and non-blocking read):

ltrzesniewski commented 6 years ago

Why not just return null on failure, to make it easy to combine with using?

using (var stream = File.TryOpen("foo.txt"))
{
    // ...
}

I know it doesn't fit well with the bool TryXxx(..., out var result) convention, but I think the ease of use with using outweighs this.

When nullable types make it into the language, turn that into a FileStream?.

jaredpar commented 6 years ago

Why not just return null on failure, to make it easy to combine with using?

Once you eliminate exceptions there is really no way to have it work as easily with using. There is always an if check that has to happen. Consider both cases a) return null and b) return bool).

// Option A: return null
using (var stream = File.TryOpen("bah.txt")) {
  // Still have to account for stream being null before I can call Write, Close, etc ... 
  if (stream == null) { 
    return;
  }
}

// Option B: return bool
if (!File.TryOpen("blah.txt", out var stream)) { 
  return;
}
using (stream) { 
  ...
}
ltrzesniewski commented 6 years ago

That's true, although I'm worried the using will be easier to forget if you return bool. Maybe that's just me though.

brucificus commented 6 years ago

I realize it's probably a hair more "functional" than C# historically leans, but what about church encoding?

Given something like:

interface IFileOpenResult : IDisposable {
    T Match(Func<Stream, T> onSuccess, Func<T> onAccessDenied, Func<T> onSomeOtherFailureReason …);
}
public static class File {
    public static IFileOpenResult TryOpen(string path, …) { … }
}

Consuming code could be written in the form of:

using(var fileOpenResult = File.TryOpen(…)) {
    var dataOrFailureReason = fileOpenResult.Match(
        (s) =>  /* read & return data from stream */ ,
        onAccessDenied: () => null,
        onSomeOtherFailureReason: () => throw new Exception("I really wasn't expecting this!"));
}

I've personally used this pattern often in line-of-business apps in order to avoid using either exceptions or switch statements for control flow.

Some might find, though, that this brings a burden for consumers to have a suitable common return type from the matcher parameters. Third party libraries can help such consumers, but standard library support for things like Some, OneOf, etc. could mitigate some of this.

Of course, then we get into maybe wanting better language (C#, VB) support for some functional programming idioms (monoid comprehenions, Void as a first-class citizen, "closed" type hierarchies for better pattern matching of discriminated unions, etc.) But I digress.

knocte commented 6 years ago

tune to return an error code, so in a single go, i can handle different error conditions.

If the developer wants to handle error conditions, she should use the normal non-Try API, right? This is the way error conditions are propagated in .NET: exceptions. Otherwise, I don't get it.

MarcoRossignoli commented 6 years ago

tune to return an error code, so in a single go, i can handle different error conditions.

@migueldeicaza i like the idea of Status(for clever transient strategies) but i would put as non mandatory overload, something like:

public static class File {
    public static bool TryOpen(
        string path, 
        FileMode mode, 
        FileAccess access, 
        FileShare share, 
        out FileStream fileStream,
        out Status status);
...

public enum Status
{
     SUCCESS,
     FILE_NOT_FOUND,
     ERROR_SHARING_VIOLATION, <- or generic name
     ...
}
kumpera commented 6 years ago

Most code wants to stop on first error but continue unobstructed, so I think we should borrow from languages like Rust with Option/Result.

abatishchev commented 6 years ago

What about TryReadAllLines and TryReadAllText which return string[] and string respectively and don't have complication with using? Maybe start with them and add/implement TryOpen later? Once design is more matured, or C# will have better support for options, etc.

benaadams commented 6 years ago

What about TryReadAllLines and TryReadAllText which...

Doesn't help when probing the binary file to open from a set of possible paths

Clockwork-Muse commented 6 years ago

Most code wants to stop on first error but continue unobstructed, so I think we should borrow from languages like Rust with Option/Result.

Functional languages usually have additional support for Option/Result types (especially because those are implemented as discriminated unions). They usually also disallow null, because otherwise things go strange.

migueldeicaza commented 6 years ago

@benaadams I am not sure I follow the proposal, but if the idea was that we should additionally have an async version, I support the idea, but perhaps the name should be TryOpenAsync

GSPP commented 6 years ago

It can be made to work with using by switching what goes in the out parameter:

IOStatus status;
using (var stream = File.TryOpen("foo.txt", out status))
{
    // ...
}
peteraritchie commented 6 years ago

I like the idea at face value with the given history of the Try pattern. But, thinking through some scenarios, I'm no longer sure.

I think part of the problem is the ambiguity of File.Open. This ambiguity leads you to the need for something like TryOpen; but also telescopes the ambiguity, in my mind. I'd first try to address the ambiguity because there's multiple reasons why you'd call File.Open and thus multiple things you'd want to do when unsuccessful. This leads to questions about error codes, etc. Off the top of my head, I'd consider deprecating Open and focus on OpenRead, OpenWrite, and OpenText. But, with Try variants of those, their usage becomes more awkward in light of IDisposable.

peteraritchie commented 6 years ago

I love the proposal, but would:

  • tune to return an error code, so in a single go, i can handle different error conditions.

Wouldn't that defeat the purpose of the Try pattern? I think Try only works when there's one response to error.

knocte commented 6 years ago

Wouldn't that defeat the purpose of the Try pattern? I think Try only works when there's one response to error.

Completely agree, which is kinda the same thing I said in https://github.com/dotnet/corefx/issues/31917#issuecomment-415675239

AFAIK all Try methods in the BCL return a single success bool.

peteraritchie commented 6 years ago

@knocte yeah, I thought that what the pattern was about :)

Neme12 commented 5 years ago

@jaredpar Do you think you'd consider introducing an error enum for this API instead of just returning bool so that it can be used for scenarios like the one outlined in dotnet/runtime#926?

Scenario: I want to create and write some default data into a file unless it already exists.

where it's needed to treat very specific failures specially and completely ignore them because they are expected (as opposed to other errors).

danmoseley commented 5 years ago

If the developer wants to handle error conditions, she should use the normal non-Try API, right? This is the way error conditions are propagated in .NET: exceptions. Otherwise, I don't get it.

Per discussion in dotnet/runtime#926 this is not possible to do in a reasonable cross platform way, for many common cases.

jkotas commented 5 years ago

Per discussion in dotnet/runtime#926 this is not possible to do in a reasonable cross platform way

Right, it is not possible today. It should be possible once we introduce cross-platform IOError enum like what is suggested in dotnet/runtime#926.

The basic non-throwing API can be something like:

public static IOError TryOpen(
        string path, 
        FileMode mode, 
        out FileStream fileStream);
knocte commented 5 years ago

public static IOError TryOpen(

Would IOError need to be nullable for the success case? Or will you add a IOError.None member?

jkotas commented 5 years ago

SocketError.Success and WebSocketError.Success is existing prior art for having success code in error code enums.

MarcoRossignoli commented 5 years ago

@Neme12 @jkotas IMO I like bool TryXXX pattern we could also have result as out parameter like https://github.com/dotnet/corefx/issues/31917#issuecomment-415683689

jaredpar commented 5 years ago

@Neme12 I've thought about an enum approach previously but eventually settled on the bool version. The reason being that when looking at code I wrote I almost never cared about why a file operation failed, just that it failed or succeeded. Hence the bool approach seems more pragmatic (especially when you factor in the cross plat issues that others have brought up).

Neme12 commented 5 years ago

The cross platform issue isn't that we can't have cross platform error codes. It's that we don't have them. 😞 That's why a new enum would be useful.

@jaredpar What if the API returned bool but there was an overload that had an error code as an out parameter?

danmoseley commented 5 years ago

@jaredpar What if the API returned bool but there was an overload that had an error code as an out parameter?

I think this is the best solution. In many cases you don't care about the error code just success/failure and in that case you would use the overload with the single out parameter that is consistent with all other Try API.

jaredpar commented 5 years ago

What if the API returned bool but there was an overload that had an error code as an out parameter?

That's good by me. I think the 99% case is succeeded / failed but agree that 1% where the care about how it failed is important.

migueldeicaza commented 5 years ago

One possible consideration is to wait until C# gets discriminated unions, and then change our APIs to support the Result<T,E> idiom as seen in languages like Rust, F# and Swift:

Clockwork-Muse commented 5 years ago

@migueldeicaza

While I like discriminated unions, the problem I see with Result<T, E> is that we already have the equivalent error feature.

You know, exceptions.

Essentially we'd be propagating a second, "softer" form of exceptions, ones declared as part of the API. Given we already have exceptions, I think I'd prefer going the Midori route: methods either explicitly throw a specific set of exceptions that can be recovered from, or none. Things that can't be recovered from - which is usually from programmer error - cause abandonment (essentially, exit the running task/process/application).

The language is already starting to move in the direction where such a thing might be viable; we're getting non/nullable reference types, meaning NullReferenceException (which is - usually - caused by programmer error, should be fixed, and would cause abandonment under this model) should go away. You can write code to take Span<T>s or Memory<T>s for slices, which help avoid index-out-of-bounds... as would (hopefully) the upcoming Range feature.

MichalPetryka commented 2 years ago

So, what's the progress on this after 3 years?

huoyaoyuan commented 10 months ago

There should also be TryOpenHandle as lower-level abstraction. It's closer to the raw api call provided by OS.