dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.95k stars 4.65k forks source link

Issues with manipulating filesystem entries that are symlinks #60851

Open hamarb123 opened 2 years ago

hamarb123 commented 2 years ago

This is more of an issue now that symlinks are/are becoming officially supported by .NET.

Discussed somewhat here.

On Windows, you can know if a particular path is a file or directory and cache the value (assuming it isn't deleted). Because of this, you can use either the FileInfo class, DirectoryInfo class, File class, or Directory class - you can even just use a FileSystemInfo for any common APIs (eg. Exists, Delete, Move, etc.). It would be nice for completeness' sake if there was a static equivalent of FileSystemInfo (like the File class), but it is not much of an issue on Windows since you can just one or the other.

On Unix however, since we're now starting to support symlinks properly, there is an issue with the way that directory symlinks are treated which is caused by Unix having no distinction between file and directory symlinks.

Consider the following code:

var path = "/path/to/directory";
Directory.CreateDirectory(path);
var fsi = Directory.CreateSymbolicLink(path + ".link", path);
Directory.Delete(path);
Console.WriteLine(fsi.Exists);

On Windows, this code would return true, but on Unix this will return false because it is now considered a file symlink (you'd have to run fsi = new FileInfo(fsi.FullName) to get it to say true).

On Unix, you might cache a FileSystemInfo and then try to use it later, only to discover that it "doesn't exist", which could be bad for logic. You might be trying to delete a FileSystemInfo and discover that it "doesn't exist" which isn't good if it does exist. I don't think there's much that can be done here since it would be a breaking change, but perhaps it could allow common apis on both FileInfos and DirectoryInfos on Unix if it is a symlink (but this could have other issues).

On the other hand (ie. static apis), if you want to delete a path that is a symlink that you know exists, then you could try using File.Delete(path), but that wouldn't work if a directory got created as the target while you weren't looking, so you could try:

if (File.Exists(path)) File.Delete(path);
else if (Directory.Exists(path)) Directory.Delete(path);

which would work most of the time, but can cause 4x api calls than necessary and can cause a race condition if another process deletes/creates an unrelated file/directory while your code is running.

I think that there should be both static and instance apis to address this, but personally, the static apis are a bit more important to me, and there are possibly some issues with the instance apis since there is likely code that checks if a FileSystemInfo is of type FileInfo and DirectoryInfo to determine what it's meant to represent which possibly should be different for reparse points after their target is deleted/created.

@Jozkee suggested that the File class could be enabled to for this functionality for more than just what .NET currently considers a file, which would work for most APIs, except Exists since changing what it returns to be true for any sort of directory is a massive breaking change, perhaps there could be a new overload for this method and other methods with a similar issue with an extra parameter specifying when it should return true. It would also be quite useful if there was an exists-like method that could return No, File, Directory, and possibly ReparsePoint (perhaps as an overload with out bool) as well rather than just yes or no.

My personal use-case is a folder backing up program. In this program there are possible significant changes between enumerating the FileSystemInfos and using them which is fine for normal files and directories since at most they will cease to exist, but for the symlinks, they may become another type at any point. I currently use a helper class for the reparse points which is unavoidable and provides some benefit anyway (since they need different behaviour for some APIs), but the issue is when it comes to doing something with the reparse point, since I have to (whenever I suspect changes ie. I call APIs on any other files/directory in-between) first check whether it is a file or a directory before calling APIs. Note: these API calls can be delayed at any point due to requiring user input but not wanting to pause the program, so another check is required if it is delayed. Consider the following code:

var entries = ...; //enumerate the entries of a folder, modified by other functions (eg. sorting)
foreach (var entry in entries)
{
//do something depending on source directory eg. Delete, Move etc.
}

During this code, you would have to check if the entry is a reparse point and then check if it is still a file/folder before you do anything since you could have done something to its target, and then if so, make a new instance before you pass it to your other functions, meanwhile you still have a race condition and ugly code that should be unnecessary.

If you wanted to delete it for example, then it could be solved with an API that just deletes it without worrying if you're claiming it's a file (on Unix at least). And many of the other APIs (specifically common APIs) would also easily be able to do the same on Unix (could be easily emulated on Windows I think).

So essentially what I'm asking for is some kind of FileSystemEntry.Delete(path) (and other common APIs), whether it's a new set of APIs or just allowing the File class to do this.

Thanks for considering this and reading to here :)

ghost commented 2 years ago

Tagging subscribers to this area: @dotnet/area-system-io See info in area-owners.md if you want to be subscribed.

Issue Details
This is more of an issue now that symlinks are/are becoming officially supported by .NET. Discussed somewhat [here](https://github.com/dotnet/runtime/pull/52639#discussion_r735955640). On Windows, you can know if a particular path is a file or directory and cache the value (assuming it isn't deleted). Because of this, you can use either the `FileInfo` class, `DirectoryInfo` class, `File` class, or `Directory` class - you can even just use a `FileSystemInfo` for any common APIs (eg. `Exists`, `Delete`, `Move`, etc.). It would be nice for completeness' sake if there was a static equivalent of `FileSystemInfo` (like the `File` class), but it is not much of an issue on Windows since you can just one or the other. On Unix however, since we're now starting to support symlinks properly, there is an issue with the way that directory symlinks are treated which is caused by Unix having no distinction between file and directory symlinks. Consider the following code: ```cs var path = "/path/to/directory"; Directory.CreateDirectory(path); var fsi = Directory.CreateSymbolicLink(path + ".link", path); Directory.Delete(path); Console.WriteLine(fsi.Exists); ``` On Windows, this code would return `true`, but on Unix this will return `false` because it is now considered a file symlink (you'd have to run `fsi = new FileInfo(fsi.FullName)` to get it to say true). On Unix, you might cache a `FileSystemInfo` and then try to use it later, only to discover that it "doesn't exist", which could be bad for logic. You might be trying to delete a `FileSystemInfo` and discover that it "doesn't exist" which isn't good if it does exist. I don't think there's much that can be done here since it would be a breaking change, but perhaps it could allow common apis on both `FileInfo`s and `DirectoryInfo`s on Unix if it is a symlink (but this could have other issues). On the other hand (ie. static apis), if you want to delete a path that is a symlink that you know exists, then you could try using `File.Delete(path)`, but that wouldn't work if a directory got created as the target while you weren't looking, so you could try: ```cs if (File.Exists(path)) File.Delete(path); else if (Directory.Exists(path)) Directory.Delete(path); ``` which would work most of the time, but can cause 4x api calls than necessary and can cause a race condition if another process deletes/creates an unrelated file/directory while your code is running. I think that there should be both static and instance apis to address this, but personally, the static apis are a bit more important to me, and there are possibly some issues with the instance apis since there is likely code that checks if a `FileSystemInfo` is of type `FileInfo` and `DirectoryInfo` to determine what it's meant to represent which possibly should be different for reparse points after their target is deleted/created. @Jozkee suggested that the `File` class could be enabled to for this functionality for more than just what .NET currently considers a file, which would work for most APIs, except `Exists` since changing what it returns to be `true` for any sort of directory is a massive breaking change, perhaps there could be a new overload for this method and other methods with a similar issue with an extra parameter specifying when it should return `true`. It would also be quite useful if there was an exists-like method that could return No, File, Directory, and possibly ReparsePoint (perhaps as an overload with `out bool`) as well rather than just yes or no. My personal use-case is a folder backing up program. In this program there are possible significant changes between enumerating the FileSystemInfos and using them which is fine for normal files and directories since at most they will cease to exist, but for the symlinks, they may become another type at any point. I currently use a helper class for the reparse points which is unavoidable and provides some benefit anyway (since they need different behaviour for some APIs), but the issue is when it comes to doing something with the reparse point, since I have to (whenever I suspect changes ie. I call APIs on any other files/directory in-between) first check whether it is a file or a directory before calling APIs. Note: these API calls can be delayed at any point due to requiring user input but not wanting to pause the program, so another check is required if it is delayed. Consider the following code: ```cs var entries = ...; //enumerate the entries of a folder, modified by other functions (eg. sorting) foreach (var entry in entries) { //do something depending on source directory eg. Delete, Move etc. } ``` During this code, you would have to check if the entry is a reparse point and then check if it is still a file/folder before you do anything since you could have done something to its target, and then if so, make a new instance before you pass it to your other functions, meanwhile you still have a race condition and ugly code that should be unnecessary. If you wanted to delete it for example, then it could be solved with an API that just deletes it without worrying if you're claiming it's a file (on Unix at least). And many of the other APIs (specifically common APIs) would also easily be able to do the same on Unix (could be easily emulated on Windows I think). So essentially what I'm asking for is some kind of `FileSystemEntry.Delete(path)` (and other common APIs), whether it's a new set of APIs or just allowing the `File` class to do this. Thanks for considering this and reading to here :)
Author: hamarb123
Assignees: -
Labels: `area-System.IO`, `untriaged`
Milestone: -
Symbai commented 2 years ago

53577

hamarb123 commented 2 years ago

53577

Thanks. That issue is different to this one, this one is mainly about having a unified API for common methods for files and directories, with symlinks being a specific example as to why it is needed (IMO).

iSazonov commented 2 years ago

This comes us back to question what is File/Directory design intention either processing of object or its target? If the design intention is to process target (that is intuitively) let's design new API for object - symlink itself.

hamarb123 commented 2 years ago

I'm not sure I follow. I'm specifically referring to how you can't be guaranteed to delete (or check if it exists, move it etc.) a symlink that you know exists in the amount of operations it should take since you have to check if it's still a file or now a directory symlink on Unix. It could also be useful in other cases since it probably takes an operation to check if it's a file or directory in File.Delete and Directory.Delete etc. on Unix that can be saved. But I certainly don't disagree with the idea of a symlink type in .NET.

carlossanlop commented 2 years ago

Thanks for opening an issue, @hamarb123. It seems there are multiple things reported in the main description, so allow me to summarize your concerns, and let me know if they were accurate:

A) There is a mismatch in the behavior of Exists between Windows and Unix when the target of a dir symlink does not exist. B) To delete a file or directory, you have to check if they exist, then delete them if they do. C) To check if a symlink exists, we should just need to use File.Exists or FileInfo.Exists, independently of the target.

Below is some code I tested in Windows and Ubuntu.

Windows

FileInfo ```cs var file = new FileInfo("file.txt"); var link = new FileInfo("symlink"); file.Create().Dispose(); link.CreateAsSymbolicLink("file.txt"); file.Exists // True link.Exists // True file.Delete(); file.Exists // False link.Exists // True file.Delete(); // Succeeds ```
File ```cs File.Create("file.txt").Dispose(); File.CreateSymbolicLink("link", "file.txt"); File.Exists("file.txt") // True File.Exists("link") // True File.Delete("file.txt"); File.Exists("file.txt") // False File.Exists("link") // True File.Delete("file.txt"); // Succeeds ```
DirectoryInfo ```cs var dir = new DirectoryInfo("dir"); var link = new DirectoryInfo("link"); dir.Create(); link.CreateAsSymbolicLink("dir"); dir.Exists // True link.Exists // True dir.Delete(); dir.Exists // False link.Exists // True dir.Delete(); // Throws DirectoryNotFoundException ```
Directory ```cs Directory.CreateDirectory("dir"); Directory.CreateSymbolicLink("link", "dir"); Directory.Exists("dir") // True Directory.Exists("link") // True Directory.Delete("dir"); Directory.Exists("dir") // False Directory.Exists("link") // TRUE (expected behavior) Directory.Delete("dir"); // Throws DirectoryNotFoundException ```

Unix

FileInfo ```cs var file = new FileInfo("file.txt"); var link = new FileInfo("symlink"); file.Create().Dispose(); link.CreateAsSymbolicLink("file.txt"); file.Exists // True link.Exists // True file.Delete(); file.Exists // False link.Exists // True file.Delete(); // Succeeds ```
File ```cs File.Create("file.txt").Dispose(); File.CreateSymbolicLink("link", "file.txt"); File.Exists("file.txt") // True File.Exists("link") // True File.Delete("file.txt"); File.Exists("file.txt") // False File.Exists("link") // True File.Delete("file.txt"); // Succeeds ```
DirectoryInfo ```cs var dir = new DirectoryInfo("dir"); var link = new DirectoryInfo("link"); dir.Create(); link.CreateAsSymbolicLink("dir"); dir.Exists // True link.Exists // True dir.Delete(); dir.Exists // False link.Exists // True dir.Delete(); // Throws DirectoryNotFoundException ```
Directory (problem) ```cs Directory.CreateDirectory("dir"); Directory.CreateSymbolicLink("link", "dir"); Directory.Exists("dir") // True Directory.Exists("link") // True Directory.Delete("dir"); Directory.Exists("dir") // False Directory.Exists("link") // FALSE (Mismatch with Windows, unexpected behavior) Directory.Delete("dir"); // Throws DirectoryNotFoundException ```
hamarb123 commented 2 years ago

@carlossanlop, in those tests, what happens with running the last delete on the link. This is what my issue is (you've got it showing deleting the source again, not sure if this is a mistake or not).

eg. end of Directory:

Directory.Exists("dir");
Directory.Exists("link"); //mismatch

Directory.Delete("link"); // This is what I'm interested in
carlossanlop commented 2 years ago

you've got it showing deleting the source again, not sure if this is a mistake or not

I did it on purpose to show that deleting a directory when it doesn't exist, throws exception, but deleting a file when it doesn't exist, does not throw any exception (point B).

Here's an updated set of tests, which include deleting the link after the target was deleted.

Windows

FileInfo ```cs var file = new FileInfo("file.txt"); var link = new FileInfo("symlink"); file.Create().Dispose(); link.CreateAsSymbolicLink("file.txt"); file.Exists // True (Expected) link.Exists // True (Expected) file.Delete(); // Succeeds (Expected) file.Exists // False (Expected) link.Exists // True (Expected) link.Delete(); // Succeeds (Expected) file.Exists // False (Expected) link.Exists // False (Expected) file.Delete(); // File did not exist, but doesn't throw (Expected) ```
File ```cs File.Create("file.txt").Dispose(); File.CreateSymbolicLink("link", "file.txt"); File.Exists("file.txt") // True (Expected) File.Exists("link") // True (Expected) File.Delete("file.txt"); // Succeeds (Expected) File.Exists("file.txt") // False (Expected) File.Exists("link") // True (Expected) File.Delete("link"); // Succeeds (Expected) File.Exists("file.txt") // False (Expected) File.Exists("link") // False (Expected) File.Delete("file.txt"); // File did not exist, but doesn't throw (Expected) ```
DirectoryInfo ```cs var dir = new DirectoryInfo("dir"); var link = new DirectoryInfo("link"); dir.Create(); link.CreateAsSymbolicLink("dir"); dir.Exists // True (Expected) link.Exists // True (Expected) dir.Delete(); // Succeeds (Expected) dir.Exists // False (Expected) link.Exists // True (Expected) link.Delete(); // Succeeds (Expected) dir.Exists // False (Expected) link.Exists // False (Expected) dir.Delete(); // Dir did not exist, throws DirectoryNotFoundException (Expected) ```
Directory ```cs Directory.CreateDirectory("dir"); Directory.CreateSymbolicLink("link", "dir"); Directory.Exists("dir") // True (Expected) Directory.Exists("link") // True (Expected) Directory.Delete("dir"); // Succeeds (Expected) Directory.Exists("dir") // False (Expected) Directory.Exists("link") // True (Expected) Directory.Delete("link"); // Succeeds (Expected) Directory.Exists("dir") // False (Expected) Directory.Exists("link") // False (Expected) Directory.Delete("dir"); // Dir did not exist, throws DirectoryNotFoundException (Expected) ```

Unix

FileInfo ```cs var file = new FileInfo("file.txt"); var link = new FileInfo("symlink"); file.Create().Dispose(); link.CreateAsSymbolicLink("file.txt"); file.Exists // True (Expected) link.Exists // True (Expected) file.Delete(); // Succeeds (Expected) file.Exists // False (Expected) link.Exists // True (Expected) link.Delete(); // Succeeds (Expected) file.Exists // False (Expected) link.Exists // False (Expected) file.Delete(); // File did not exist, but doesn't throw (Expected) ```
File ```cs File.Create("file.txt").Dispose(); File.CreateSymbolicLink("link", "file.txt"); File.Exists("file.txt") // True (Expected) File.Exists("link") // True (Expected) File.Delete("file.txt"); // Succeeds (Expected) File.Exists("file.txt") // False (Expected) File.Exists("link") // True (Expected) File.Delete("link"); // Succeeds (Expected) File.Exists("file.txt") // False (Expected) File.Exists("link") // False (Expected) File.Delete("file.txt"); // File did not exist, but doesn't throw (Expected) ```
DirectoryInfo ```cs var dir = new DirectoryInfo("dir"); var link = new DirectoryInfo("link"); dir.Create(); link.CreateAsSymbolicLink("dir"); dir.Exists // True (Expected) link.Exists // True (Expected) dir.Delete(); // Succeeds (Expected) dir.Exists // False (Expected) link.Exists // True (Expected) link.Delete(); // PROBLEM: link exists, but throws DirectoryNotFoundException dir.Exists // False (Expected) link.Exists // True (Delete above did not delete the link) dir.Delete(); // Dir did not exist, throws DirectoryNotFoundException (Expected) ```
Directory ```cs Directory.CreateDirectory("dir"); Directory.CreateSymbolicLink("link", "dir"); Directory.Exists("dir") // True (Expected) Directory.Exists("link") // True (Expected) Directory.Delete("dir"); // Succeeds (Expected) Directory.Exists("dir") // False (Expected) Directory.Exists("link") // PROBLEM: False (Mismatch with Windows, unexpected behavior) Directory.Delete("link"); // PROBLEM: link exists, but throws DirectoryNotFoundException Directory.Exists("dir") // False (Expected) Directory.Exists("link") // True (Delete above did not delete the link) Directory.Delete("dir"); // Dir did not exist, throws DirectoryNotFoundException (Expected) ```
hamarb123 commented 2 years ago

A) There is a mismatch in the behavior of Exists between Windows and Unix when the target of a dir symlink does not exist.

Yes, I'd like an API that tells me if a path exists - this could just be a new API like File.Exists(string path, bool includeDirectories) (and for FileInfo too, perhaps it could also have a variant like File.Exists(string path, bool includeDirectories, out bool wasFile and/or some indication it was a link). It might be better for this possible new API to have a struct defining what to include, ie. yes/no for normal files, yes/no for normal directories, yes/no for reparse points (I'm counting reparse points as not being normal files or folders in terms of the inclusion here) and use that as the argument instead of includeDirectories, and then instead of wasFile, it could return whether it was a each of a file, directory and a reparse point with another flags enum. It may be useful to add this variant to some sort of enumerate function as well.

B) To delete a file or directory, you have to check if they exist, then delete them if they do.

I thought I had to do this, but thanks to your code examples, it seems like I can just use the correct one on Windows and always use FileInfo for symlinks on Unix as long as I don't need to check Exists (this will result in a nice performance boost for symlinks since I don't have to check it constantly 😁).

C) To check if a symlink exists, we should just need to use File.Exists or FileInfo.Exists, independently of the target.

Yes, the directory ones should also always return whether the link exists or not too IMO since the symlinks can be returned as a DirectoryInfo, or be listed in EnumerateDirectories (I think). ie. what I'm suggesting is possibly that File.Exists, FileInfo.Exists, Directory.Exists and DirectoryInfo.Exists could just return true if it's a symlink that exists on Unix - but this could be avoided with the API suggestion above (assuming the API is also on DirectoryInfo).

hamarb123 commented 2 years ago

In my project, I'm now using FileInfo for all my symlinks and it works well. The only issue is obviously the Exists check, luckily in most of my code existence is assumed and exceptions are all handled, so there were only a few cases where I had to update that logic. I do still think that Exists should be changed/added to though.