dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Request: a way to tell if two filepaths point to the same file #17873

Open ljw1004 opened 8 years ago

ljw1004 commented 8 years ago

On unix, the recommended way to tell whether two filepaths point to the same file is to use stat and compare st_dev, st_ino.

http://www.boost.org/doc/libs/1_53_0/libs/filesystem/doc/reference.html#equivalent

On Windows, the recommended way is to use the win32 API GetFileInformationByHandle and compare nFileIndexLow, nFileIndexHigh, dwVolumeSerialNumber. You have to do this via p/invoke because there's no .NET wrapper for it.

P/invoke was tolerable to me because it was all self-contained and easy. But as I port my code over to corefx, I honestly can't be bothered to go the whole nuget platform-specific native binary route. It's far too much work.

I'd love it if you could add an API to corefx to judge whether two filenames point to the same file. Judging by the huge number of requests for this on stackoverflow, it seems like a common bread-and-butter scenario.
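For reference, the Windows approach described in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: `FileIdentity` and `IsSameFile` are hypothetical names, not an existing or proposed .NET API, and the code handles regular files only (a `FileStream` cannot open a directory). Both files are kept open while their identifiers are compared, since a file index is only meaningful while the file exists.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;
using FILETIME = System.Runtime.InteropServices.ComTypes.FILETIME;

// Hypothetical helper (not part of .NET): Windows-only file identity check.
static class FileIdentity
{
    // Mirrors the Win32 BY_HANDLE_FILE_INFORMATION structure, field for field.
    [StructLayout(LayoutKind.Sequential)]
    private struct BY_HANDLE_FILE_INFORMATION
    {
        public uint dwFileAttributes;
        public FILETIME ftCreationTime;
        public FILETIME ftLastAccessTime;
        public FILETIME ftLastWriteTime;
        public uint dwVolumeSerialNumber;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint nNumberOfLinks;
        public uint nFileIndexHigh;
        public uint nFileIndexLow;
    }

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool GetFileInformationByHandle(
        SafeFileHandle hFile, out BY_HANDLE_FILE_INFORMATION lpFileInformation);

    private static BY_HANDLE_FILE_INFORMATION Query(SafeFileHandle handle)
    {
        if (!GetFileInformationByHandle(handle, out var info))
            throw new Win32Exception(Marshal.GetLastWin32Error());
        return info;
    }

    // True when both paths refer to the same file on the same volume.
    public static bool IsSameFile(string path1, string path2)
    {
        const FileShare share = FileShare.ReadWrite | FileShare.Delete;
        using var a = new FileStream(path1, FileMode.Open, FileAccess.Read, share);
        using var b = new FileStream(path2, FileMode.Open, FileAccess.Read, share);

        var ia = Query(a.SafeFileHandle);
        var ib = Query(b.SafeFileHandle);

        return ia.dwVolumeSerialNumber == ib.dwVolumeSerialNumber
            && ia.nFileIndexHigh == ib.nFileIndexHigh
            && ia.nFileIndexLow == ib.nFileIndexLow;
    }
}
```

A caller would write something like `FileIdentity.IsSameFile(pathA, pathB)`. A Unix equivalent would compare `st_dev` and `st_ino` from `stat` instead, which this sketch does not attempt because the native `struct stat` layout differs between platforms.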

karelz commented 8 years ago

Related to dotnet/runtime#14321

We need an API proposal.

JeremyKuhne commented 8 years ago

Perhaps

bool System.IO.Path.HaveSameTarget(string path1, string path2)

Behavior:

We may also want to consider exposing this for existing SafeFileHandles. Perhaps bool SafeFileHandle.HasSameTarget(SafeFileHandle fileHandle).

danmoseley commented 7 years ago

@JeremyKuhne, @ianhays do you consider this ready for API review?

am11 commented 7 years ago

Please consider both APIs: checking the equality based on real-path and getting the real-path for given path:

public static class Directory
{
    // resolves symlink etc.
    public static string GetRealPath(string path);

    // resolves symlink and performs equality check
    public static bool HaveSameTarget(string path1, string path2);
}

public static class File
{
    // resolves symlink etc.
    public static string GetRealPath(string path);

    // resolves symlink and performs equality check
    public static bool HaveSameTarget(string path1, string path2);
}

ljw1004 commented 7 years ago

What exactly is the definition of "Real Path"? You wrote "resolves symlinks etc" but there's a lot buried in that...

I think the concept of a "real path" is far too woolly. It certainly can't be used as the implementation technique for HaveSameTarget. The best you could do is have a function called "ResolveSymlinks" (but I think that API is questionable -- the reason users use symlinks is because they want applications to see a particular directory structure, and they don't want applications second-guessing them).

ianhays commented 7 years ago

Before we mark this as ready for review we should update the issue with a more concrete API proposal including a finalized API, some code examples, and justification for real-world usage. I think Jeremy's post is on the right track for desired behavior, we just need something more formal before we should move forward.

am11 commented 7 years ago

"there's a lot buried in that"

Agree on that. I am not an expert but I think the API can define reasonable constraints. I thought HaveSameTarget can make use of GetRealPath under the hood.

"I think that API is questionable"

On the contrary, I think there really is a need for resolving the real path in .NET (especially Core), as almost all other language frameworks provide it.

PHP: http://php.net/manual/en/function.realpath.php
Perl: http://perldoc.perl.org/Cwd.html
Python: https://docs.python.org/2/library/os.path.html#os.path.realpath
Ruby: http://apidock.com/ruby/Pathname/realpath
node.js: https://nodejs.org/api/fs.html#fs_fs_realpath_path_options_callback

Symlinks are used more commonly on Unix systems and hence more pain/complaints from that side of the house: dotnet/coreclr#2128.

On .NET, even the Windows-only P/Invoke based solutions such as this one are quite complicated to implement IMO.

ljw1004 commented 7 years ago

"I thought HaveSameTarget can make use of GetRealPath under the hood." -- no it can't! I gave examples where the correct solution (inode &c.) will correctly claim that two files are the same, but GetRealPath (as defined in PHP and Python and nodejs) will claim they're different.
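Hard links are one case where this difference shows up, and it can be reproduced with the sketch below. The assumptions: `HardLinkDemo` and `Run` are hypothetical names, `File.ResolveLinkTarget` requires .NET 6 or later, and the hard link is created through P/Invoke (`CreateHardLink` on Windows, `link` from libc elsewhere) because .NET has no managed API for it. The sketch does not call `stat`. It shows that the two names share one file by writing through one and reading through the other.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical demo class: two hard-linked names for one file.
static class HardLinkDemo
{
    // Win32: CreateHardLinkW(newFileName, existingFileName, reserved).
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool CreateHardLink(
        string newFileName, string existingFileName, IntPtr securityAttributes);

    // POSIX: int link(const char *oldpath, const char *newpath).
    [DllImport("libc", SetLastError = true)]
    private static extern int link(string oldPath, string newPath);

    // Returns (pathsLookEqual, sharesContent) for a file and its hard link.
    public static (bool PathsLookEqual, bool SharesContent) Run()
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        try
        {
            string original = Path.Combine(dir, "a.txt");
            string alias = Path.Combine(dir, "b.txt");
            File.WriteAllText(original, "v1");

            bool linked = RuntimeInformation.IsOSPlatform(OSPlatform.Windows)
                ? CreateHardLink(alias, original, IntPtr.Zero)
                : link(original, alias) == 0;
            if (!linked)
                throw new IOException("Could not create the hard link.");

            // Path-based view: neither name is a symlink, so link resolution
            // has nothing to follow and the two full paths stay different.
            bool aliasIsLink = File.ResolveLinkTarget(alias, returnFinalTarget: true) != null;
            bool pathsLookEqual = !aliasIsLink && string.Equals(
                Path.GetFullPath(original), Path.GetFullPath(alias), StringComparison.Ordinal);

            // Identity view: a write through one name is visible through the other.
            File.WriteAllText(alias, "v2");
            bool sharesContent = File.ReadAllText(original) == "v2";

            return (pathsLookEqual, sharesContent);
        }
        finally
        {
            Directory.Delete(dir, recursive: true);
        }
    }
}
```

With a path-resolution approach the two names compare as different, while a comparison based on device and inode (or volume serial number and file index) would report them as the same file.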

JeremyKuhne commented 7 years ago

I'll chime in more later when I have more time. I'll point out dotnet/runtime#14321 again as it discusses overlapping issues.

One initial thought is that we have to think about multiple drivers (such as mounted network shares) in how we define anything. I don't know that it is possible to say that two files definitively aren't the same if they don't share the same actual device (e.g. \Device\Harddisk0\ on Windows).

JeremyKuhne commented 4 years ago

This issue has come up in other forums such as Twitter/StackOverflow

hamarb123 commented 2 years ago

I would like this API too.