dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Proposed API for symbolic links #24271

Closed carlreinke closed 3 years ago

carlreinke commented 6 years ago

Edit by @carlossanlop: Revisited API Proposal



Original proposal:

Rationale

The ability to interact with symbolic links (symlinks) in .NET is currently limited to determining that a file has the ReparsePoint attribute. This proposed API provides the ability to identify, read, and create symbolic links.

Proposed API

public class Directory
{
    public static DirectoryInfo CreateSymbolicLink(string linkPath, string targetPath);
    public static string GetSymbolicLinkTargetPath(string linkPath);
    public static bool IsSymbolicLink(string path);
}

public class File
{
    public static FileInfo CreateSymbolicLink(string linkPath, string targetPath);
    public static string GetSymbolicLinkTargetPath(string linkPath);
    public static bool IsSymbolicLink(string path);
}

public class FileSystemInfo
{
    public bool IsSymbolicLink { get; }
    public string SymbolicLinkTargetPath { get; }

    public void CreateSymbolicLink(string linkPath);
}

Details

The path returned from GetSymbolicLinkTargetPath(string)/SymbolicLinkTargetPath will be returned exactly as it is stored in the symbolic link. It may reference a non-existent file or directory.

For the purposes of this API, NTFS Junction Points are considered to be like Linux bind mounts and are not considered to be symbolic links.

Updates

iSazonov commented 3 years ago

I did make attempts to design an API that fit into the existing FileInfo/DirectoryInfo stuff.

Great! @jhudsoncedaron Thanks for your efforts! I am happy to see that this is possible. If you have managed to adapt most of the API, then this I suppose is a good confirmation that this is the right way to design the API. For me it is still an open question whether to look for a common abstraction for Windows reparse points. I think it's worth a try.

As far as performance for enumeration is concerned, these are implementation details. I believe that performance loss can be avoided for mainstream scenarios. It is not a question for the API design time. It is also worth enhancing the enumeration options on how to handle symlinks (follow or no).

jhudsoncedaron commented 3 years ago

@iSazonov : "As far as performance for enumeration is concerned, these are implementation details." Only if you're willing to pay one disk access per file returned from the API as the immediate caller asks "is it a file, directory, or symbolic link" of every string returned from GetFileSystemEntries. That was the cost that was too high to pay. Note that right now (in the non-symbolic link case), a caller calling Directory.GetFiles and Directory.GetDirectories doesn't have to pay it, but a caller calling GetFileSystemEntries does, making it actually faster to union the two together than to call GetFileSystemEntries in most cases. The API surface area mandates a non-performant implementation.

iSazonov commented 3 years ago

"As far as performance for enumeration is concerned, these are implementation details." Only if you're willing to pay one disk access per file returned from the API as the immediate caller asks "is it a file, directory, or symbolic link" of every string returned from GetFileSystemEntries.

@jhudsoncedaron That is how the enumeration API already works. All the methods use the same InternalEnumeratePaths() https://source.dot.net/#System.IO.FileSystem/System/IO/Directory.cs,169 and InternalEnumerateInfos() https://source.dot.net/#System.IO.FileSystem/System/IO/DirectoryInfo.cs,174 methods. You could look at how PowerShell 7 works with symlinks. (There is an additional P/Invoke call to get a target, but for enumeration PowerShell follows the .NET API.)

jhudsoncedaron commented 3 years ago

@iSazonov : lol I must be the only person who noticed the perf drain, but I must have reached for the native hammer too quickly and ended up with a bad idea of how .GetFiles() works.

heng-liu commented 3 years ago

Hi @carlossanlop, thanks for prioritizing this task! May I know if there is any plan/estimation about this work? (e.g. likely to release in which preview version?) NuGet has an issue and the fix needs to handle the symbolic links: https://github.com/dotnet/runtime/issues/24271#issuecomment-786940658 It would be great if this API is implemented so that we don't have to use P/Invoke. Thanks!

carlossanlop commented 3 years ago

Summary

The ability to interact with symbolic links in .NET is currently limited to determining that a file has the ReparsePoint attribute, but we do not yet offer APIs for creating symbolic links, or for accessing the linked file or directory.

Proposed APIs

public abstract class FileSystemInfo
{
    public void CreateAsSymbolicLink(ReadOnlySpan<char> pathToTarget);
    // In case of chained links, final target should be returned when `true`. It should throw if it detects cycles.
    public FileSystemInfo? GetTargetInfo(bool returnFinalTarget); 
}

Alternative design

public class FileInfo
{
    public void CreateAsSymbolicLink(ReadOnlySpan<char> pathToTarget);
    public FileInfo? GetTargetInfo(bool returnFinalTarget); 
}

public class DirectoryInfo
{
    public void CreateAsSymbolicLink(ReadOnlySpan<char> pathToTarget);
    public DirectoryInfo? GetTargetInfo(bool returnFinalTarget); 
}

Future expansions to keep in mind

Not needed right now. Although they are outside the scope of this API proposal, we wanted to make sure we could easily expand `FileSystemInfo` to support the creation of junctions and hard links:

```cs
public class DirectoryInfo
{
    // Can reuse `GetTargetInfo` to retrieve the junction target
    public void CreateJunction(ReadOnlySpan<char> pathToTarget);
}

public class FileInfo
{
    public void CreateHardLink(ReadOnlySpan<char> pathToTarget);
}
```

Usage cases

```cs
/////////////////////////
//// Symlink to a file

// link2a \
//         ---> link1 -> file.txt
// link2b /

var file = new FileInfo("/path/file.txt");
file.Create().Dispose();

var link1 = new FileInfo("/path/link1");
link1.CreateAsSymbolicLink(file.FullName);
FileSystemInfo target1 = link1.GetTargetInfo(returnFinalTarget: true);
Console.WriteLine(target1.FullName); // /path/file.txt

var link2a = new FileInfo("/path/link2a");
link2a.CreateAsSymbolicLink(link1.FullName);
// Skips links in between and returns the final target.
FileSystemInfo target2a = link2a.GetTargetInfo(returnFinalTarget: true);
Console.WriteLine(target2a.FullName); // /path/file.txt

var link2b = new FileInfo("/path/link2b");
link2b.CreateAsSymbolicLink(pathToTarget: link1.FullName);
// Won't skip link1, will return it as the target.
FileSystemInfo target2b = link2b.GetTargetInfo(returnFinalTarget: false);
Console.WriteLine(target2b.FullName); // /path/link1

/////////////////////////
//// Symlink to a directory

// link2a \
//         ---> link1 -> directory
// link2b /

var directory = new DirectoryInfo("/path/directory");
directory.Create();

// The symlink itself needs to be represented with a DirectoryInfo instance
// because Windows cares about the underlying type
var link1 = new DirectoryInfo("/path/link1");
link1.CreateAsSymbolicLink(pathToTarget: directory.FullName);
// No need to follow to final target, it's direct
FileSystemInfo target1 = link1.GetTargetInfo(returnFinalTarget: true);
Console.WriteLine(target1.FullName); // /path/directory

var link2a = new DirectoryInfo("/path/link2a");
link2a.CreateAsSymbolicLink(link1.FullName);
// Skips link1 and returns final target
FileSystemInfo target2a = link2a.GetTargetInfo(returnFinalTarget: true);
Console.WriteLine(target2a.FullName); // /path/directory

var link2b = new DirectoryInfo("/path/link2b");
// Won't skip link1, will return link1 as the target
link2b.CreateAsSymbolicLink(link1.FullName);
FileSystemInfo target2b = link2b.GetTargetInfo(returnFinalTarget: false);
Console.WriteLine(target2b.FullName); // /path/link1

/////////////////////////
//// Non-existent target

var link1 = new FileInfo("/path/link1");
// Should succeed to create symlink file, even though target does not exist
link1.CreateAsSymbolicLink("/non/existent/file.txt");
FileSystemInfo target1 = link1.GetTargetInfo(returnFinalTarget: true);
Console.WriteLine(target1.FullName); // Should print /non/existent/file.txt

var link2 = new FileInfo("/path/link2");
link2.CreateAsSymbolicLink(pathToTarget: link1.FullName); // link2 -> link1
// Follows symlinks and stops at file.txt, even if it does not exist
FileSystemInfo target2 = link2.GetTargetInfo(returnFinalTarget: true);
Console.WriteLine(target2.FullName); // Should print /non/existent/file.txt

/////////////////////////
//// Existing symlink

var directory = new DirectoryInfo("/path/directory");
directory.Create();

var link = new DirectoryInfo("/path/link");
link.CreateAsSymbolicLink(pathToTarget: directory.FullName);

// This DirectoryInfo wraps the symlink that was created above
// so we should return a valid TargetInfo when requested
var existingLink = new DirectoryInfo("/path/link");
FileSystemInfo existingTarget = existingLink.GetTargetInfo(returnFinalTarget: true);
Console.WriteLine(existingTarget.FullName); // Should print /path/directory

/////////////////////////
// Inconsistent symlink target and *Info type

var directory = new DirectoryInfo("/path/directory");
directory.Create();

// The user should've used DirectoryInfo to wrap the link to a directory
var link = new FileInfo("/path/link");
link.CreateAsSymbolicLink(pathToTarget: directory.FullName); // Should throw because target is a directory

/////////////////////////
// Circular reference

var link1 = new FileInfo("/path/link1");
link1.CreateAsSymbolicLink(pathToTarget: "/path/link2");

var link2 = new FileInfo("/path/link2");
link2.CreateAsSymbolicLink(pathToTarget: "/path/link3");

var link3 = new FileInfo("/path/link3");
link3.CreateAsSymbolicLink(pathToTarget: "/path/link1");

// Throws because we opted in to follow symlinks and there is a cycle:
// a circular reference is found from link3 to link1
FileSystemInfo target3 = link3.GetTargetInfo(returnFinalTarget: true);

/////////////////////////
// Recursive enumeration of a directory with symlinks

// directory
//   - subdirectory1
//       - file.txt
//       - symlink1 -> file.txt
//   - subdirectory2
//       - symlink2 -> symlink1

FileSystemEnumerable<FileSystemInfo>.FindTransform transform =
    (ref FileSystemEntry entry) => entry.ToFileSystemInfo();

EnumerationOptions options = new EnumerationOptions { RecurseSubdirectories = true };

var enumerable = new FileSystemEnumerable<FileSystemInfo>(@"/path/to/directory", transform, options)
{
    ShouldRecursePredicate = (ref FileSystemEntry entry) => entry.IsDirectory
};

foreach (FileSystemInfo info in enumerable)
{
    // No need to add an API to signal that a FSI is Symbolic Link.
    // Warning: ReparsePoint is not exclusive of Symbolic Links.
    string path = info.Attributes.HasFlag(FileAttributes.ReparsePoint)
        ? info.GetTargetInfo(returnFinalTarget: true).FullName
        : info.FullName;
    Console.WriteLine(path);
}
```

Optional additional design

As initially proposed in this discussion, we also made sure to consider the expansion of the existing File and Directory static classes, with some differences.

These additional APIs could be considered alternative or additional to the proposed above.

public static class File
{
    static FileInfo CreateSymbolicLink(ReadOnlySpan<char> path, ReadOnlySpan<char> pathToTarget);
    static bool IsSymbolicLink(ReadOnlySpan<char> path);
    static FileInfo GetSymbolicLinkTarget(ReadOnlySpan<char> linkPath);
}

public static class Directory
{

    static DirectoryInfo CreateSymbolicLink(ReadOnlySpan<char> path, ReadOnlySpan<char> pathToTarget);
    static bool IsSymbolicLink(ReadOnlySpan<char> path);
    static DirectoryInfo GetSymbolicLinkTarget(ReadOnlySpan<char> linkPath);
}

// We could also add them to the static Path class
// but need to return FileSystemInfo instances
public static class Path
{
    static FileSystemInfo CreateSymbolicLink(ReadOnlySpan<char> path, ReadOnlySpan<char> pathToTarget);
    static bool IsSymbolicLink(ReadOnlySpan<char> path);
    static FileSystemInfo GetSymbolicLinkTarget(ReadOnlySpan<char> linkPath);
}

Future expansions to keep in mind

Not needed right now. Similarly to the main proposal, we made sure to keep in mind the potential future addition of junction and hard link creation support.

```cs
public static class File
{
    // Future
    static FileInfo CreateHardLink(ReadOnlySpan<char> path, ReadOnlySpan<char> pathToTarget);
}

public static class Directory
{
    // Future
    // The user could consume `FileSystemInfo.GetTargetInfo` to retrieve the junction target
    static DirectoryInfo CreateJunction(ReadOnlySpan<char> path, ReadOnlySpan<char> pathToTarget);
}
```

Alternative design usage cases

```cs
/////////////////////////
//// Symlink to a file

// link2a \
//         ---> link1 -> file.txt
// link2b /

var file = new FileInfo("/path/file.txt");
file.Create().Dispose();

FileInfo link1 = File.CreateSymbolicLink(path: "/path/link1", pathToTarget: file.FullName);
FileSystemInfo target1 = link1.GetTargetInfo(returnFinalTarget: true);
Console.WriteLine(target1.FullName); // /path/file.txt

FileInfo link2a = File.CreateSymbolicLink(path: "/path/link2a", pathToTarget: link1.FullName);
// Skips link1 and returns final target
FileSystemInfo target2a = link2a.GetTargetInfo(returnFinalTarget: true);
Console.WriteLine(target2a.FullName); // /path/file.txt

FileInfo link2b = File.CreateSymbolicLink(path: "/path/link2b", pathToTarget: link1.FullName);
// Won't skip link1, will return link1 as the target
FileSystemInfo target2b = link2b.GetTargetInfo(returnFinalTarget: false);
Console.WriteLine(target2b.FullName); // /path/link1, instead of /path/file.txt

Console.WriteLine(File.IsSymbolicLink(link1.FullName)); // true
FileInfo target = File.GetSymbolicLinkTarget(link1.FullName);
Console.WriteLine(target.FullName); // /path/file.txt

/////////////////////////
//// Symlink to a directory

// link2a \
//         ---> link1 -> directory
// link2b /

var directory = new DirectoryInfo("/path/directory");
directory.Create();

// The symlink itself needs to be represented with a DirectoryInfo instance
// because Windows cares about the underlying type
DirectoryInfo link1 = Directory.CreateSymbolicLink(path: "/path/link1", pathToTarget: directory.FullName);
FileSystemInfo target1 = link1.GetTargetInfo(returnFinalTarget: true);
Console.WriteLine(target1.FullName); // /path/directory

DirectoryInfo link2a = Directory.CreateSymbolicLink(path: "/path/link2a", pathToTarget: link1.FullName);
// Skips link1 and returns final target
FileSystemInfo target2a = link2a.GetTargetInfo(returnFinalTarget: true);
Console.WriteLine(target2a.FullName); // /path/directory

DirectoryInfo link2b = Directory.CreateSymbolicLink(path: "/path/link2b", pathToTarget: link1.FullName);
// Won't skip link1, will return link1 as the target
FileSystemInfo target2b = link2b.GetTargetInfo(returnFinalTarget: false);
Console.WriteLine(target2b.FullName); // /path/link1

Console.WriteLine(Directory.IsSymbolicLink(link1.FullName)); // true
DirectoryInfo target = Directory.GetSymbolicLinkTarget(link1.FullName);
Console.WriteLine(target.FullName); // /path/directory
```

Notes and questions

cc @Jozkee @jeffhandley

iSazonov commented 3 years ago

Note that PowerShell 7 doesn't follow symlinks by default because of possible cycles. It implements dir -FollowSymlink with internal cycle tracking (which is expensive).

This behavior matches the OS API behavior (FindFirstFileEx) https://docs.microsoft.com/en-us/windows/win32/fileio/symbolic-link-effects-on-file-systems-functions


PowerShell has IsReparsePointLikeSymlink() method. It is static.


I'd prefer names: CreateAsSymbolicLink -> CreateSymbolicLink GetTargetInfo -> GetTarget


Perhaps it is safer to design a full API set for manipulating symlinks and hard links.
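
For illustration only: a minimal sketch of a recursive walk that reports links but never descends into them, which sidesteps cycle tracking altogether. It uses the existing System.IO.Enumeration types. The names `NoFollowEnumeration` and `EnumerateWithoutFollowingLinks` are made up for this example, and the sketch treats every reparse point as a link, which is broader than symbolic links alone.

```cs
using System.Collections.Generic;
using System.IO;
using System.IO.Enumeration;

static class NoFollowEnumeration
{
    // Enumerates every entry under root, but never recurses into a directory
    // that carries the ReparsePoint attribute (symlinks, junctions, ...).
    public static IEnumerable<string> EnumerateWithoutFollowingLinks(string root)
    {
        var options = new EnumerationOptions
        {
            RecurseSubdirectories = true,
            AttributesToSkip = 0, // assumption: hidden/system entries should be reported too
        };

        return new FileSystemEnumerable<string>(
            root,
            (ref FileSystemEntry entry) => entry.ToFullPath(),
            options)
        {
            // The link itself is still returned; only the descent is suppressed.
            ShouldRecursePredicate = (ref FileSystemEntry entry) =>
                entry.IsDirectory &&
                (entry.Attributes & FileAttributes.ReparsePoint) == 0,
        };
    }
}
```

The trade-off is that entries reachable only through a link are not visited, so a caller that wants them would still need its own cycle detection.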

jhudsoncedaron commented 3 years ago

@iSazonov : That's one of the two reasons I used OS facilities for that purpose. The other one being that loadable filesystems don't have to make readlink() and followlink() behave anywhere near similar to each other.

adamsitnik commented 3 years ago

My only complaint about the proposed design is that it requires me to allocate a new instance of a class to perform any symlink action. Have we considered a simpler API like the following one:

namespace System.IO
{
    public static class SymbolicLink
    {
        public static void Create(ReadOnlySpan<char> target, ReadOnlySpan<char> link); // wrapper for symlink()
        public static string GetTarget(ReadOnlySpan<char> link); // wrapper for readlink()
        public static void Remove(ReadOnlySpan<char> link); // wrapper for remove()
    }

    public static class Path
    {
        public static bool IsSymbolicLink(ReadOnlySpan<char> path); // wrapper for fstatat(AT_SYMLINK_NOFOLLOW)
    }
}

edit: string -> ReadOnlySpan<char>

iSazonov commented 3 years ago

it requires me to allocate a new instance of a class to perform any symlink action.

IO.Directory and IO.File have static methods, IO.DirectoryInfo and IO.FileInfo have non-static ones. I believe it makes sense to follow this too.

terrajobst commented 3 years ago

Agreed. Also, IO is already fairly expensive; this isn't the kind of operation one does in a tight loop, so I'm not sure allocations would end up mattering here.

Do we believe we never need additional information on symbolic links, like a kind? If we do, we should consider introducing a new type like SymbolicLink that's returned from CreateSymbolicLink.

carlossanlop commented 3 years ago

I updated the proposal with the static methods.

jhudsoncedaron commented 3 years ago

Do we believe we never need additional information on symbolic links, like a kind?

You do. Consider the recursive copy tree method.

bartonjs commented 3 years ago

Video

namespace System.IO
{
    public abstract partial class FileSystemInfo
    {
        public void CreateAsSymbolicLink(string pathToTarget);
        public FileSystemInfo? GetSymbolicLinkTarget(bool returnFinalTarget = false); 
    }
    public static class File
    {
        public static FileSystemInfo CreateSymbolicLink(string path, string pathToTarget);
        public static FileSystemInfo? GetSymbolicLinkTarget(string linkPath, bool returnFinalTarget = false);
    }
    public static class Directory
    {
        public static FileSystemInfo CreateSymbolicLink(string path, string pathToTarget);
        public static FileSystemInfo? GetSymbolicLinkTarget(string linkPath, bool returnFinalTarget = false);
    }
}

KevinCathcart commented 3 years ago

Some notes I made during the review:

On .NET 5 on Windows, directory symlinks are always treated as directories by existing .NET APIs. They are returned in EnumerateDirectories. This applies if the target exists, the target does not exist, or the target exists but is the wrong type.

The converse with EnumerateFiles applies for regular symbolic links. I'm pretty sure this difference in behavior is the entire reason for Windows to even differentiate the two types. It lets Windows avoid resolving the link in order to know whether it is a file or a directory, since with things like UNC-targeted links this could be rather expensive.

On Windows, if you create a FileInfo but provide a path to a directory symlink, Exists will return false (or vice versa). However, in this scenario most other properties will be populated, except that things like Length will throw FileNotFound.

On Unix-like platforms (where there is only one type of symlink), the existing APIs in .NET 5 generally treat symlinks whose target is a file or non-existent as files, and links whose target is a directory as directories.


So I hope the create methods, when the target does not exist, will allow the user to choose the type, based on which static class they use, or the type of FileSystemInfo being used if an instance method is used. This ability really does matter (only on Windows), since if you are creating a symlink in anticipation of later creating the target and you create the wrong type, you won't get the desired result. Thankfully the approved API does allow for this differentiation. Of course, when the file already exists, I would prefer if the static methods just did the right thing.

iSazonov commented 3 years ago
  • In review we were concerned with how this might affect existing behaviors of things like DirectoryInfo.EnumerateDirectories (does that currently examine if symbolic links are directories?) and since we weren't sure what the answers were we had trouble continuing.

DirectoryInfo.Enumerate* methods resolve symbolic links to targets.

iSazonov commented 3 years ago

Do we believe we never need additional information on symbolic links, like a kind?

PowerShell does this. For example, Get-ChildItem (aka dir) shows l in Mode for symlinks:

dir C:\tmp\1

    Directory: C:\tmp\1

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
l----          25.02.2021    13:07                link-alpha -> c:\tmp\1\sub-alpha
l----          25.02.2021    14:28                link-alpha2 -> c:\tmp\1\sub-alpha2
d----          25.02.2021    14:27                sub-alpha
d----          25.02.2021    18:39                sub-alpha2
d----          25.02.2021    13:06                sub-omega

So I am saddened that IsSymbolicLink() fell out of the final proposal.

ericstj commented 3 years ago

Not sure it was discussed, but since the APIs don't mention anything about relative vs absolute I would expect these APIs to handle any relative paths in the symlinks and resolve them correctly rather than ever exposing the relative portion, so as not to put the onus on the caller to understand how to resolve a relative symlink. I'd also expect CreateSymbolicLink to understand relative vs absolute paths. When static methods are given a relative target, they will create a relative symlink (and ensure it is relative to path).
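
A minimal sketch of the resolution step described above, assuming the raw target string has already been read from the link by some other means. `LinkTargets.ResolveRawTarget` is a hypothetical helper, not part of the proposal; it only shows that a relative target is interpreted against the directory containing the link rather than the current directory.

```cs
using System;
using System.IO;

static class LinkTargets
{
    // Turns a target string, exactly as stored in a link, into a full path.
    // An absolute target is only normalized; a relative target is combined
    // with the directory that contains the link.
    public static string ResolveRawTarget(string linkPath, string rawTarget)
    {
        string linkFullPath = Path.GetFullPath(linkPath);
        string linkDirectory = Path.GetDirectoryName(linkFullPath)
            ?? throw new ArgumentException("The link path has no parent directory.", nameof(linkPath));

        return Path.GetFullPath(rawTarget, linkDirectory);
    }
}
```

This handles a single hop only; following a chain of links would repeat the step with each intermediate link's own directory as the base.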

bartonjs commented 3 years ago

So I am saddened that IsSymbolicLink() fell out of the final proposal.

internal static bool IsSymbolicLink(this FileSystemInfo info)
{
    return GetSymbolicLinkTarget(info) != null;
}

We removed it because it felt like it was going to be "if true, call GetSymbolicLinkTarget", which would be a double-hit anti-pattern.

ericstj commented 3 years ago

@bartonjs, so folks who only care whether something is a symlink can do GetSymbolicLinkTarget(info) != null, as you suggest.

Did you consider bool TryGetSymbolicLinkTarget(string linkPath, out FileSystemInfo info, bool returnFinalTarget = false)? That might make it more clear that this was the case.
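
For illustration, the try pattern can be layered on top of a nullable-returning method as an extension. This sketch assumes a runtime where the approved instance method is available as `FileSystemInfo.ResolveLinkTarget(bool)` (the name it has in .NET 6 and later, which differs from the `GetSymbolicLinkTarget` name used in this thread). `SymbolicLinkExtensions` and `TryGetSymbolicLinkTarget` are hypothetical names.

```cs
#nullable enable
using System.Diagnostics.CodeAnalysis;
using System.IO;

static class SymbolicLinkExtensions
{
    // Returns true and sets target when info is a link; returns false
    // (with target set to null) when the entry exists but is not a link.
    // Exceptions from the underlying call, such as the entry not existing,
    // are not swallowed in this sketch.
    public static bool TryGetSymbolicLinkTarget(
        this FileSystemInfo info,
        [NotNullWhen(true)] out FileSystemInfo? target,
        bool returnFinalTarget = false)
    {
        target = info.ResolveLinkTarget(returnFinalTarget);
        return target is not null;
    }
}
```

Callers that only want the yes/no answer can discard the out value with `out _`.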

carlreinke commented 3 years ago

It seems that with the approved API shape it will not be possible to tell the difference between relative and absolute symlinks, unless there's some magic in FileSystemInfo that I'm not aware of.

jhudsoncedaron commented 3 years ago

So I gathered. I guess the authors haven't used magic symbolic links yet (where the link target doesn't exist but rather the path stores a small amount of information). They're popular for locks because the symlink() api has better atomicity than most other things.

bartonjs commented 3 years ago

Did you consider [the try pattern]

I don't believe anyone mentioned the try pattern during the meeting. Since I can't think of any ambiguity in a null return I don't personally see it as necessary, but wouldn't object if that proposal came through.

If you think someone might have an "expectation of success", even though most file system entries are something other than symlinks (blind assertion), then converting it to the try pattern would make sense.

ericstj commented 3 years ago

unless there's some magic in FileSystemInfo that I'm not aware of.

I guess the authors haven't used magic symbolic links yet (where the link target doesn't exist but rather the path stores a small amount of information). They're popular for locks because the symlink() api has better atomicity than most other things.

Interesting use case. Perhaps OriginalPath and ToString() would work here to expose the relative path. I believe that's how folks can observe relative paths passed into FileSystemInfo's

mklement0 commented 3 years ago

@ericstj, OriginalPath is protected.

ericstj commented 3 years ago

Right, it can be set as OriginalPath when initializing the FSI returned and exposed through ToString(). This is already the behavior of these WRT relative paths. It's not perfect in terms of API, but could work if we documented it and would avoid modifying the abstraction or adding new API. Just a suggestion.

jhudsoncedaron commented 3 years ago

@mklement0 : Reflection.Emit.DynamicMethod() doesn't care.

carlossanlop commented 3 years ago

it will not be possible to tell the difference between relative and absolute symlinks

When you create a FileInfo/DirectoryInfo with a relative path passed to the constructor, the path has to be relative to some path if you actually intend to create it.

The CreateAsSymbolicLink method will write a file to disk, so you should be using a full path for that. The pathToTarget can be a relative path:

var linkInfo = new FileInfo("/home/carlos/link");
linkInfo.CreateAsSymbolicLink(pathToTarget: "relative/path/to/file.txt"); // Should succeed
// This should return a FileInfo where the directory is relative to the original linkInfo's directory
var targetInfo = linkInfo.GetSymbolicLinkTarget();
Console.WriteLine(targetInfo!.FullName); // Should print /home/carlos/relative/path/to/file.txt

// Let's move the link to a different folder, while the target remains in the same place
var newDirInfo = new DirectoryInfo("/home/carlos/newdir");
newDirInfo.Create();
string newPath = Path.Combine(newDirInfo.FullName, linkInfo.Name);
File.Move(linkInfo.FullName, newPath); // /home/carlos/newdir/link

// Since the path was relative, the target does not exist now
var linkInfo2 = new FileInfo(newPath);
var newTargetInfo = linkInfo2.GetSymbolicLinkTarget();
Console.WriteLine(newTargetInfo!.Exists); // Should print false
Console.WriteLine(newTargetInfo.FullName); // Should print /home/carlos/newdir/relative/path/to/file.txt

To expand a bit more on the fact that a FileInfo/DirectoryInfo needs to be relative to some path, consider the case where you create an instance without indicating the folder (we default to the assembly location):

var info = new FileInfo("relative.txt");

string assembly = System.Reflection.Assembly.GetEntryAssembly().Location; // C:\YourProject\bin\Debug\net6.0\YourProject.dll
Console.WriteLine(Path.GetDirectoryName(assembly)); // C:\YourProject\bin\Debug\net6.0

string exe = System.Diagnostics.Process.GetCurrentProcess().MainModule.FileName; // C:\YourProject\bin\Debug\net6.0\YourProject.exe
Console.WriteLine(Path.GetDirectoryName(exe)); // C:\YourProject\bin\Debug\net6.0

Console.WriteLine(info.DirectoryName); // C:\YourProject\bin\Debug\net6.0

where the link target doesn't exist but rather the path stores a small amount of information

Creating symbolic links for targets that do not yet exist is something we kept in mind. Here's an example, which also exemplifies what @bartonjs explained above:

var linkInfo = new FileInfo("/path/to/link");
linkInfo.CreateAsSymbolicLink("/path/to/nonexistent/file.txt");
FileInfo? targetInfo = linkInfo.GetSymbolicLinkTarget() as FileInfo;
if (targetInfo != null)
{
    if (targetInfo.Exists)
    {
        Console.WriteLine("The target file exists.");
    }
    else
    {
        Console.WriteLine("The target file does not yet exist.");
        targetInfo.Create().Dispose(); // it's a file.txt
    }
}
else
{
    Console.WriteLine("It's null when the symbolic link file itself does not yet exist.");
}

carlreinke commented 3 years ago

@carlossanlop Neither the ability to create relative symlinks nor the existence of the symlink target addresses the ability to determine whether an existing symlink is relative or absolute (or rather, to determine the exact content of the target path).

iSazonov commented 3 years ago

Let's look how PowerShell works.

  1. PowerShell dir -Recurse doesn't follow symlinks by default (but C# enumeration does) to avoid hangs on cycles. Users should run dir -Recurse -FollowSymlink to follow symlinks. To do this, PowerShell has to check whether an object is a symlink. (Without -Recurse, dir resolves the target.) I opened a PR to migrate to the new C# API for enumerating a directory tree, and I again have to do this check with additional P/Invoke call(s).

  2. PowerShell adds new properties to FileSystemInfo:

    
    get-Item C:\tmp\1\link-alpha | fl
    
    Directory: C:\tmp\1

    Name           : link-alpha
    CreationTime   : 25.02.2021 13:07:59
    LastWriteTime  : 25.02.2021 13:07:59
    LastAccessTime : 25.02.2021 13:07:59
    Mode           : l----
    LinkType       : SymbolicLink
    Target         : c:\tmp\1\sub-alpha


We can see Mode, LinkType and Target.

I still do not understand how the new API could help to migrate PowerShell to the API.

carlossanlop commented 3 years ago

@iSazonov by looking at your output, but without knowing how PowerShell works internally, you could retrieve LinkType and Target with the APIs approved here:

1) First, you check if a file is a symlink by checking fileInfo.Attributes.HasFlag(FileAttributes.ReparsePoint). Which would give you the answer for this section of your output:

LinkType : SymbolicLink

And after collecting the above piece of information, you could call fileInfo.GetSymbolicLinkTarget() to be able to populate:

Target : c:\tmp\1\sub-alpha

Does that make sense?

Note: I'm not sure about Mode. Is that indicating the file permissions? We don't currently have APIs to retrieve that information, so that would be a separate API proposal.
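
A minimal sketch of the attribute check mentioned above, using only APIs that exist today. `ReparsePointCheck.HasReparsePoint` is a hypothetical helper name. As noted in the proposal's usage cases, ReparsePoint is not exclusive to symbolic links, so a `true` result only means the entry might be one.

```cs
using System.IO;

static class ReparsePointCheck
{
    // True when the file or directory at path carries the ReparsePoint attribute.
    // File.GetAttributes accepts both file and directory paths and throws
    // if nothing exists at the given path.
    public static bool HasReparsePoint(string path) =>
        (File.GetAttributes(path) & FileAttributes.ReparsePoint) != 0;
}
```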

mklement0 commented 3 years ago

@carlossanlop, an aside re relative paths in System.IO.FileSystemInfo constructors:

(we default to the assembly location)

The default is (fortunately) the process' current directory, not the assembly location.
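
This can be checked with a few lines that use only existing APIs. `RelativePathDemo` is a hypothetical name for this sketch; it compares the full name of a FileInfo built from a simple relative file name with that name combined with the process's current directory.

```cs
using System.IO;

static class RelativePathDemo
{
    // True when a FileInfo built from a simple relative file name resolves
    // against the process's current directory.
    public static bool ResolvesAgainstCurrentDirectory(string relativeName)
    {
        string expected = Path.Combine(Directory.GetCurrentDirectory(), relativeName);
        return new FileInfo(relativeName).FullName == expected;
    }
}
```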

iSazonov commented 3 years ago

@carlossanlop The new API will certainly do what it is designed to do. However, I have my doubts.

  1. How fast is it? In the example I gave, it looks like a workaround with extra performance costs.
  2. How consistent would this be with other parts like reparse points on Windows and in particular with enumerations? This again requires extra performance costs. The same with extended attributes on Unix.

There is already an unsuccessful example of adding a hidden extended attribute on Unix that adds extra performance costs. As you may remember this is still uncorrected. (My suggestion was to roll this back altogether and implement it "on demand" or "lazy".) I fear that we may fall into a similar trap again.

iSazonov commented 3 years ago

Note: I'm not sure about Mode. Is that indicating the file permissions? We don't currently have APIs to retrieve that information, so that would be a separate API proposal.

Mode: d - directory, a - archive, r - read-only, h - hidden, s - system, and so on.

mklement0 commented 3 years ago

To expand on @carlreinke's concern:

There are three use cases with respect to a symlink's (reparse point's) target:

(a) I just want to know its immediate target, as a full path (.GetSymbolicLinkTarget())

(b) I want to know its ultimate target, with all symlinks resolved (possibly recursively), i.e. the ultimate target's full, canonical path (.GetSymbolicLinkTarget(returnFinalTarget: true))

(c) I want to know the target path exactly as defined in the symlink - which may be a relative or an absolute path.

It seems that the proposed API won't support (c), which is problematic.

(Reflecting this information indirectly, via the .ToString() return value of the FileSystemInfo instance returned, as @ericstj has suggested, strikes me as too obscure, but it may be better than not having access to this information at all).

markusschaber commented 3 years ago

Maybe GetSymbolicLinkTarget() should accept an Enum instead of a Boolean as parameter, to enable all three use cases?

iSazonov commented 3 years ago

A short overview of PowerShell's capabilities:

PowerShell detects file mode type (WinInternalGetLinkType method) for FileSystemInfo object. It is exposed as LinkType property for FileInfo and DirectoryInfo objects.

PowerShell gets a target (WinInternalGetTarget method) for FileSystemInfo object (and for a path in IsWindowsApplication temporary method)

PowerShell can create mount points (WinCreateJunction method)

PowerShell can create symlinks (WinCreateSymbolicLink method)

PowerShell can create hard links (WinCreateHardLink method)


Obviously, these features were added at the request of users as the most popular ones.

The overview forces me to think:

mklement0 commented 3 years ago

@markusschaber, the problem with using an Enum is that you'd still get a FileSystemInfo instance back, and the question again arises how that instance should reflect the requested information.

Perhaps a separate method, GetSymbolicLinkTargetPath, returning a mere string - namely the target path as defined in the reparse point / symlink - is the better solution:

namespace System.IO
{
    public abstract partial class FileSystemInfo
    {
        // ...
        public string GetSymbolicLinkTargetPath(); 
    }
    public static class File
    {
        // ...
        public static string GetSymbolicLinkTargetPath(string linkPath);
    }
    public static class Directory
    {
        // ...
        public static string GetSymbolicLinkTargetPath(string linkPath); 
    }
}

tmds commented 3 years ago

It would be useful to add some bool followLinks to existing APIs, like new FileInfo(string path, bool followLinks).

This allows the user to leave the following of symbolic links to the framework, and reduces the number of syscalls made (e.g. stat vs lstat).

mklement0 commented 3 years ago

Taking @iSazonov's feedback into account, I suggest we take a step back and generalize the proposed API:

@tmds, your use case would then be addressed with new FileInfo("foo").GetTarget() (though it wouldn't be a performance improvement).

tmds commented 3 years ago

@mklement0 when a user uses FileInfo do you think in the common case he's interested in the link, or what the link refers to? I think it's the latter. I'd even advocate for it to be the default (though that would be a breaking change).

There may be multiple links involved. The proposed API would require the user to loop until he hits the final target. The links could even be cyclic (for which the kernel would return ELOOP).
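A minimal sketch of what such a user-side loop could look like, written against the `FileSystemInfo.ResolveLinkTarget(bool)` method that shipped in .NET 6 rather than the API proposed here. `ResolveManually` and the hop limit of 40 are assumptions made for illustration; the limit stands in for the kernel's ELOOP check.

```csharp
using System;
using System.IO;

// Hypothetical helper: follows one link at a time until a non-link is reached.
// maxHops is an arbitrary guard against cyclic links, not a platform constant.
static FileSystemInfo ResolveManually(FileSystemInfo start, int maxHops = 40)
{
    FileSystemInfo current = start;
    for (int hop = 0; hop < maxHops; hop++)
    {
        // Returns null when 'current' is not a link. Any exception thrown
        // here (for example for a dangling link) propagates unchanged.
        var next = current.ResolveLinkTarget(returnFinalTarget: false);
        if (next is null)
        {
            return current;
        }
        current = next;
    }
    throw new IOException(
        $"Too many levels of symbolic links starting at '{start.FullName}'.");
}

// Usage: pass a link path as the first command-line argument.
if (args.Length > 0)
{
    Console.WriteLine(ResolveManually(new FileInfo(args[0])).FullName);
}
```

Every hop costs at least one extra system call, which is the overhead being pointed out; letting the framework or the kernel follow the chain avoids both the loop and the hand-rolled cycle guard.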

bartonjs commented 3 years ago

There may be multiple links involved. ... The links could even be cyclic (for which the kernel would ELOOP).

That's the very reason I'd oppose it being the default.

tmds commented 3 years ago

@bartonjs do you think in the common case he's interested in the link, or what the link refers to?

jhudsoncedaron commented 3 years ago

@tmds: In the common case, the programmer does not know.

The change to add symbolic links in the unix world was made ages ago, and they chose to make the pre-existing API become the API that resolves through the link and the new API the API that accesses the link itself. This led to a large number of security problems as unixes added symbolic links one by one. Repeat the same decision, repeat the same mistake, and this time magnified because right now the API tells you it's a reparse point and you know not to operate on it.

mklement0 commented 3 years ago

@tmds:

do you think in the common case he's interested in the link

Normally, there's no need for this distinction:

In many contexts, the link is transparently treated like its target, and, similarly, resolving to the ultimate target isn't usually needed.

If you do need to make the distinction, call .GetTarget(), optionally with returnFinalTarget: true if you need to know the canonical path.

Defaulting to transparent redirection to the target is inappropriate, if you're explicitly targeting a file-system item that just happens to be a link.

Symbolic links are often used to act as dynamic pointers, so that, say, a link named current may point to a dir. with a specific version number that at that time happens to be the latest - you definitely don't want to resolve that to, say, 1.0.37.

To illustrate that point, let's take /System/Library/Frameworks/AppKit.framework/Versions/Current on macOS, which is a symlink to - on my machine, as of this writing - System/Library/Frameworks/AppKit.framework/Versions/C:

new DirectoryInfo("/System/Library/Frameworks/AppKit.framework/Versions/Current").EnumerateFileSystemInfos()

tmds commented 3 years ago

In many contexts, the link is transparently treated like its target,

Similarly, targeting a symlink to a file as if it were the target file itself works just fine with the System.IO APIs.

I'm pointing out this is not the case for FileInfo currently, and the proposal doesn't provide it.

mklement0 commented 3 years ago

@tmds, what, specifically, is not the case for FileInfo currently?

tmds commented 3 years ago

The properties returned by FileInfo are providing info about the link, and not the target.

In some (I think: most) cases you're interested in the target, see https://github.com/dotnet/runtime/issues/36091.

mklement0 commented 3 years ago

@tmds, a link is not the same as its target. Conflating the two would mean erasing an underlying file-system concept, which, as discussed, is more than an implementation detail.

If you're using FileInfo / DirectoryInfo, you're explicitly targeting a file-system item, and that very item - even if it happens to be a link to another item - should be represented as such by default. Requesting information about that link's target should always require opt-in.

I'm not fundamentally opposed to your proposal of a new FileInfo(string path, bool followLinks) constructor, but it seems to me that .GetTarget(bool returnFinalTarget = false) is a better way to address this use case, given that you then get the choice between the immediate and the ultimate target path.

iSazonov commented 3 years ago
  • Introduce a System.IO.LinkType Enum type that covers all link types on all platforms: None, HardLink, SymLink, plus - Windows-specifically - Junction, MountPoint, AppX

Windows reparse point tags are an open list: https://docs.microsoft.com/en-us/windows/win32/fileio/reparse-point-tags (current list: https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c8e77b37-3909-4fe6-a4ea-2b9d423b1ee4)

We cannot create the Enum in the general case.

I don't know what the best practice is in the .NET runtime. Perhaps there are examples of how to better resolve such a case.

Also we need to take into account named vs non-named reparse points. They have different behavior.
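One pattern that might fit an open set is a small struct that wraps the raw tag value and exposes well-known values as static fields, similar in spirit to how `OSPlatform` and `HashAlgorithmName` wrap a string instead of using an Enum. The `ReparseTag` type below is purely hypothetical and not part of .NET. The three tag values and the two bit masks are taken from the Windows headers, and I am assuming that "named" above refers to the name-surrogate bit.

```csharp
using System;

// Demo: known tags compare by value, and unknown tags are still representable.
var unknown = new ReparseTag(0x80000023); // some tag this type has no name for
Console.WriteLine(ReparseTag.SymLink.IsNameSurrogate);  // bit 0x20000000 is set
Console.WriteLine(ReparseTag.AppExecLink.IsNameSurrogate);
Console.WriteLine(unknown);

// Hypothetical "open enum": wraps the 32-bit reparse tag from the file system.
public readonly struct ReparseTag : IEquatable<ReparseTag>
{
    public uint Value { get; }
    public ReparseTag(uint value) => Value = value;

    // Well-known tags (IO_REPARSE_TAG_* values from the Windows headers).
    public static readonly ReparseTag MountPoint = new(0xA0000003);
    public static readonly ReparseTag SymLink = new(0xA000000C);
    public static readonly ReparseTag AppExecLink = new(0x8000001B);

    // High bit: the tag is owned by Microsoft.
    public bool IsMicrosoft => (Value & 0x80000000) != 0;

    // Name-surrogate bit: the reparse point stands in for another named entity.
    public bool IsNameSurrogate => (Value & 0x20000000) != 0;

    public bool Equals(ReparseTag other) => Value == other.Value;
    public override bool Equals(object? obj) => obj is ReparseTag t && Equals(t);
    public override int GetHashCode() => Value.GetHashCode();
    public override string ToString() => $"0x{Value:X8}";
}
```

New tags introduced by Windows or by third-party filter drivers would then need no API change, at the cost of losing exhaustive `switch` checks over a closed Enum.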