dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
19.05k stars 4.03k forks source link

Define documentation comment ID for function pointers #48363

Open svick opened 4 years ago

svick commented 4 years ago

The documentation comment ID format for function pointers doesn't seem to be defined. Consider the following code (using recent daily build of Roslyn):

string code = @"
class C
{
    unsafe void M(delegate*<int, void> p, int i) {}
}";

var tree = SyntaxFactory.ParseSyntaxTree(code);

var compilation = CSharpCompilation.Create(null, options: new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary, allowUnsafe: true))
    .AddSyntaxTrees(tree)
    .AddReferences(MetadataReference.CreateFromFile(typeof(object).Assembly.Location));

var method = compilation.GetSymbolsWithName("M").Single();
Console.WriteLine(DocumentationCommentId.CreateDeclarationId(method));
Console.WriteLine(method.GetDocumentationCommentId());

This prints:

M:C.M(,System.Int32)
M:C.M(,System.Int32)

which shows empty string as the documentation comment ID for a function pointer parameter.

The function pointers proposal doesn't define this format either.

How should the documentation comment ID format for function pointers look like? Once that's decided, the implementation should be updated so that documentation comment IDs for members with function pointers start working.

Originally reported at https://github.com/dotnet/csharplang/discussions/3978.

333fred commented 4 years ago

Whatever we do for pointer types, we should do the same here. Fundamentally, at the runtime level function pointers are just a type of pointer.

svick commented 4 years ago

@333fred I don't quite understand how does that help define the format for function pointers. Consider the following table:

C# IL doc Id
int* int32* System.Int32*
delegate*<int, void> method void *(int32) ???

The syntax for regular pointers is simple and it's the same everywhere: just append * to the pointed type. But the syntax for function pointers is more complex and inconsistent in the existing languages, so it's not obvious to me what the right format for their documentation comment ID is. And that's before considering calling conventions.

333fred commented 4 years ago

I think it should follow the metadata encoding, in order to be language-agnostic.

saucecontrol commented 3 years ago

Can it not match what's already documented for MSVC? https://docs.microsoft.com/en-us/cpp/build/reference/dot-xml-file-processing?view=msvc-160

ELEMENT_TYPE_FNPTR is represented as "=FUNC:type(signature)", where type is the return type, and signature is the arguments of the method. If there are no arguments, the parentheses are omitted.

There's a note that says MSVC never generates that ID type, but this C++/CLI example:

public ref class DocTest {
public:
    typedef void (*function_ptr)(char, int);

    /// <summary>Test fnptr doc</summary>
    void FnptrTest(function_ptr f);
};

generates the following IL sig:

.method public hidebysig instance void  FnptrTest(method unmanaged cdecl void modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) *(int8 modopt([mscorlib]System.Runtime.CompilerServices.IsSignUnspecifiedByte),
                                                  int32) f) cil managed

and this XML:

<member name="M:DocTest.FnptrTest(=FUNC:System.Void(System.SByte!System.Runtime.CompilerServices.IsSignUnspecifiedByte,System.Int32))">
<summary>Test fnptr doc</summary>
</member>

Note that they also define a syntax for modreq and modopt to disambiguate DocIDs. The notes say those are also not generated by MSVC, but you can see one present on the SByte parameter above.

Calling convention is also included by MSVC as a modopt, but CallConvCdecl is omitted from the DocID. I believe I've seen the others included.

I would be very much in favor of keeping as much consistency between the compilers as possible. I have a tool that handles <inheritdoc /> inheritance across compiled assemblies, and it's inconvenient when assemblies produced by different compilers have their DocIDs generated differently.

333fred commented 3 years ago

Can it not match what's already documented for MSVC? docs.microsoft.com/en-us/cpp/build/reference/dot-xml-file-processing?view=msvc-160

If MSVC has a format for these, then I would say that we should follow that precedent. However, we will need to include the calling convention in the signature, as we allow overloading on calling convention:

public unsafe class C {
    public void M(delegate*<void> ptr) {}
    public void M(delegate* unmanaged[Cdecl]<void> ptr) {}
}

In particular, we'll have to define something for the delegate*<void> vs delegate* unmanaged<void> case, as there is no modopt or modreq involved here, but they are allowed.

saucecontrol commented 3 years ago

I just gave it another try, and MSVC does always include a modopt in the IL for the calling convention but doesn't use it in the DocID. I was able to get it to generate duplicate DocIDs by overloading by calling convention.

typedef void (__cdecl *function_ptr_cdecl)(int);
typedef void (__stdcall *function_ptr_stdcall)(int);

public ref class C {
public:
    /// <summary>Test fnptr cdecl doc</summary>
    void M(function_ptr_cdecl f) { }
    /// <summary>Test fnptr stdcall doc</summary>
    void M(function_ptr_stdcall f) { }
};

outputs:

<member name="M:C.M(=FUNC:System.Void(System.Int32))">
<summary>Test fnptr cdecl doc</summary>
</member>
<member name="M:C.M(=FUNC:System.Void(System.Int32))">
<summary>Test fnptr stdcall doc</summary>
</member>

According to their specs, I would expect the DocIDs for those to be M:C.M(=FUNC:System.Void!System.Runtime.CompilerServices.CallConvCdecl(System.Int32)) and M:C.M(=FUNC:System.Void!System.Runtime.CompilerServices.CallConvStdcall(System.Int32)) respectively.

Would that be workable for the C# side, with the managed calling convention undecorated?

Edit: Sorry, I misread your comment, as I think there's a missing no in there. I see now that Roslyn doesn't emit a modopt in the unmanaged case, so this would need to use a different decoration.

333fred commented 3 years ago

Sorry, I misread your comment, as I think there's a missing no in there.

Fixed, thanks. And yes, we'll need it not just for the unmanaged case, but for the rest of the cases as well. MSVC emits modopts in addition to setting the CallKind bit in IL, but we just set the CallKind unless we're overriding a signature that has the modopt.

terrajobst commented 1 year ago

For reference, it seems CCI has a syntax that is similar to the one from VC:

M:Class1.DoIt(System.Int32,function System.Single (System.Int32))
M:Class1.DoIt(System.Int32,function System.Void (System.Single))

it seems easiest to add the calling convention right after the word function and before the return type, like

M:Class1.DoIt(System.Int32,function System.Single (System.Int32))
M:Class1.DoIt(System.Int32,function unmanaged System.Void (System.Single))
M:Class1.DoIt(System.Int32,function unmanaged cdecl System.Void (System.Single))

If we believe there is value in matching what VC has done we can do that too. But it seems if the compilers differ by how they emit (i.e. with or without mod opts) then maybe that's a red herring.

FWIW, I think the scenarios where someone needs to consume them between C++ and C# are low, given that the support for C++/CLI is fairly limited these days. However, there is value in having a well-defined (and unique) format within the C# ecosystem itself.