dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.97k stars 4.66k forks source link

Introduce mechanism to indicate arguments are to be marshalled as native varargs #48796

Open AaronRobinsonMSFT opened 3 years ago

AaronRobinsonMSFT commented 3 years ago

Background and Motivation

Native varargs are a complicated interop scenario to support. At present, native varargs are only supported on the Windows platform through the undocumented __arglist keyword. Supporting varargs naturally in a P/Invoke scenario would be difficult from the C# language. However, it is possible to compromise by permitting support for the call with a fully specified DllImport signature and a hint from the user.

User scenario: https://github.com/dotnet/runtime/issues/48752

Proposed API

namespace System.Runtime.InteropServices
{
+    [AttributeUsage(AttributeTargets.Method)]
+    public class NativeVarargsAttribute : Attribute
+    {
+        public NativeVarargsAttribute() { VarargBeginIndex = 0; }
+
+        /// <summary>
+        /// Zero-based index of the first variable argument.
+        /// </summary>
+        public int VarargBeginIndex;
+    }
}

Usage Examples

Consider the following native export with varargs:

void Varargs(int n, ...);

The following P/Invoke declarations would enable users to call and properly forward the arguments in a supported multi-platform manner.

[NativeVarargsAttribute(VarargBeginIndex = 1)]
[DllImport(@"NativeLibrary.dll", EntryPoint = "Varargs")]
static extern void Varargs0(int n);

[NativeVarargsAttribute(VarargBeginIndex = 1)]
[DllImport(@"NativeLibrary.dll", EntryPoint = "Varargs")]
static extern void Varargs1(int n, int a);

[NativeVarargsAttribute(VarargBeginIndex = 1)]
[DllImport(@"NativeLibrary.dll", EntryPoint = "Varargs")]
static extern void Varargs2(int n, int a, int b);

Alternative designs

Encode the information in the CallingConvention enum. This approach does remove the overhead of attribute reading, but does miss the added data of knowing where the varargs start - at present doesn't appear to be needed. This approach also impacts existing metadata tooling (for example, ILDasm, ILAsm, and ILSpy). See https://github.com/dotnet/runtime/issues/48796#issuecomment-786355000.

public enum CallingConvention
{
+    VarArg = 6
}

Current state

Without this feature, calling functions with native varargs isn't possible on a non-Windows platforms. The proposed workaround is to create native shim libraries and instead P/Invoke into them. Continuing the example above, the shim library would export the following:

extern void Varargs(int n, ...);

void Varargs0(int n)
{
    Varargs(n);
}
void Varargs1(int n, int a)
{
    Varargs(n, a);
}
void Varargs2(int n, int a, int b)
{
    Varargs(n, a, b);
}

References

Support on Windows: https://github.com/dotnet/coreclr/pull/18373 JIT details: https://github.com/dotnet/runtime/issues/10478

filipnavara commented 3 years ago

How do you use UnmanagedCalleeAttribute to specify where the fixed args end and the varargs start?

AaronRobinsonMSFT commented 3 years ago

How do you use UnmanagedCalleeAttribute to specify where the fixed args end and the varargs start?

That is a UX question we would need to work through but I am not sure that is needed based on a previous conversation - which I may be misremembering. The gist was regardless of where the fixed args or varargs exist the way they are passed is the same for a native varargs signature - I could be misremembering the conversation but that is what I recall. There is also the more fundamental UX question of do we want to support arbitrary native vararg signatures in C# or do we require people to fully specify each signature for each use case. I don't know the answer to that but this proposal as written isn't something I think we should do and as the author I felt it should be closed to avoid confusing it as a well-thought out proposal - which in its current form it is not.

filipnavara commented 3 years ago

That is a UX question we would need to work through but I am not sure that is needed based on a previous conversation - which I may be misremembering. The gist was regardless of where the fixed args or varargs exist the way they are passed is the same for a native varargs signature - I could be misremembering the conversation but that is what I recall.

On x64 you don't need the information. On arm64 you do, varargs always go to stack, up to 8 fixed args go to registers. You pointed out in the other issue (#51156) that it replaces this proposal but it left out this fundamental issue which was the driving impetus for this proposal and the user scenarios in #48752.

The UnmanagedCalleeAttribute does not address this problem in any way as far as I can see it and thus it felt flat to just close the proposal with no alternative. There was just a brief mention that it would address varargs too but it didn't actually address the user scenario or the problem that is address with the attribute in this API proposal.

There doesn't seem to be any mechanism in UnmanagedCalleeAttribute to pass the varargs start position and it's not obvious how would it fit in.

There is also the more fundamental UX question of do we want to support arbitrary native vararg signatures in C# or do we require people to fully specify each signature for each use case.

True. That's also why I think all the discussion here is really valuable because it focused on various reasons why one or the other approach may work in different scenarios (such as NativeAOT).

The reality is that binding libraries now have to manually craft each prototype for all possible varargs combinations if they want to work cross-platform. That doubles with additional logic if ARM64 comes into play. __arglist is currently supported only on Windows and it's problematic for AOT scenarios, as stated in the discussion above.

I don't know the answer to that but this proposal as written isn't something I think we should do and as the author I felt it should be closed to avoid confusing it as a well-thought out proposal - which in its current form it is not.

Fair. However, there was a lot of valuable discussion here in the issue. There's currently some of the user-side of the problem covered in #48752 but lot of the technical details ended up being discussed here instead. #51156 didn't address the ARM64 varargs problem in any way nor did it suggest how it could help address it.

filipnavara commented 3 years ago

Honestly, I think this should stay open with api-needs-work tag and explanation of why the approach in the original proposal is not good. Just my $0.02.

frindler commented 2 years ago

Is there any progress on supporting varargs in P/Invoke'd native libs? I just patched the padding for varargs to the 9th argument position into Libgit2sharp on OS X arm64, but there should be way to do this without this hack... (which may break again in the future)

jgcodes2020 commented 2 years ago

Just my two cents:

C#, F#, and VB.Net all have a feature where variadic arguments can be passed via a specially marked array parameter at the end. Coupled with an attribute to identify it, perhaps you could have:

[DllImport("libc.so", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
public static extern void printf(string format, 
  [MarshalAs(UnmanagedType.VariableArguments)] params object[] args)
webczat commented 2 years ago

This would probably add overhead which is often not necessary, and that's probably not the only problem with that.

W dniu 20.08.2022 o 03:38, jgcodes2020 pisze:

Just my two cents:

C#, F#, and VB.Net all have a feature where variadic arguments can be passed via a specially marked array parameter at the end. Coupled with an attribute to identify it, perhaps you could have:

[DllImport("libc.so",CallingConvention = CallingConvention.Cdecl,CharSet = CharSet.Ansi)] public static extern void printf(string format, [MarshalAs(UnmanagedType.VariableArguments)]params object[]args)

— Reply to this email directly, view it on GitHub https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fdotnet%2Fruntime%2Fissues%2F48796%23issuecomment-1221201608&data=05%7C01%7C%7C492e82455ea444aada6008da824c9f96%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637965562844949661%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=P2ymAQ2eBHRnqz96zHdOOzopOw1mlM8DU2KXLlE4Z1Y%3D&reserved=0, or unsubscribe https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAANTEOWG72KLBJFMZ6EUKCTV2AZHVANCNFSM4YHRUSTQ&data=05%7C01%7C%7C492e82455ea444aada6008da824c9f96%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637965562844949661%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=TBl7qy3WFKCTumZmhmKTvwCvIVi9AU3nGxCFaVj4XwQ%3D&reserved=0. You are receiving this because you commented.Message ID: @.***>

TomEdwardsEnscape commented 1 year ago

When I read through the history of this issue, I can see a clear case of perfect becoming the enemy of good. Attempts to support declaring variadic parameters in metadata have soaked up time and energy that could have been spent supporting variadic methods on Mac and Linux at all.

I came across this problem when wrapping Apple's objc_msgSend, which is one of those "it does everything" methods through which a huge number of unrelated calls are routed. (It's usually unwise to write such a method, but I'll give Apple a pass in this case since it's part of the Objective-C runtime and there is supposed to be an entire compiled language sat on top of it.)

Even if we could write a delegate and specify its variadic parameters, we don't want to because there are so many different ways to call objc_msgSend. Many are unique, at least within the set of methods/properties that we want to use.

__arglist is already the perfect solution for our needs. We don't care that it's slower than theoretical alternatives and not type safe. We just want to wrap this one specific call, pass it the appropriate arguments, then move on.

Here is a trimmed version of the code that I would like to write:

public class NSRunningApplication : NSObject
{
    // 1. Define a variadic native method
    [DllImport("/System/Library/Frameworks/AppKit.framework/AppKit")]
    private static extern nint objc_msgSend(nint receiver, nint selector, __arglist);

    // 2. Pass an __arglist
    public static NSRunningApplication? RunningApplicationWithProcessIdentifier(int pid) => 
        FromPointer(objc_msgSend(Class, RunningApplicationWithProcessIdentifierSelector, __arglist(pid)));
}

As we well know, this fails at runtime because __arglist is only supported on Windows.

Since we only need a limited number of Objective-C interop calls, we will proceed with the hack of padding the parameter list on Arm64. This is a very poor solution, but at least it avoids having to compile a new native library full of very specific objc_msgSend exports for my team's otherwise 100% managed codebase.

AaronRobinsonMSFT commented 1 year ago

__arglist is already the perfect solution for our needs.

It doesn't though. The __arglist concept is for native variadic arguments - let's ignore the platform specific nature of it for now. The objc_msgSend suite of APIs aren't native variadic arguments. The original signature did use ..., which confused many people, but the use of the API was not using "varargs" in most cases, but rather the signature of the target function. Thankfully Apple corrected the signature a few years ago and now removes some of this confusion as it is defined as void objc_msgSend(void). @mikeash wrote a great description of it and I recommend it - see here.

This means that even if we decided to support __arglist as "varargs" it wouldn't help with the objc_msgSend example.

There are definitely use cases for native varargs and there is interest, but the statement "I can see a clear case of perfect becoming the enemy of good" isn't an accurate description of the nature of it being lower priority. The lower priority is based on the narrow cases where it is is needed - real varargs. There are the printf suite of functions but those really have little utility in a .NET application. Yes, there are cases where people do create and want to call Objective-C or C/C++ varargs functions from .NET and we acknowledge the work around is painful, but the interest just hasn't pushed this to the place where it is prioritized higher.

TomEdwardsEnscape commented 1 year ago

I see, so __arglist implicitly inserts more than just the arguments passed to it? That is indeed not suitable for objc_msgSend, and I can also see the problems that could arise should multiple definitions of "variadic" exist on the same platform.

Solving the problem at its root would require something even lower level than __arglist, such as a similar keyword that inserted just the memory of the objects passed to it and left further decoration or padding to the delegate author. But this seems like an even bigger ask than the current proposals.

My opening statement wasn't describing the priority of the issue, I entirely understand why it's low. I was talking about how high-level parameter attributes were discussed for a long time, instead of starting with a low-level solution which unblocks the functionality and then refining/optimising it later. It was a bit frustrating to watch the years tick by.

(Aside: I'm interested to know how casting objc_msgSend works in native C/C++ on Arm64. Do you also have to insert the padding arguments that we see in C#? If not, how does the compiler know to put the arguments onto the stack instead of into registers?)

AaronRobinsonMSFT commented 1 year ago

I see, so __arglist implicitly inserts more than just the arguments passed to it?

It doesn't necessarily insert more, rather it instructs the JIT to pass the arguments in a certain way. That way conforms to the MSVC implementation of varargs and specifically how C++/CLI assumes they work. The details of how to pass floating point values via varargs is particularly treacherous on most platforms.

Solving the problem at its root would require something even lower level than __arglist, such as a similar keyword that inserted just the memory of the objects passed to it and left further decoration or padding to the delegate author.

That is definitely an option. However, the minimum needed is simply to have an indication for the JIT to know to pass the arguments using the varargs calling convention that is appropriate for the current platform - encoding all of that in the JIT is the real cost. There are some differences of opinion on precisely how one might want to do this. One approach is to provide a "placeholder" (that is, __arglist in C# or ... in C/C++). Another approach is to adorn the method with a calling convention that indicates to pass the arguments as they would be passed using varargs. The former makes the feature more natural from a language perspective but the latter is easier to express and handle in the runtime. A concrete example is helpful.

Using __arglist:

[DllImport(@"...")]
extern static void printf(string fmt, __arglist);

// Usage:
printf("%d", 5);
printf("%f", 3.14);

Use a new attribute:

[NativeVarargsAttribute(VarargBeginIndex = 1)]
[DllImport(@"...")]
extern static void printf(string fmt, int a);
[NativeVarargsAttribute(VarargBeginIndex = 1)]
[DllImport(@"...")]
extern static void printf(string fmt, double a);

// Usage:
printf("%d", 5);
printf("%f", 3.14);

I'm interested to know how casting objc_msgSend works in native C/C++ on Arm64.

That is the agreed upon ABI for varargs.

TomEdwardsEnscape commented 1 year ago

The attribute-plus-declared-parameters approach is certainly nice. But as I think objc_msgSend demonstrates, it's overly restrictive for some notable real-world use cases. A means to define the arguments at runtime, as __arglist already provides, is valuable.

Unless I misunderstand, the JIT needs to know not just which parts of the method should be variadic, but also which variadic convention should be used. To pose this as a question: does __arglist fail on Windows if the target method was compiled using GCC's libc?

Assuming yes, how about this proposal: multiple keywords, one per supported variadic convention.

extern static void printf(string fmt, argList_raw); // just pass as if normal arguments; would work for objc_msgSend
extern static void printf(string fmt, argList_msvc); // alias for the current __arglist
extern static void printf(string fmt, argList_libc); // other conventions added as deemed appropriate

MSVC support is obvious, and I include "raw" partly because it's what I need, but also because it's the simplest possible convention. Going further with libc etc. is up for debate.

If it's not practical to add multiple keywords, then the variadic convention can instead be declared in a method attribute.

Fully-typed signatures can still be achieved by wrapping the extern in a normal C# method:

[DllImport(@"...")]
extern static void printf(string fmt, argList_msvc);

static void printf(string fmt, double a) => printf(fmt, argList_msvc(a));

I'm sure this would be slower to execute than an extern decorated with NativeVarargsAttribute, but it's better to have an optimisation problem than a functionality problem. In this particular example the JIT knows exactly which types are being passed, so could in future use that information to optimise the call.

The biggest uncertainty I have is this: how much harder it is to support this approach in the runtime, compared to attributes?

That is the agreed upon ABI for varargs.

Good, so that can be hardcoded on Arm64 regardless of the variadic convention in use. One less language feature required!

TheLastRar commented 1 year ago

It also looks like the new attribute solution wouldn't handle the situation of reverse p/invoke well. the argList_* solution looks at least at a glance better suited for reverse p/invoke,

How well does each solution handle a function defined with va_list instead of ...? I guess you could just have an [DllImport(@"va_list")] like you have a [DllImport(@"...")] in the above example, or maybe a [MarshalAs(UnmanagedType.VaList)] instead of a default of [MarshalAs(UnmanagedType.VariadicArguments)] on the argList_* type.

ohroy commented 1 week ago

any update ?