dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Introduce mechanism to indicate arguments are to be marshalled as native varargs #48796

Open AaronRobinsonMSFT opened 3 years ago

AaronRobinsonMSFT commented 3 years ago

Background and Motivation

Native varargs are a complicated interop scenario to support. At present, native varargs are only supported on the Windows platform through the undocumented __arglist keyword. Supporting varargs naturally in a P/Invoke scenario would be difficult from the C# language. However, it is possible to compromise by permitting support for the call with a fully specified DllImport signature and a hint from the user.

User scenario: https://github.com/dotnet/runtime/issues/48752

Proposed API

namespace System.Runtime.InteropServices
{
+    [AttributeUsage(AttributeTargets.Method)]
+    public class NativeVarargsAttribute : Attribute
+    {
+        public NativeVarargsAttribute() { VarargBeginIndex = 0; }
+
+        /// <summary>
+        /// Zero-based index of the first variable argument.
+        /// </summary>
+        public int VarargBeginIndex;
+    }
}

Usage Examples

Consider the following native export with varargs:

void Varargs(int n, ...);

The following P/Invoke declarations would enable users to call and properly forward the arguments in a supported multi-platform manner.

[NativeVarargsAttribute(VarargBeginIndex = 1)]
[DllImport(@"NativeLibrary.dll", EntryPoint = "Varargs")]
static extern void Varargs0(int n);

[NativeVarargsAttribute(VarargBeginIndex = 1)]
[DllImport(@"NativeLibrary.dll", EntryPoint = "Varargs")]
static extern void Varargs1(int n, int a);

[NativeVarargsAttribute(VarargBeginIndex = 1)]
[DllImport(@"NativeLibrary.dll", EntryPoint = "Varargs")]
static extern void Varargs2(int n, int a, int b);

Alternative designs

Encode the information in the CallingConvention enum. This approach removes the overhead of reading an attribute, but loses the information about where the varargs start, which at present doesn't appear to be needed. This approach also impacts existing metadata tooling (for example, ILDasm, ILAsm, and ILSpy). See https://github.com/dotnet/runtime/issues/48796#issuecomment-786355000.

public enum CallingConvention
{
+    VarArg = 6
}

Current state

Without this feature, calling functions with native varargs isn't possible on non-Windows platforms. The proposed workaround is to create native shim libraries and instead P/Invoke into them. Continuing the example above, the shim library would export the following:

extern void Varargs(int n, ...);

void Varargs0(int n)
{
    Varargs(n);
}
void Varargs1(int n, int a)
{
    Varargs(n, a);
}
void Varargs2(int n, int a, int b)
{
    Varargs(n, a, b);
}
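The shim pattern can be exercised end to end with a small, self-contained sketch. The issue does not say what Varargs does with its arguments, so the `SumVarargs` export below is a hypothetical stand-in that treats `n` as the count of `int` arguments that follow and returns their sum; the `SumVarargs0`/`1`/`2` shim names are likewise illustrative.

```c
#include <stdarg.h>

/* Hypothetical variadic export (an assumption, not from the issue):
 * returns the sum of the n int arguments that follow the fixed parameter. */
int SumVarargs(int n, ...)
{
    va_list args;
    int total = 0;

    va_start(args, n);
    for (int i = 0; i < n; i++)
        total += va_arg(args, int);
    va_end(args);
    return total;
}

/* Fixed-signature shims. The C compiler emits the platform's variadic
 * calling sequence, so a caller only needs an ordinary, fully typed call. */
int SumVarargs0(int n)               { return SumVarargs(n); }
int SumVarargs1(int n, int a)        { return SumVarargs(n, a); }
int SumVarargs2(int n, int a, int b) { return SumVarargs(n, a, b); }
```

A managed caller would P/Invoke into the fixed-signature shims, for example `SumVarargs2(2, 3, 4)`, and never needs to know the variadic ABI.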

References

Support on Windows: https://github.com/dotnet/coreclr/pull/18373
JIT details: https://github.com/dotnet/runtime/issues/10478

AaronRobinsonMSFT commented 3 years ago

/cc @mangod9 @sandreenko @sdmaclea @jkoritzinsky @elinor-fung @lambdageek

sdmaclea commented 3 years ago

/cc @jkotas @janvorli

jkoritzinsky commented 3 years ago

The __arglist keyword is supported for interop with P/Invoke on Windows platforms and the JIT already knows how to understand the encoding.

I think we should just use the __arglist keyword instead of adding a new attribute. The attribute read is much more expensive.

sdmaclea commented 3 years ago

Maybe the trick is undocumented -> documented

AaronRobinsonMSFT commented 3 years ago

I think we should just use the __arglist keyword instead of adding a new attribute. The attribute read is much more expensive.

Not sure the attribute read should be a concern here to be honest. The generation of the stub is likely to be more costly than that. Also that read would be a single time.

Also, this is not about supporting varargs as it looks on Windows. This is about supporting the calling of varargs with a fully typed P/Invoke. The keyword would have nowhere to go and is not a placeholder we are likely to want.

tannergooding commented 3 years ago

Supporting varargs naturally in a P/Invoke scenario would be difficult from the C# language.

What is the difficulty here? Is it not largely just the same as it does today, which is to push the args onto the execution stack and so (other than it being not an official keyword) it would largely just be the runtime handling it correctly for other platforms?

jkotas commented 3 years ago

the attribute read

This can be fixed by combining this with adding a new member to the CallingConvention enum. We would pay attention to the advanced calling convention attributes only for the Extended calling convention.

public enum CallingConvention
{
    Extended = 6
}

Alternatively, this can be just a new CallingConvention value:

public enum CallingConvention
{
    VarArg = 6
}

__arglist keyword

The __arglist keyword is a poorly supported corner case. See e.g. https://github.com/dotnet/docs/issues/18714 .

We should also think how we would enable this with function pointers. __arglist keyword is not supported with function pointers today.

varargs as it looks on Windows

Note that varargs interop on Windows supports both forward and reverse interop, and both ... and va_list. That is what makes it very complex and expensive to just do what we do on Windows across all platforms.
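For readers less familiar with the two native shapes mentioned here, the following C sketch shows a `...` function forwarding to a `va_list` function, the same relationship `printf` has to `vprintf`. The names `format_va` and `format_args` are hypothetical and chosen only for this illustration.

```c
#include <stdarg.h>
#include <stdio.h>

/* va_list flavor: receives an already started argument list. */
int format_va(char *buf, size_t size, const char *fmt, va_list args)
{
    return vsnprintf(buf, size, fmt, args);
}

/* "..." flavor: starts the list itself, then forwards it. */
int format_args(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int written;

    va_start(args, fmt);
    written = format_va(buf, size, fmt, args);
    va_end(args);
    return written;
}
```

An interop layer that supports both shapes has to marshal a call such as `format_args(buf, sizeof buf, "%d-%s", 42, "ok")` as well as one that hands over a `va_list` directly, which is part of the cost described above.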

sdmaclea commented 3 years ago

What is the difficulty here?

ABIs vary across platforms. Some platforms can treat vararg arguments differently than normal arguments. On Apple Silicon, they are definitely treated differently.

So for interop we need to be explicit to support all platforms correctly.
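Even before any platform ABI comes into play, the C language itself treats variadic arguments differently: they undergo the default argument promotions, so a `char` or `short` arrives as `int` and a `float` arrives as `double`. A minimal sketch, using a hypothetical `sum_promoted` helper whose `kinds` string is an invention for this example:

```c
#include <stdarg.h>

/* Sums the variadic arguments described by `kinds`:
 * 'i' reads an int (what char and short are promoted to),
 * 'd' reads a double (what float is promoted to). */
double sum_promoted(const char *kinds, ...)
{
    va_list args;
    double total = 0.0;

    va_start(args, kinds);
    for (const char *k = kinds; *k != '\0'; k++) {
        if (*k == 'i')
            total += va_arg(args, int);
        else if (*k == 'd')
            total += va_arg(args, double);
    }
    va_end(args);
    return total;
}
```

Calling `sum_promoted("id", (char)3, 1.5f)` reads the `char` as an `int` and the `float` as a `double`. A fully typed P/Invoke signature would have to account for the same promotions in addition to the platform-specific rules.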

AaronRobinsonMSFT commented 3 years ago

This can be fixed by combining this with adding a new member to CallingConvention enum.

I was avoiding that since it would be encoded directly in metadata through DllImport. My previous experience was we prefer to avoid that given the impact to tooling.

tannergooding commented 3 years ago

ABIs vary across platforms. Some platforms can treat vararg arguments differently than normal arguments.

Sure, but I don't see why __arglist (or a new properly supported keyword) cannot be properly handled by the JIT to be the correct ... for a given platform. The JIT is entirely stack based and it is free to translate that as appropriate to be registers, stack, or whatever else is appropriate for the ABI. It has to do this for all arguments already.

A method in native is written as void method(...) and is then handled via va_start, va_arg, va_copy, va_end, and va_list. So one would expect that we could simply have a corresponding concept for ... in .NET (currently __arglist which translates to the ECMA 335 vararg convention) and the relevant helpers (currently ArgIterator).

So I'm not seeing what is blocking the same from working on say Unix or Apple Silicon, other than the JIT not correctly handling these on non-windows. I would expect that if we just simply implemented the ABI defined (https://gitlab.com/x86-psABIs/x86-64-ABI for System V) then both forward and reverse P/Invoke would work (and that this is necessary anyways for whatever new concept is exposed)
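As a reminder of what the native helpers listed above do, here is a short C sketch that uses `va_start`, `va_copy`, `va_arg`, and `va_end` together to walk the same argument list twice. The `range_of` function is hypothetical and assumes it is called with at least one `int` after `n`.

```c
#include <stdarg.h>

/* Returns max - min of the n ints that follow (assumes n >= 1). */
int range_of(int n, ...)
{
    va_list first, second;
    int lo, hi;

    va_start(first, n);
    va_copy(second, first);   /* independent cursor over the same arguments */

    /* First pass: find the minimum. */
    lo = va_arg(first, int);
    for (int i = 1; i < n; i++) {
        int v = va_arg(first, int);
        if (v < lo)
            lo = v;
    }
    va_end(first);

    /* Second pass over the copy: find the maximum. */
    hi = va_arg(second, int);
    for (int i = 1; i < n; i++) {
        int v = va_arg(second, int);
        if (v > hi)
            hi = v;
    }
    va_end(second);

    return hi - lo;
}
```

On the managed side, ArgIterator plays roughly the role that `va_list` plays here.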

AaronRobinsonMSFT commented 3 years ago

Sure, but I don't see why __arglist (or a new properly supported keyword) cannot be properly handled by the JIT to be the correct ... for a given platform.

I don't think there is any reason other than this proposal is for a non-language impacting update :-) We can also start a conversation with Roslyn about how one would express this in C#.

jkotas commented 3 years ago

I was avoiding that since it would be encoded directly in metadata through DllImport. My previous experience was we prefer to avoid that given the impact to tooling.

Yes, it has an impact on tooling. My actual concern is that we need a scalable pattern to add new calling conventions for DllImport. Let's say that we add 10 new calling conventions. What is the pattern we want to follow? Is it going to be 10 different attributes?

tannergooding commented 3 years ago

I don't think there is any reason other than this proposal is for a non-language impacting update

__arglist is a "reserved" keyword today, you can't even name a class with it (at least in Roslyn you get error CS0190: The __arglist construct is valid only within a variable argument method if you try). Maybe it's as simple as just spec'ing the thing that has always been there and making it an "official" keyword? It's not pretty, but it's also really only for P/Invoke scenarios, so maybe that isn't the worst thing.

AaronRobinsonMSFT commented 3 years ago

Let's say that we add 10 new calling conventions. What is the pattern we want to follow? Is it going to be 10 different attributes?

Great question. I've not spent much time considering that as it relates to DllImport. One thing I think we can all agree on is the desire to not have 10 attributes.

Maybe it's as simple as just spec'ing the thing that has always been there and making it an "official" keyword? It's not pretty, but it's also really only for P/Invoke scenarios, so maybe that isn't the worst thing.

Sure, seems like a reasonable perspective. The __ prefix does have some implementation/undocumented semantics associated with it so it would need to be a new keyword for proper support and thus language impacting.

jkotas commented 3 years ago

If we go with some variant of the __arglist keyword, we should also figure out how it is going to work with the interop source generators that are our forward-looking interop story.

tannergooding commented 3 years ago

Sure, seems like a reasonable perspective. The __ prefix does have some implementation/undocumented semantics associated with it so it would need to be a new keyword for proper support and thus language impacting.

I think that's fine. It might be that they say __arglist just becomes official and it could be that they say they now support ... or arglist contextually or something else. But I'd also expect (and hope) it's "less" work given the existence of __arglist already wired fairly end to end (but someone from LDM could confirm).

If we go with some variant of the __arglist keyword, we should also figure out how it is going to work with the interop source generators that are our forward-looking interop story.

What consideration is needed here? Is this exposing a managed helper signature because exposing a managed method which takes ... is undesirable?

I had already experimented with having ClangSharp recognize variadic functions and generate __arglist but never checked it in because it was Windows only and not officially supported. The pattern seems fairly straightforward...

AaronRobinsonMSFT commented 3 years ago

What consideration is needed here? Is this exposing a managed helper signature because exposing a managed method which takes ... is undesirable?

Yes basically. Consider how our prototype source gen works:

[GeneratedDllImport(NativeExportsNE_Binary, EntryPoint = "sumi")]
public static partial int Sum(int a, <... or __arglist>);

The above varargs would need to be marshalled in some manner. Which means the variable argument list would need to be inspected in an efficient way at run time to perform that marshalling and forward those arguments.

tannergooding commented 3 years ago

Could the source generator not just do the concrete overloads? Say for example you had

[GeneratedDllImport("msvcrt", EntryPoint = "printf")]
public static int Print(string format, __arglist);

For this, the generator would create:

[DllImport("msvcrt")]
private static extern int printf(sbyte* format, __arglist);

And for each unique invocation it found, it would generate a helper override that is a more exact match. For example, if you did Print("%s%s%s", "Hello", ", ", "World!"); and Print("%g", 100.0f);

The following two helpers would be generated:

public static partial int Print(string format, string arg0, string arg1, string arg2)
{
    // Pin a UTF-8 copy of the format string and of each string argument.
    fixed (byte* pFormat = Encoding.UTF8.GetBytes(format))
    fixed (byte* pArg0 = Encoding.UTF8.GetBytes(arg0))
    fixed (byte* pArg1 = Encoding.UTF8.GetBytes(arg1))
    fixed (byte* pArg2 = Encoding.UTF8.GetBytes(arg2))
    {
        return printf(pFormat, pArg0, pArg1, pArg2);
    }
}

public static partial int Print(string format, float arg0)
{
    // Only the format string needs marshalling; the float is passed through.
    fixed (byte* pFormat = Encoding.UTF8.GetBytes(format))
    {
        return printf(pFormat, arg0);
    }
}

The only "loose" end would be that nothing was generated for the original public static int Print(string format, __arglist) and a decision on what to do here would need to be handled.

AaronRobinsonMSFT commented 3 years ago

And for each unique invocation it found, it would generate a helper override that is a more exact match.

Yep. Until a library decided to expose the P/Invoke directly at which point it may not observe any concrete calls.

tannergooding commented 3 years ago

Yep. Until a library decided to expose the P/Invoke directly at which point it may not observe any concrete calls.

Isn't that the same world we already have today if someone does the following on Windows?

[DllImport("msvcrt")]
public static extern int printf(sbyte* format, __arglist);

Can we not maintain the same support or can we not just block non-blittable parameters here (like we do for generics and a few other scenarios)?

AaronRobinsonMSFT commented 3 years ago

Isn't that the same world we already have today if someone does the following on Windows?

I don't think so. I believe a library can export that signature as is today without issue. But your proposal for the source generator approach wouldn't work for that case because it won't see the callsite when the application calls the libraries export.

Can we not maintain the same support or can we not just block non-blittable parameters here (like we do for generics and a few other scenarios)?

I guess we could impose that, but my perspective is it isn't worth it. The proposed approach makes calling a vararg function possible and will naturally work with source generators in all scenarios since we tell the JIT how to pass the arguments. Supporting the gamut of varargs doesn't seem to be worth the cost at this point. It would impose a large burden on the source generator because the convention would need to be forwarded properly and the language updated. I'm simply not seeing the value in the cost to make it fully supported.

AaronRobinsonMSFT commented 3 years ago

I'm simply not seeing the value in the cost to make it fully supported.

Actually, I think it was also pointed out in https://github.com/dotnet/runtime/issues/48796#issuecomment-786360845 that a lot of the JIT work will be the same - I agree with that. We can view this proposal with its requirement of a precise signature to be a down payment on enabling full support. Since if this proposal was accepted all we would need to do is address two additional issues:

1) Make it an official scenario in C#.
2) Ensure we have the facilities to make it work with source generators.

I think this proposal simply starts the journey towards full support in an MVP manner.

tannergooding commented 3 years ago

So you're basically thinking something like the following for the minimum viable product and then "full" varargs with language support might come later and would also enable the same on function pointers?

[NativeVarargsAttribute(VarargBeginIndex = 1)]
[DllImport(@"NativeLibrary.dll", EntryPoint = "Varargs")]
static extern void Varargs1(int n, int a);

That doesn't sound terrible, but it seems to me like you still have the problem that the user needs to know all overloads they require up front and that in the future you have two technologies to choose from. And at first thought, it seems like something that could equally be handled by [GeneratedVarargs(new Type[] { typeof(arg0), typeof(arg1) })] on the method that has GeneratedDllImport for the __arglist case.

The only unique issues to ... (at least that I'm seeing) is that:

It would seem like the MVP would be to just block the first scenario and ask C# on the second (and fallback to something different if C# can't commit). Requiring users to pass blittable parameters for ... and to use one of the source generator attributes otherwise seems reasonable for an MVP. If that scenario is very important, it could be unblocked in the future. Then it is functionally no different than the original proposal, including in generating the fixed signatures up front based on the overloads the user indicates they desire, just using the same thing that eventually becomes "full support". However, it does enable automatic generation of those fixed signatures in the case the export is not public and also enables power users to explicitly pass in blittable parameters if that is preferred instead.

sdmaclea commented 3 years ago

@AaronRobinsonMSFT Instead of adding NativeVarargsAttribute, could we add a new optional parameter to DllImport that does the same thing? It could default to a non-varargs value.

jkotas commented 3 years ago

DllImport is a pseudo-custom attribute. It has special non-extensible encoding in metadata (extending the encoding would be file format breaking change.)

I have been thinking about what would be a design that works for both DllImport and function pointers, without language changes. The simplest design that I was able to come up with is a marker type for start of vararg arguments:

public struct VarArgSentinel
{
}

Example of use:

[DllImport("msvcrt")]
public static extern int printf(sbyte* format, VarArgSentinel dummy, int arg1, int arg2);

tannergooding commented 3 years ago

I had a bit of a chat with Aaron in Teams and he suggested I explicitly reiterate that my concern here is that we already have varargs today specified by ECMA-335 for both managed and native (and function pointers) and so I don't see the driving need to expose something additional that is basically that, but in many ways covers less (and would likely eventually be added to the spec if we ever rev it). I'd have less concern if this attribute or the VarArgSentinel just proposed was recognized by C# and just treated as the vararg bit (just like __arglist is today): https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBLANgHwAEAGAAkIEYA6AJQFcA7DbAWxioEkmYoIAHAMo8AbtjAwAzgG4AsAChCAZnIAmUgGFSAb3mk95ZYRSkAsgAoAlNoC+u/Xb1Lyx8wH1XAQygBzXNgkYVlq2cvqkDqQA2gAiuLgcLHzQGGYARH7AqRYAuhFOlEikMAgYPAzOpABylrKh9nV6MXEJSVAp6diZOXmGFIXFpVDlRlVm7l6+/oG11kA

If __arglist wasn't an implementation defined keyword, I don't think there would even be a discussion here and so if asking Jared/Mads if making __arglist official or adding ... or a new contextual keyword is feasible (even if only for P/Invokes), that seems like goodness to me.

AaronRobinsonMSFT commented 3 years ago

@AaronRobinsonMSFT Instead of adding NativeVarargsAttribute, could we add a new optional parameter to DllImport that does the same thing? It could default to a non-varargs value.

@sdmaclea Yeah.. I would love that. It is a nice idea, but as @jkotas mentioned DllImport isn't what anyone thinks it is.

jkotas commented 3 years ago

If __arglist wasn't an implementation defined keyword, I don't think there would even be a discussion here

I agree that we would not have this discussion if varargs were a first class citizen in .NET.

Managed varargs are similar to remoting. Both are mentioned in ECMA, both have been neglected features with many gaps since .NET Framework 1.0, expensive to port, and omitted from .NET Core originally. The only difference is that we were forced to bring varargs in, in a limited fashion, on Windows for managed C++ compatibility.

I think we should either agree that it is valuable to make varargs a first-class citizen in modern .NET, or do the simplest, cheapest solution that makes it possible to call varargs methods in a platform-neutral way. Going halfway does not seem useful.

tannergooding commented 3 years ago

My view here is that varargs should get proper support in .NET, even if in an iterative fashion with the MVP for .NET 6 extending what ECMA-335 already supports and specifies.

My reasoning is that varargs is a part of the ABI, the C language, and the C standard library and is therefore on all platforms .NET will ever run on. It is a fundamental part of the ecosystem that is not going anywhere and will, without a doubt, be supported in any new platforms that exist in the future. It is not restricted to legacy code and is not restricted to only scenarios such as printf where the need for interop is "low". Additionally, searching for user asks about varargs shows continued interest going back to when we first made .NET cross platform.

Many major languages have support for varargs (most often via the ... syntax), particularly when it comes to their mechanism for interop with C. C#/.NET is one of the few where the only support is via their own thing (params T[]). Languages with varargs support include:

In some cases, varargs is a fundamental requirement for interop with languages other than C. For example, many of the core APIs exposed for ObjC (such as objc_msgsend) require varargs to be able to interact with it. Given that the other languages also have varargs support, there are also potential scenarios where it is required to interact with parts of them as well.

To me, this all points to a world where we should be taking the existing varargs support and making it first class starting with a minimum viable product that may involve only allowing it in P/Invoke scenarios.

tannergooding commented 3 years ago

In addition to the above, there are clear benefits to proper varargs support; such as reduced metadata bloat and decreased surface area for AOT scenarios.

varargs is like generics in that you can have a single metadata signature to support many code patterns. It is unlike generics in that it does not require generic explosion at runtime or during AOT. However, not properly supporting varargs means there is an explosion to specify every concrete combination that needs to be supported which increases metadata cost and inevitably the surface area for crossgen, AOT, and even just raw IL.

There will likely be some of this "explosion" in either scenario due to the need for an object oriented wrapper type (such as for ObjC types), but there is still the difference of a single objc_msgsend P/Invoke vs many separate P/Invokes that these wrappers ultimately call.

jkotas commented 3 years ago

Languages with varargs support include:

A number of the languages in your list do not have first-class interop support, and they represent varargs internally as an array or array-like type that is more similar to the C# params keyword.

For example, Java https://docs.oracle.com/javase/8/docs/technotes/guides/language/varargs.html : multiple arguments must be passed in an array, but the varargs feature automates and hides the process.

many of the core APIs exposed for ObjC (such as objc_msgsend) require varargs to be able to interact with it.

varargs interop does not help with objc_msgsend in general. objc_msgsend uses regular calling convention and you have to manually specify each shape. It would only help if the target method has literal vararg in the signature. Is there such method in Apple OS APIs?

decreased surface area for AOT scenarios.

varargs interop is problematic for AOT. ngen/crossgen never fully supported precompiling varargs interop. As you have said, it has a similar problem to generics, where one metadata item expands to many pieces of code. A whole program analysis is required to find the full set. Fully supporting AOT for varargs interop requires building a system similar to the one used to precompile generics.

filipnavara commented 3 years ago

...but there is still the difference of a single objc_msgsend P/Invoke vs many separate P/Invokes that these wrappers ultimately call.

I'd like to reiterate what I stated in the Apple Silicon issue. The problem is not necessarily the "explosion" caused by having one variation per parameter types. That had been the status quo for Xamarin ObjC interop, or the SQLite wrappers for ages. All the possible prototypes were either manually written (SQLite) or automatically generated (Xamarin). The issue is that with introduction of Apple Silicon the approach no longer works since the current workaround doubles the number of prototypes in this "explosion" if you have to account for the different ABIs (eg. x64 passes through registers with reserved stack space and overflow to stack for more parameters; M1 passes all VA on stack, so you have to round to 8 "dummy" parameters that occupy the registers and then let the rest overflow to stack to simulate the VA layout).

FWIW I'd be perfectly happy with having __arglist back, even if MVP means supporting it only for P/Invoke scenarios.

filipnavara commented 3 years ago

varargs interop does not help with objc_msgsend in general. objc_msgsend uses regular calling convention and you have to manually specify each shape. It would only help if the target method has literal vararg in the signature. Is there such method in Apple OS APIs?

The current approach to objc_msgSend is driven by the lack of varargs interop. The native method uses varargs (https://developer.apple.com/documentation/objectivec/1456712-objc_msgsend).

jkotas commented 3 years ago

The current approach to objc_msgSend is driven by the lack of varargs interop. The native method uses varargs

The native objc_msgSend takes variable number of arguments, but it is not the same mechanism as C vararg. If you define objc_msgSend as C method with ..., it won't work.

I went through the list of languages in @tannergooding list. I did not find any languages except C/C++ where C-like varargs are first class citizen (happy to be corrected if I missed anything). Varargs are either shortcut for array or array-like argument passing and/or one-off for interop.

filipnavara commented 3 years ago

The native objc_msgSend takes variable number of arguments, but it is not the same mechanism as C vararg. If you define objc_msgSend as C method with ..., it won't work.

You could be right on this one, but the C headers do use ... in there:

https://github.com/phracker/MacOSX-SDKs/blob/ef9fe35d5691b6dd383c8c46d867a499817a01b6/MacOSX10.14.sdk/usr/include/objc/message.h#L84-L86

Update: I am an idiot here reading the #if !OBJC_OLD_DISPATCH_PROTOTYPES the other way around. The variadic prototypes would work on x64 for some cases due to how the ABI is implemented but it doesn't quite work with the ARM64 ABI or more complex calls (eg. floating point parameters).
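With the non-variadic prototypes, the usual native pattern is to cast the type-erased entry point to a function pointer type that matches the actual target before calling it. The sketch below illustrates that pattern in portable C only; `generic_fn`, `lookup_scale`, `scale`, and `call_scale` are hypothetical names, and nothing here touches the Objective-C runtime.

```c
/* Type-erased entry point, similar in spirit to a dispatch function
 * that is declared as taking (void). */
typedef void (*generic_fn)(void);

/* The real target, with a concrete signature that mixes int and double. */
static double scale(int count, double factor)
{
    return count * factor;
}

/* Hypothetical lookup that returns the target with its type erased. */
generic_fn lookup_scale(void)
{
    return (generic_fn)scale;
}

double call_scale(int count, double factor)
{
    /* Cast back to the exact signature so the compiler passes each
     * argument the way a normal, non-variadic call would. */
    double (*typed)(int, double) = (double (*)(int, double))lookup_scale();
    return typed(count, factor);
}
```

Because the call goes through a fully typed pointer, floating point arguments are passed as in any ordinary call, which is the case the variadic prototype handles poorly on some ABIs.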

tannergooding commented 3 years ago

Varargs are either shortcut for array or array-like argument passing and/or one-off for interop.

Yes, on the non-interop side these are frequently represented in a slightly better construct (such as params T[] in C#) but in almost all cases they use ... and are largely equivalent to proper varargs from the author's perspective.

For example, Java https://docs.oracle.com/javase/8/docs/technotes/guides/language/varargs.html : multiple arguments must be passed in an array, but the varargs feature automates and hides the process.

However, JNA/JNI have explicit support for handling this as varargs in terms of P/Invoke and from the perspective of pure Java code it is varargs, not many separate concrete definitions.

The native method uses varargs (https://developer.apple.com/documentation/objectivec/1456712-objc_msgsend).

For reference, include\objc\message.h has:

#if !OBJC_OLD_DISPATCH_PROTOTYPES
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wincompatible-library-redeclaration"
OBJC_EXPORT void
objc_msgSend(void /* id self, SEL op, ... */ )
    OBJC_AVAILABLE(10.0, 2.0, 9.0, 1.0, 2.0);

OBJC_EXPORT void
objc_msgSendSuper(void /* struct objc_super *super, SEL op, ... */ )
    OBJC_AVAILABLE(10.0, 2.0, 9.0, 1.0, 2.0);
#pragma clang diagnostic pop
#else
OBJC_EXPORT id _Nullable
objc_msgSend(id _Nullable self, SEL _Nonnull op, ...)
    OBJC_AVAILABLE(10.0, 2.0, 9.0, 1.0, 2.0);

OBJC_EXPORT id _Nullable
objc_msgSendSuper(struct objc_super * _Nonnull super, SEL _Nonnull op, ...)
    OBJC_AVAILABLE(10.0, 2.0, 9.0, 1.0, 2.0);
#endif

Likewise, objc-api.h has:

/* OBJC_OLD_DISPATCH_PROTOTYPES == 0 enforces the rule that the dispatch 
 * functions must be cast to an appropriate function pointer type. */
#if !defined(OBJC_OLD_DISPATCH_PROTOTYPES)
#   if __swift__
        // Existing Swift code expects IMP to be Comparable.
        // Variadic IMP is comparable via OpaquePointer; non-variadic IMP isn't.
#       define OBJC_OLD_DISPATCH_PROTOTYPES 1
#   else
#       define OBJC_OLD_DISPATCH_PROTOTYPES 0
#   endif
#endif

tannergooding commented 3 years ago

I don't think that giving varargs "first class" support means that it will become the new standard for code or even recommended for use outside of P/Invoke scenarios. I think its primary use will be P/Invoke (via DllImport and function pointers), that there may eventually need to be Reverse P/Invoke support (or at least UnmanagedCallersOnly support), and that there may be some performance oriented scenarios for power users or internal methods that allow allocations to be avoided.

I also believe that the benefits of having a single metadata definition is clear and that it provides a more correct, maintainable, and functional mapping to the underlying code than concrete signatures.

In the majority of the languages I listed, this is done via the C varargs ABI because that is what is used by C and C is the common interface between most languages. This is true for imports (calling an exported C varargs method from that language) and often for exports as well (calling an exported method for that language from C).

Python is another example where some of the core runtime functions used for interop between it and C use varargs: https://docs.python.org/3/extending/extending.html#calling-python-functions-from-c, https://docs.python.org/3/c-api/arg.html#c.PyArg_Parse, etc

jkotas commented 3 years ago

I think its primary use will be P/Invoke (via DllImport and function pointers),

Yep, I agree that it is best to think about this as one-off for interop and not as a first class construct used everywhere.

there may eventually need to be Reverse P/Invoke support

Reverse P/Invoke support for varargs never existed (you cannot define vararg delegate in C#). I do not remember it ever showing up on the radar as a feature request.

a single metadata definition is clear and that it provides a more correct, maintainable, and functional mapping to the underlying code than concrete signatures.

I see a single metadata definition for C varargs interop as pain to deal with. It has to be special-cased everywhere.

webczat commented 3 years ago

What about the experiments with dllexport? One could need to make a C-compatible varargs method that is dllexported. Outside of that, callbacks rarely have varargs; at most they have some void* userdata pointer or a similar mechanism, at least in those I've seen. This discussion also makes me wonder whether some way to pass va_list would be considered, even if .NET's varargs support from ECMA-335 is not reused or properly implemented.

tannergooding commented 3 years ago

I see a single metadata definition for C varargs interop as pain to deal with. It has to be special-cased everywhere.

I have the opposite opinion. I see it as something that eases the maintenance burden on P/Invoke generators, reduces metadata bloat, is well understood by the developers who will be using the feature, is already used by C++/CLI, and likely integrates well with what we already have today (minus the missing "official" language keyword for C# where seeing if __arglist can just become the official keyword seems like a good compromise between "existing" and "future, primarily limited to P/Invoke and power user scenarios").

From the perspective of a P/Invoke generator, it can process a header exactly 1-to-1. New initiatives like win32metadata (which will never know the concrete signatures required) can simply define the export once and have it understood by downstream tooling. Source generators like CsWin32 (which may know the concrete signatures required and which uses win32metadata) have a single method to interact with and can define concrete wrappers over that which do the pinning, marshaling, and calling of the single P/Invoke. Other generators like ClangSharp (which is used by win32metadata) can provide a single reusable export for power users or user-defined wrappers where appropriate.

The work to support varargs at the ABI level should largely be the same for both approaches. The existing varargs support for Windows has to stay for back-compat. The logic for checking if parameters are blittable or not already exists in the marshaller and is already used to block certain P/Invoke scenarios. The logic for handling the existing varargs bit already exists, but is simply blocked on Unix and would be unblocked once the ABI support is added. The language is already wired up to emit the varargs bit for scenarios like P/Invoke; it just doesn't have an official keyword. The framework already has a type (System.ArgIterator) for getting and interacting with varargs parameters.

lambdageek commented 3 years ago

   [DllImport("msvcrt")]
   public static extern int printf(sbyte* format, VarArgSentinel dummy, int arg1, int arg2);

@jkotas why does it need to be a real argument, not an attribute on an existing arg?

[DllImport("msvcrt")]
public static extern int printf(sbyte* format, [VarArgStart] int arg1, int arg2);

(Also, doesn't using a dummy parameter mean callers have to pass a dummy value for the argument? Otherwise I don't see how this could be done without a language change.)

jkotas commented 3 years ago

@jkotas why does it need to be a real argument, not an attribute on an existing arg?

You cannot have attributes on arguments in function pointers. (As I have said, my goal with this one was to make vararg interop possible for both DllImport and function pointers, without language changes.)

doesn't using a dummy parameter mean callers have to pass a dummy value for the argument?

Yes, that's correct. Callers would have to pass a dummy parameter.

rolfbjarne commented 3 years ago

@jkotas

It would only help if the target method has literal vararg in the signature. Is there such method in Apple OS APIs?

Yes, here's an example (they aren't very common, but they do show up in a few APIs, some quite important like this one): https://developer.apple.com/documentation/foundation/nsstring/1497301-localizedstringwithformat

and here's how we've had to bind it due to the lack of varargs:

https://github.com/xamarin/xamarin-macios/blob/effe7dc49986bdafeb3e8c72c8e907908095c7b9/src/Foundation/NSString.cs#L254-L280

jkotas commented 3 years ago

https://github.com/xamarin/xamarin-macios/blob/effe7dc49986bdafeb3e8c72c8e907908095c7b9/src/Foundation/NSString.cs#L254-L280

Does this work on Apple ARM64 OSes? If it does, the method is not actually using the vararg calling convention. It has a variable number of arguments, but the arguments are not passed using the vararg calling convention, so adding support for the vararg calling convention would not help.

rolfbjarne commented 3 years ago

https://github.com/xamarin/xamarin-macios/blob/effe7dc49986bdafeb3e8c72c8e907908095c7b9/src/Foundation/NSString.cs#L254-L280

Does this work on Apple ARM64 OSes? If it does, the method is not actually using the vararg calling convention. It has a variable number of arguments, but the arguments are not passed using the vararg calling convention, so adding support for the vararg calling convention would not help.

The managed code calls our own native functions (without varargs), which call Apple's API (with varargs): https://github.com/xamarin/xamarin-macios/blob/effe7dc49986bdafeb3e8c72c8e907908095c7b9/runtime/nsstring-localization.m#L22-L82

Here's an example of where we call Apple's API with varargs directly from managed code: https://github.com/xamarin/xamarin-macios/blob/effe7dc49986bdafeb3e8c72c8e907908095c7b9/src/UIKit/UIAppearance.cs#L134-L163
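
To illustrate the first approach (managed code calling non-variadic native functions that in turn call a variadic API), here is a minimal C sketch of what such shims can look like. It uses the standard `snprintf` as the variadic target purely as a stand-in; the shim names `format_int1` and `format_int2` are hypothetical and are not taken from xamarin-macios.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical fixed-arity shims over the variadic snprintf.
   Each shim has an ordinary, fully specified signature, so a managed
   caller can bind to it without knowing anything about varargs. The C
   compiler generates the correct variadic call for the target platform. */
int format_int1(char *buffer, size_t size, const char *format, int a)
{
    return snprintf(buffer, size, format, a);
}

int format_int2(char *buffer, size_t size, const char *format, int a, int b)
{
    return snprintf(buffer, size, format, a, b);
}
```

The cost of this approach is one native shim per argument shape, which is the maintenance burden that runtime-level varargs support would be expected to remove.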

vargaz commented 3 years ago

Not sure if anyone mentioned it above, but the .NET spec has a 'Sentinel' bit in method signatures used to mark where the ... arguments begin, so the C# compiler could emit this.

filipnavara commented 3 years ago

For reference, the bit @vargaz talks about is described in section I.8.6.1.5 of the ECMA-335 specification:

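Method signatures are declared by method definitions. Only one constraint can be added to a method signature in addition to those of parameter signatures:

- The vararg constraint can be included to indicate that all arguments past this point are optional. When it appears, the calling convention shall be one that supports variable argument lists.

Method signatures are used in two different ways: as part of a method definition and as a description of a calling site when calling through a function pointer. In the latter case, the method signature indicates

- the calling convention (which can include platform-specific calling conventions),
- the types of all the argument values that are being passed, and
- if needed, a vararg marker indicating where the fixed parameter list ends and the variable parameter list begins.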

When used as part of a method definition, the vararg constraint is represented by the choice of calling convention.

AaronRobinsonMSFT commented 3 years ago

With the introduction of https://github.com/dotnet/runtime/issues/51156 this proposal becomes less interesting. It should be replaced with a CallConv* type instead.

filipnavara commented 3 years ago

How does the proposal in #51156 address the scenario? I fail to see it and there is no example in the linked issue.

AaronRobinsonMSFT commented 3 years ago

@filipnavara This proposal is about an attribute for native varargs. That approach isn't a good idea because we can instead use the UnmanagedCalleeAttribute, which takes a type that indicates multiple calling convention modifiers. The concern is best described in https://github.com/dotnet/runtime/issues/48796#issuecomment-786362375. As mentioned in my comment, the correct replacement for this issue is to follow it up with a CallConv* type or perhaps enable the __arglist keyword; adding a new attribute is not the correct approach. Feel free to create a new issue if a tracking issue for that is needed.