dotnet / csharplang

The official repo for the design of the C# programming language
11.5k stars 1.02k forks source link

[Proposal]: Delegate Type Arguments Improvements #5321

Open RikkiGibson opened 3 years ago

RikkiGibson commented 3 years ago

Delegate Type Arguments Improvements

Summary

Allow a wider range of type arguments to delegates, such as the following:

void M(Action<ref int> increment) { }
void M(Func<string, out object, bool> tryParseFunc) { }
void M(Func<ref readonly int> getReadonlyInt) { }
void M(Func<int*> getPointer) { }
void M(Func<delegate*<int, void>> getFuncPointer) { }
void M(Action<TypedReference> useTypedReference) { }

delegate T MyDelegate<T>(T t);
void M(MyDelegate<ref string> myDelegate) { }

Motivation

A long-standing guideline in .NET is to use System.Action and System.Func instead of custom delegates when possible. This makes it easier for users to tell at a glance what the delegate's signature is, as well as improving convertibility between delegates created by one component and handed out to another component.

However, there have always been "holes" in System.Action/Func. One of the big ones is that only by-value parameters and returns are supported. If you want to make a delegate from a method with a 'ref' or 'out' parameter, for example, you need to define a custom delegate type for it. The same issue exists for "restricted" types such as ref structs, pointers, function pointers, TypedReference, etc.

// Can't use `Func<string, out object, bool>` here, for example.
delegate bool TryParseFunc(string input, out object value);
bool TryParse(string input, out object value) { /* ... */ }

M(TryParse);
void M(TryParseFunc func)
{
    // ...
}

It would be easier to work with delegates generally if the standard Action/Func supported a wider range of parameter and return types. We could also make it so the compiler does not synthesize delegates for lambda implicit types in many more scenarios.

It would also help lay the groundwork for further features to make using delegate types easier, such as the ability to specify delegate parameter names at the point where the delegate type is used, a la tuple field names. See related discussion.

// the following is not part of this proposal, but this proposal contributes to it:
Func<int, int, bool> compare; // old way
bool (int left, int right) compare; // new way

// we would also want to support:
Func<string, out object, bool> tryParse; // old way
bool (string input, out object parsed) tryParse; // new way

// this has a strong correspondence to the "signature" of a lambda expression with explicit return and parameter types:
var compare = bool (int left, int right) => left == right;

Detailed design

'ref' type arguments

'ref', 'in' and 'out' are permitted as "modifiers" to type arguments to delegates. For example, Action<ref int>. This is encoded using SignatureTypeCode.ByReference, and by using the InAttribute and OutAttribute as custom modifiers on the type arguments, like how such parameters are encoded on function pointers. This explicitly allows for overloading on 'ref' vs non-'ref' just as in conventional signatures.

The compiler and .NET 7 runtime both need to be modified to support this. See the prototype section for more details.

void Method(Action<int> x) { }
void Method(Action<ref int> x) { } // overloads work

The compiler will ensure that the type arguments in use are valid by checking the signature of the delegate after generic substitution. For example, we would give a compile error in the following scenario:

delegate void UseList<T>(List<T> list);
void M(UseList<ref int> useListFunc); // error: 'ref int' cannot be used for type parameter 'T' in 'UseList<T>' because 'ref int' is not a permitted type argument to 'List<T>'

It's expected that some amount of composition will still be possible:

void M(Func<Action<ref int>> makeAction) { } // ok

Other restricted type arguments

Pointers, function pointers, ref structs, and other restricted types would be generally permitted as type arguments to delegates, without the need to introduce any constraints to the delegate type parameters. The checks will occur in a similar fashion as for 'ref' type arguments.

Constraints

It seems like you could also "delay" constraint checks on usage of type parameters in delegate signatures and simply check the constraints when the delegate is used. It doesn't seem that useful, though.

interface MyInterface<T> where T : class { }
delegate void UseInterface<T>(MyInterface<T> interface); // no error?

void M1(UseInterface<string> i1) { } // ok
void M1(UseInterface<int> i1) { } // error?

It feels like this wouldn't actually present a usability benefit over requiring the declaration to specify the constraints. Let's not change the behavior here. If a delegate signature uses a type parameter, the constraints should still be checked at the declaration site of the delegate.

Why only delegates?

It's reasonable to wonder why we would only want to change the rules for delegates here, instead of allowing more interface-only types to participate, such as interfaces which don't use the type parameters in DIMs, or abstract classes which don't use the type parameters in non-abstract members. We might even wonder about whether this would be useful on some members which do have implementations.

Some of the issues with this include:

  1. A "just try constructing it and see if it works" approach is much riskier. It's expected to be able to add members to classes without breaking consumers, or to add DIMs to interfaces without breaking consumers. But if these members introduced a new usage of type parameters which causes consumers that passed ref type arguments to start getting errors, that's a problem. Also, the increased degree of indirection induced by a "check the construction" approach could make it more difficult to report meaningful errors on misuse. This means the feature would need to be driven by new type parameter constraints.
  2. Several new constraints would need to be introduced. ref would probably need to be a separate constraint from in which would be separate from out. Pointers (where T : pointer?) have different rules for their use than ref structs (where T : ref struct) which have different rules for their use than restricted types. out in particular is something that's really difficult to imagine generalizing in a useful way for members which have implementations, or even completely abstract types. (is there any interface you can imagine in .NET today which would have a sensible usage if it were given an out type argument?)
  3. It's not clear what it would mean to use a type parameter with 'ref' or similar constraints in an implementation. It feels like the basic language rules which are required to make full and sensible use of a ref start to crumble.
void M<T>(T input) where T : ref
{
    // What does this mean?
    // If T is a ref type are we actually doing a ref assignment here?
    // In non-generic contexts the language requires e.g. `other = ref input;` for this.
    T other = input;
}

The takeaway from all this is that it is still worthwhile to continue exploring a where T : ref struct constraint which could be used in any type kind, but the general "relaxation" of restricted type arguments described in this proposal probably only really makes sense for delegates.

Prototype

The compiler and runtime teams were able to create a prototype which handles some simple scenarios with 'ref' and 'out' type arguments to 'System.Action'. See the compiler and runtime prototype branches.

Using a combination of compiler and runtime changes, a small end-to-end scenario like the following was found to work as expected (writing "01" to the console):

using System;

C.M1(C.M0);
class C
{
    public static void M0(ref int x)
    {
        x++;
    }
    public static void M1(Action<ref int> action)
    {
        int i = 0;
        Console.Write(i);
        action(ref i);
        Console.Write(i);
    }
}

We were also pleasantly surprised by the fact that IL reading tools like ildasm and ILSpy handled the new encoding relatively gracefully:

ilspy-ref-type-arg

We don't anticipate any significant roadblocks with allowing other restricted types as type arguments to delegates.

Thank you to @cston from the compiler team and @davidwrighton from the runtime team for their help in making the end-to-end prototype possible.

Drawbacks

Tools which read metadata will probably need to be updated to support this. The results from ildasm and ILSpy indicate that this might not be a great deal of work. We were able to get both tools to show Action<ref int> or equivalent, but ILSpy showed Action<out int> as Action<ref int>, for example.

We also need to figure out how C++/CLI handle this: for example, will the C++/CLI compiler crash when loading an assembly that contains delegate types like this? If we move forward with this proposal we will need to give heads up so the tool can get updated in a timely manner.

Alternatives

'ref' constraint

It's possible this feature could be implementing by introducing a set of new "constraints" rather than by applying a "check the construction" approach. This is discussed in the Why only delegates section.

Do nothing

We don't gain the benefits outlined in the motivation section.

Unresolved questions

'in' vs 'ref readonly'

The language has a bit of a wrinkle where 'in' is used for readonly by-reference parameters, while 'ref readonly' is used for readonly by-reference returns. This raises a bit of a question on how a Func which both takes and returns readonly references should work, for example.

Func<in int, ref readonly int> func; // ok..?

The question is essentially: which forms would be required when the type parameter is used for a parameter or return respectively? Do we care? Do we pick one form and disallow the other? Do we allow both interchangeably and silently use the "conventional" appearance in the symbol model, etc.

The type parameter could hypothetically be used for both parameter and return, so perhaps it is best to remain flexible and allow either in or ref readonly regardless of how the type parameter is used, or to pick one of in or ref readonly and disallow the other form in type arguments.

delegate T MyDelegate<T>(T input);
void M(MyDelegate<ref readonly int x> myDelegate);

Design meetings

https://github.com/dotnet/csharplang/blob/main/meetings/2021/LDM-2021-10-25.md#delegate-type-argument-improvements https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-02-16.md#delegate-type-arguments-improvements

agocke commented 3 years ago

I wonder if we could simplify at all by building on ref fields in structs. Presumably we could then define

ref struct ByRef<T>
{
    ref T t;
    public ByRef(ref T t) { ... }
}

and the compiler could translate ref parameters into ByRef<T>.

Then we would just need to make ref structs work through generics.

davidwrighton commented 3 years ago

So, I think the ByRef<T> concept works as well, although it does come with some problems. It also opens up the possibility of an Out<T>, although that may be not be clean in the delegate signature compatibility rules. I think I would still prefer to see a CustomModifier used in that case if possible.

  1. Adding a ByRef<T> type isn't actually all that cheap at process load time. We may load a lot of these, if the feature becomes popular.
  2. Reflection Invoke with parameters of this type gets messy. We already have handling for byref parameters, but this would require us to duplicate it for the new ByRef<T> concept.
  3. Similarly the behavior of calling this becomes wonky. Calls would require the caller to wrap all ref parameters in this structure with a newobj. Do-able but ugly.
  4. Type Equivalence will need to be enhanced to make this work. ByRef<T> would need to support equivalence like a delegate does, and not like a normal generic structure. With ref int as a generic parameter all of that would just fall out.
  5. This would changes the delegate compatibility rules. I believe the addition would be to permit ByRef<T> and ref T to be compatible with no other additions, but I'm not quite sure. If we also needed an Out<T>, Out<T> would need to be compatible with ByRef<T>, and compatible with ref T.
  6. The JIT would need to be certain that its ABI for ref T and ByRef<T> are identical. I believe that will happen naturally on most architectures but I'm not 100% certain that will be the case on all. Some architectures have had ABIs which define that structures are always treated differently than pointers, but I don't think that includes any of the ones we support.
  7. We would still need to implement both ref type fields and generics of ByRefLike types. In the runtime, that is most of the complexity of this feature. Actually letting a ref appear in a type parameter for delegates turns out not to be complicated in either Mono or CoreCLR as long as there is no customer written IL involved. Actually supporting generics of ByRefLike types is much more complex to make safe as its also not just something that can be made safe at the runtime level, but comes with a stack of language level work.
  8. Supporting ref in this way is great, and pointers could be made to work too, but adding support for function pointers would be impractical.
davidwrighton commented 3 years ago

One detail though is that the the generic approach will almost certainly throw C++/CLI for a loop. We'll need to deal with that.

RikkiGibson commented 3 years ago

One detail though is that the the generic approach will almost certainly throw C++/CLI for a loop. We'll need to deal with that.

True. Odds are decent that the C++/CLI compiler will just crash when referencing an assembly that contains one of these types. I imagine we will also have to verify what happens when it sees ref fields.

agocke commented 3 years ago

I'm convinced. The most valuable simplification would have been if we had built ref into the type system in the beginning, instead of making it a separate type of variable. At this point ref variables already exist, so there isn't much value in erasing this one instance.

BhaaLseN commented 3 years ago

While we're looking into Action/Func improvements, would suggested parameter names (similar to how ValueTuple encodes them) be something to include here? Like:

T GenericDivision<T>(Func<T dividend, T divisor, T /* quotient */> divide)

Intellisense could then also suggest those names, both for lambdas and "generate method" refactorings.

Aside from the uncertainty of whether it should be in here (or in its own Discussion/Proposal,) the one thing I'm 50/50 on is the naming of the return value for consistency (even though the encoded name could be used to suggest a variable name if the result of the delegate invocation is stored somewhere.)

RikkiGibson commented 3 years ago

@BhaaLseN There is a brief discussion of how this could work in the bottom of the "Motivation" section of the proposal.

Func<T, T, T> divide; // old way
T (T dividend, T divisor) divide; // new way

The goal would be to bring the "expressiveness" of the types of lambda expressions up to par with the expressiveness of other kinds of function signatures as much as possible. It probably would not include the ability to name the return value since other places where we declare functions don't provide this.

BhaaLseN commented 3 years ago

Oops, it looks like I read it but by the time i got to the end I had forgotten about it :D Sorry for the noise.

TahirAhmadov commented 3 years ago

With all due respect, I completely disagree with the whole premise of this idea. Func<> and Action<> are terrible anti-patterns, probably other than the simples Action and Func<T> variants. A descriptive delegate name tells you much more about the purpose of it. Convertibility is a bug, not a feature: you don't want bugs caused by accidentally passing in a delegate meant for one thing to a receiver which expects a delegate for another purpose, but has a matching signature.

On top of this, allowing these new syntax variants for generic types would need to work with all generic arguments, including regular classes and stuff. Otherwise, we have this dual feature - one style of generics for delegates and another for everything else.

Sergio0694 commented 3 years ago

Somewhat related question, as a follow up to possibly supporting refs to refs if we do get support for ref types as generic arguments for delegate types: is this feature as a whole also tracking adding proper support for refs to refs in method parameters as a whole? If not, would it make sense to open a separate proposal for that?

For instance, this would allow methods like this:

bool TryGetRef(TKey key, out ref TValue value);

This would also be possible through an out ByRef<TValue> param, but having proper syntax would be much better 😄

RikkiGibson commented 3 years ago

This proposal was written with the intent to sidestep the concept of refs to refs. Even if we added the ability to pass ref arguments to type parameters in general we still might sidestep this concept.

That said I would be interested to hear any arguments in favor of refs to refs and to hear more about your use case.

Sergio0694 commented 2 years ago

For instance, consider these two APIs we've added to the CollectionsMarshal type in .NET 6 (issue):

public static class CollectionsMarshal
{
    /// <summary>
    ///  Gets a reference to the value associated with the specified key, or returns Unsafe.NullRef<TValue>
    /// </summary>
    public static ref TValue TryGetValueRef<TKey, TValue>(Dictionary<TKey, TValue> dictionary, TKey key, out bool exists);

    /// <summary>
    ///  Gets a reference to the value associated with the specified key, or inserts it with value default(TValue).
    /// </summary>
    public static ref TValue GetValueRefOrAddDefault<TKey, TValue>(Dictionary<TKey, TValue> dictionary, TKey key, out bool exists);
}

They were unable to follow the proper TryFoo pattern in C# and instead settle for this arguably extremely clunky signature due to the inability to return a ref to a ref. Ideally you could write equivalent APIs but with a more intuitive signature, ie:

public static bool TryGetValueRef<TKey, TValue>(Dictionary<TKey, TValue> dictionary, TKey key, out ref TValue value);

public static bool GetValueRefOrAddDefault<TKey, TValue>(Dictionary<TKey, TValue> dictionary, TKey key, out ref TValue value);

Which would allow you to use them with the familiar and idiomatic try-something pattern:

if (CollectionsMarshal.TryGetValueRef(map, out ref value))
{
    // Do something with the value ref
}

Now, this is just an example, but I can see something like this being generally useful as another option for people writing more lowlevel APIs. It's not unlike having methods taking T** (which is relatively common in lowlevel APIs), with the extra flexibility here that you could also use this on non-pinned target locations. This is arguably niche and might not be worth a separate proposal, but if this is something that could just "come naturally" from a general extension to type arguments in general without too much specific work, it would certainly be nice to have in some scenarios 😊

(joking for a second, I'll also say that of course given that resources are limited, if the decision was made to extend generics in general and I could choose, I'd take being able to use pointers and ref structs in generics over this any day, because right now those limitations are an absolute chore to work around and generally much, much more impactful than this for devs 😄)