dotnet / csharplang

The official repo for the design of the C# programming language
11.59k stars 1.03k forks source link

[Proposal]: Let ref structs implement interfaces and substitute into type parameters (VS 17.11, .NET 9) #7608

Open agocke opened 1 year ago

agocke commented 1 year ago

Ref structs implementing interfaces

Related Issues

Design meetings

https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-02-26.md#ref-structs-in-generics https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-06-10.md#ref-structs-implementing-interfaces-and-in-generics https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-07-22.md#ref-structs-implementing-interfaces

agocke commented 1 year ago

@MadsTorgersen @333fred Could we get this on a triage list? It feels like we've spent more hours in LDM working around this problem than we would actually fixing it.

huoyaoyuan commented 1 year ago

Some comments not directly-related, but interesting to consider:

What about void as a generic parameter? It's more complicated in type system, but there's not safety problem. Variables or fields declared as void can be omitted. Comparing with ref struct, the omitted void can appear in non ref-struct types.

The unconstraint can solve the gap for concrete methods, but how about delegate types? The delegate type parameters can't be constrained, should we specialize for them?

agocke commented 1 year ago

Also, @jaredpar for lifetime issues. My thinking was that every parameter would implicitly carry a "heap-safe-to-escape" lifetime, but maybe there's something I'm not thinking of. Obviously people will request more lifetime stuff in the future to workaround that, but I figured that could be an additional set of language features that we do later.

En3Tho commented 1 year ago

Can you clarify an example with where T : ~box: how does it affect boxable types like classes? It feels to me like ~box allows ref structs only?

It somehow feels to me that this is not a constraint on a type but rather a contract of a callee: a promise of no-escape-to-the-heap. A lifetime feature rather than a property of a type itself.

jaredpar commented 1 year ago

Rather than where I think we should consider allow. There are more anti-constraints that have been proposed for the language including items like allowing pointers. Having a new keyword then a ~ for them feels unnatural where as the following feels much more natural to me:

void M<T>() 
  where T : IDisposable
  allow T : ref struct { 

}

We don't need much more than the above.

Disagree. As a ref struct can satisfy the where T : ~box proposal. That means that we have to consider all T which satisfy ~box to be potentially ref struct. The proposal doesn't seem to do this but rather focuses on the boxing aspect. That means without additional restrictions code like the following would be allowed:

class C<T> where T : ~box {
  T _field;
}

C<Span<int>> // not good

That is also why I prefer allow T : ref struct because it's very clear this is the capability being allowed here and would make it much clearer why the above code fails to compile (also makes it easier for us to generate better error messages).

you just want to prevent boxing.

No. You need to prevent all behaviors that are disallowed by a ref struct. Boxing is only one of them.

agocke commented 1 year ago

No. You need to prevent all behaviors that are disallowed by a ref struct. Boxing is only one of them.

Yeah I agree with that. And I think you're right that we might want to use different syntax. But I think the basic concept is still the same here.

jaredpar commented 1 year ago

There are a couple of other items.

Essentially any T constrained to allow T : ref struct needs to have all capabilities and restrictions of a ref struct:

  1. Cannot be a field of a non ref struct
  2. Declarations can be marked scoped
  3. Is subject to lifetime tracking rules as a ref struct value would be
  4. etc ...

Effectively for all intent and purposes the compiler would treat it like a ref struct value. Cause it potentially can be and that is the worst case from a restriction point of view. No harm is done to other types by being given such restrictions.

Another other item that's come up here is how such an anti-constraint should be applied to existing APIs in the core libraries. Consider for example that every variation of Func / Action should have allow T : ref struct for all their type parameters. In those cases it's a 100% upside change.

One approach to this is to simply go through and manually update every such delegate definition to have this anti-constraint. That's manual and benefits only the developers who put in the work. Another option is to simply embed it in the language rules. Essentially for a type parameter on a delegate definition where it only appears only as the type of a simple parameter (no in/out/ref modifiers) the language will automatically infer allow T : ref struct. Doing so would make the delegate more expressive than if it didn't. I've gone back and forth on this but generally lean towards automatic inferring it here. Could be talked off this point though.

Another item is that allowing ref struct to implement interfaces means we should consider allowing [UnscopedRef] attributes on interface members. Lacking that attribute there are patterns expressible in ref struct today that cannot be expressed when factored into an interface. Think that is relatively straight forward (has meaning when implemented on a struct but not on a class) but still needs to be spec'd out.

The implication of DIM on a ref struct also needs to be worked through. Consider as a concrete example:

interface I1
{
  public object M() => this;
}

// Error: implicit box in the implementation of M
ref struct S : I1 
{
}

Think the likely conclusion here is that ref struct cannot participate in DIM cause there isn't a reasonable way to prevent all ref safety issues that can come from it. That would require both compiler and runtime support.

Need to work out the rules for co / contra variance

davidwrighton commented 1 year ago

On top of all of those details that @jaredpar brings up, I'd really like to be able to use this anti-constraint with Span<T> so that we can do a better job around dealing with certain InlineArray scenarios, but that api is just messy.

jaredpar commented 1 year ago

On top of all of those details that @jaredpar brings up, I'd really like to be able to use this anti-constraint with Span so that we can do a better job around dealing with certain InlineArray scenarios, but that api is just messy.

That particular issue is a bit of a thorny problem because there are APIs in Span<T> that are illegal once the anti-constraint is added

ref struct Span<T> : allow T : ref struct {

  // Error 1: ref field to ref struct
  ref T data;
  int length;

  // Error 2: Can't use a ref struct as an array element 
  pulbic Span(T[] array) 
}

The error 2 type of problems are very thorny. Essentially there is existing public API surface area that directly violates what the anti-constraint would provide. There isn't a general solution to this that I can envision. Think the way to move forward here would be that the compiler simply ignores them. Essentially Span<T> is a primitive and we special case certain APIs as being illegal to call when we know T is potentially a ref struct. That would classify as hard on my brain but solvable.

The error 1 type of problems are more fundamental. There is a proposal out for allowing ref fields of ref struct but it's a decently complex feature.

TahirAhmadov commented 1 year ago

I think the notion of anti-constraints needs to be considered very thoroughly. It's almost a full on massive change all on its own. What do ~class, ~struct, ~new, etc. mean? ~notnull becomes weird, because it's a double negative. What about types (classes and interfaces)? I think I saw that brought up before, but I don't remember the discussion and why it was desirable.

The proposed allow keyword is also a little finicky. Does allow only allow (pun intended) ref struct? Are there any potential other scenarios? Why not just add ref struct to the list of where - surely adding a new keyword here is overkill?

HaloFour commented 1 year ago

I think the notion of anti-constraints needs to be considered very thoroughly

I don't think they're intended to be a general purpose feature. I think they're only to be used where they enable the language to support additional functionality, but they also don't make sense as a normal constraint. A ref struct constraint doesn't force the generic type argument to be a ref struct, it forces the code of the method to follow the rules as if it could be a ref struct, but the argument could be a normal struct (or even a class?)

jaredpar commented 1 year ago

Put up a full proposal for this feature

https://github.com/dotnet/csharplang/blob/main/proposals/ref-struct-interfaces.md

timcassell commented 1 year ago

Another alternative that was mentioned in the old issue is placing the ref keyword at the generic declaration site, much like in and out on co/contravariant interfaces.

public void M<ref T>(T o) where T : IDisposable
    => o.Dispose();
CyrusNajmabadi commented 1 year ago

My main issue there @timcassell is that reads to me like "this must be a ref struct" as opposed to "this is allowed to be a ref-struct". That said, we'll def consider syntactic possibilities here when designing this out.

TahirAhmadov commented 1 year ago

What about where T : allow ref struct, ISomeInterface? It's kind of strange to have 2 places with constraints and "expanders" for one generic argument.

CyrusNajmabadi commented 1 year ago

The syntax here is the least interesting part :) We'll likely consider a bunch and settle on one the group feels conveys the idea the best and feels the most c#-y :)

jaredpar commented 1 year ago

@timcassell

Another alternative that was mentioned in the old issue is placing the ref keyword at the generic declaration site, much like in and out on co/contravariant interfaces.

For me that is much too easily confused with supporting the ability to have ref as a type modifier. For example List<ref int>. That is a much different feature (and not one that I think anyone is considering given the fundamental change it would require to the runtime).

What about where T : allow ref struct, ISomeInterface? It's kind of strange to have 2 places with constraints and "expanders" for one generic argument.

I agree it's a bit strange to have both allow and where but unfortunately there need to be up and down modifiers here. That is because a generic parameter today has implicit constraints (can't be pointer, ref struct, etc ...). The syntax must provide a method to remove the implicit constraints (tallow) and add more (where).

As @CyrusNajmabadi mentioned though the syntax is the least interesting part. It is also likely the part we will end up debating the most in LDM. For now I would encourage people to focus on the behaviors and ensure they satisfy the scenarios you want to get out of this feature.

TahirAhmadov commented 1 year ago

The funny thing that having read the proposal and thought about it, it kind of makes sense to me so the only remaining question that I have is the syntax :) It's just not "what looks nice", but potential future enhancements. I think we should all imagine what allow can allow (pun intended) in the future, if anything; that IMO should be the main factor to decide the syntax.

timcassell commented 1 year ago

How will calling methods on System.Object like ToString() and GetHashCode() work? From what I understand, if a struct overrides those methods, they will not be boxed when called. But if they don't override them, they will be boxed. Will calling those methods simply be disallowed? Or that would be allowed in this case since the box is short-lived and doesn't escape?

En3Tho commented 1 year ago

@TahirAhmadov I thought the same. But I don't really like "allow" syntax. I'd rather think of this not as "allow" but as "restrict the lifetime of values of certain type T".

And I would like to see some work in this direction. E.g. if I didn't want some IDisposable like native resource handle to escape I would use similar mechanism.

This can lay groundwork for more general lifetime and ownership features. I think it's important to think of them when designing syntax/rules.

IS4Code commented 1 year ago

While the idea to have <ref T> instead of allow came from me, I think ultimately allow is the best and future-consistent approach. In the end we could have allow T : static, allow T : * (pointers), allow T : ref (ref to anything) etc. if needed, so this syntax could prepare the ground for it. Indeed <ref T> does sound like an even stronger anti-constraint (allow all byrefs and byref-likes).

hez2010 commented 1 year ago

Will this also allow us to use allow pattern outside the ref struct scenarios?

For example,

T ParseValue<T>(string str) where T : IParseable<T> allow T : string
{
    if (typeof(T) == typeof(string))
    {
        return (T)str;
    }
    return T.Parse(str);
}
HaloFour commented 1 year ago

@hez2010

Will this also allow us to use allow pattern outside the ref struct scenarios?

Not without said features being explicitly designed and implemented.

jhudsoncedaron commented 1 year ago

So I just got news of this proposal. There's a major problem here: "adding instance default interface methods to existing interfaces becomes universally source+binary breaking". We depend on the ability to add default interface methods all over the place.

I'm reading this and going "really?" This breaks the primary motivating feature to have default interface methods.

I did come to an idea of how to fix it but this is probably bonkers hard: when you encounter a missing default interface method on a ref struct; jit the mehod and bail if the jit would emit a box operation.

jaredpar commented 1 year ago

This breaks the primary motivating feature to have default interface methods.

Default interface methods have always had edge cases where they do not work. For example the diamond problem where the runtime cannot pick the best DIM member results in runtime exceptions. This is just another case where DIM will have an edge case.

when you encounter a missing default interface method on a ref struct; jit the mehod and bail if the jit would emit a box operation.

I suspect that would not be very effective in practice. Most of the DIM members I've seen rely on calling other available members in the type. That implicitly uses this which forces a boxing operation on the value. This would really only work for DIM members that call into static members only and do not use this. Not sure that is a large amount.

jhudsoncedaron commented 1 year ago

"That implicitly uses this which forces a boxing operation on the value.": My testing is that doesn't box a struct. The jitter is smarter than that.

jaredpar commented 1 year ago

My testing is that doesn't box a struct. The jitter is smarter than that.

The assertion of the runtime team is that it does and that it breaks ref struct semantics.

jhudsoncedaron commented 1 year ago

How does that not break default implementations on normal mutable structs? (As in, the default implementation generates nonfunctional code because it mutates a copy)?

reflectronic commented 1 year ago

It does break them:

using System;

var s = new S();
Mut(ref s);

// Outputs '0'
Console.WriteLine(s.V);

static void Mut<T>(ref T t) where T : IMutate
{
    t.Mutate();
}

interface IMutate
{
    int V { get; set; }
    void Mutate() { V++; }

}

struct S : IMutate
{
    public int V { get; set; }
    // Uncomment to print '1' instead
    // public void Mutate() { V++; }
}

https://sharplab.io/#v2:C4LgTgrgdgPgAgJgIwFgBQ6BuBDMACAZzwF48oBTAdzwGUAKASgG50BZCYOscgM0OfToA9ELwB5DgAcORAOQAGWejhIAnHQIA6AGoCMaFQDY8cACx52wADwAVAHxdeeG3mAM8lABbluzvCDwASUtsYHJ0AG90PBjXTRCwxhY0AF9BfQBLKDCwHmwAY3IghPC0KLRYvCzgPG08CLwAc3JgJkIWtrSK2LMLDlDyRnragGoRzujY9C70AmBIfJqaf2L+sMjJmLgAZirs2uHm1vbjrsqRPABVKHyAewBbe/J94Fu8STBqvFkkWT258jYAAmmzwFx2JnMJSGDW0YwmqXQQA==

huoyaoyuan commented 10 months ago

How to solve "Tuple<T1, T2> should be ref struct when either of T1 or T2 is struct"? It may require some specification work. Will a new RefTuple<T1, T2> type be added to workaround this?

ds5678 commented 10 months ago

I have a proposal to solve the Span<Span<T>> issue.

In this comment, "unusable type" refers to any invalid type construction involving ByRefLike types.

readonly ref struct Span<T>
{
    //Warning: parameter has a potentially unusable type
    public Span(T[] array)
    {
        if (T is ref struct)
        {
            throw new NotSupportedException();
        }
        else
        {
            //Current implementation
        }
    }
}

//Warning: parameter has a potentially unusable type
//Warning: return value has a potentially unusable type
static T[] Foo<T>(T[] bar) allow T : ref struct
{
    //Error: expressions cannot target a potentially unusable type
    int length = bar.Length;

    //Error: local variables cannot target a potentially unusable type
    T[] fizz;

    //Error: expressions cannot target a potentially unusable type
    return default(T[]);
}
jhudsoncedaron commented 10 months ago

I've sat and considered for two months the default interfaces on ref structs thing. I've come to the conclusion my initial idea was more correct than anything else. Since ref structs are a new thing there's no reason a priori we can't change the rule so that default interface methods on ref structs get the actual ref struct and not a copy, so it's only going to fail if the jitter can't remove the implicit boxing given the il code of the default member. This will definitely be the case if it passes it to something expecting object, but most likely won't be the case if it accesses other interface members. Since you have the type of the interface, every such call gives the jitter more type information for locals. Note that such jitter code is far beyond my capacity to write.

jaredpar commented 10 months ago

@ds5678

I have a proposal to solve the Span<Span> issue.

The Span<Span<T>> is a different problem though from what this issue is trying to address. It is much more deeply rooted in the problems of having a ref struct as a ref field. The expand-ref proposal gets into the challenges of that. I don't have an explicit call out for Span<Span<T>> but I probably should. The TLDR of that though is that solving that problem is very hard unless you introduce explicit lifetime annotations into the language. We should probably have a separate discussion item for this.

We permit method parameters, method returns, and unbacked properties to possibly have an unusable type, perhaps with a warning message.

Dealing with the existing API set that becomes illegal once T allows ref struct is solvable, the other problems are much harder.

jaredpar commented 10 months ago

Since ref structs are a new thing there's no reason a priori we can't change the rule so that default interface methods on ref structs get the actual ref struct and not a copy, so it's only going to fail if the jitter can't remove the implicit boxing given the il code of the default member

This sounds like a discussion to have with the runtime team. Given that they told us this wasn't a solvable problem last time though I'm going on that information.

OJacot-Descombes commented 9 months ago

Wouldn’t it be possible to do a stackalloc-kind of boxing? This would allow ref structs to be boxed (and it would also allow optimization of boxing in many other cases).

huoyaoyuan commented 9 months ago

Wouldn’t it be possible to do a stackalloc-kind of boxing? This would allow ref structs to be boxed

This won't solve the problem. With escape analysis, objects with limited lifetime can be moved onto stack. However, current generic parameter allows unlimited lifetime of defined variables. We need to ensure that ref struct isn't stored into any long-lived position.

jhudsoncedaron commented 9 months ago

It's almost like ref struct : IInterface wants to be a new kind of interface. Taking existing generics that have constraint : IInterface won't work then.

And now I don't care about the feature existing.

timcassell commented 9 months ago

Wouldn’t it be possible to do a stackalloc-kind of boxing? This would allow ref structs to be boxed

This won't solve the problem. With escape analysis, objects with limited lifetime can be moved onto stack. However, current generic parameter allows unlimited lifetime of defined variables. We need to ensure that ref struct isn't stored into any long-lived position.

It wouldn't work in general, but I think it would work for this feature to allow calling Object methods that I asked about earlier (ToString, GetHashCode).

TahirAhmadov commented 8 months ago

Question: would something like below become possible?

Span<char> span = ...;
string str = string.Create(10, span, (destSpan, sourceSpan)=> { ... });
// or
string str = string.Create(10, (x, y, span), (destSpan, state)=> { ... });
jaredpar commented 8 months ago

Assuming all of the delegates are changed to have the new anti constraint and there is no capture it should be possible.

agocke commented 8 months ago

In theory I'm not seeing why we couldn't update Func and Action and simplify this whole API surface.

The new API would be

string Create<TState>(int length, TState state, Func<Span<char>, TState> func);

and we would change Func to be

delegate T2 Func<T1, T2>(T1 arg1) 
  where T1 : allows ref struct
  where T2 : allows ref struct;
timcassell commented 8 months ago

Just curious, because I doubt anyone uses it anymore, but how would delegate BeginInvoke work with ref structs?

jnm2 commented 8 months ago

@timcassell BeginInvoke throws PlatformNotSupportedException on newer .NET runtimes since .NET Core, and it sounds like the feature being discussed here would require runtime support and thus exclusively be for future .NET runtimes.

timcassell commented 8 months ago

I guess it was already possible by declaring a delegate with non-generic ref struct, so it's a non-issue for this feature.

jaredpar commented 8 months ago

In theory I'm not seeing why we couldn't update Func and Action and simplify this whole API surface.

My expectation is that these core delegates are all updated to have allows ref struct. The proposal actually speculated as to whether we should just automatically do this but ended up deciding against it. But expectation is that core types like Func, Action, IEnumerable would move to adopt this new anti-constraint.

brianrourkeboll commented 8 months ago

@jaredpar

My expectation is that these core delegates are all updated to have allows ref struct. The proposal actually speculated as to whether we should just automatically do this but ended up deciding against it. But expectation is that core types like Func, Action, IEnumerable would move to adopt this new anti-constraint.

(If this comment should be in a different place, let me know.)

I see that the proposal currently says that the allows ref struct anti-constraint is not propagated in C# and (by implication) must always be explicitly specified:

The anti-constraint is not "inherited" from a type parameter type constraint. For example, S in the code below cannot be substituted with a ref struct:

class C<T, S>
    where T : allows ref struct
    where S : T
{}

Detailed notes:

  • A where T : allows ref struct generic parameter cannot
    • Have where T : U where U is a known reference type
    • Have where T : class constraint
    • Cannot be used as a generic argument unless the corresponding parameter is also where T: allows ref struct
  • The allows ref struct must be the last constraint in the where clause
  • A type parameter T which has allows ref struct has all the same limitations as a ref struct type.

That doesn't present a huge problem in C#, since existing generic constraints already aren't propagated automatically.

Would F# need to suppress its default automatic generalization and generic constraint propagation for this specific (anti-)constraint? That would be a pretty major departure for F#.

For example, take these two existing F# functions:

let f (x : 'T when 'T : struct) = ignore x
let g x = f x

The generic constraint from f is automatically propagated to g, so that their type signatures look like:

val f : x:'T -> unit when 'T : struct
val g : x:'T -> unit when 'T : struct

Would F# need to suppress the propagation of this anti-constraint in particular? I.e., for some function g that calls a function f that has this anti-constraint, would we expect

val f : x:'T -> unit when 'T : allows ref struct
val g : x:'T -> unit // No constraint?

?

What if there were also another, regular constraint on 'T? Would the regular constraint be propagated but the allows ref struct anti-constraint be suppressed? That could get pretty confusing.

val f : x:'T -> unit when 'T :> ISomeInterface and 'T : allows ref struct
val g : x:'T -> unit when 'T :> ISomeInterface // This would be strange.

While those may be partly F#-specific design decisions, I think it is worth noting that the design of this feature in the runtime and in C# may have additional implications for F#, especially if fundamental BCL types like Func and IEnumerable are updated to use it.

jaredpar commented 8 months ago

Would F# need to suppress its default automatic generalization and generic constraint propagation for this specific (anti-)constraint? That would be a pretty major departure for F#.

Think that is a question for F# language designers. C# behavior here is essentially following how constraints are modeled in IL. There is nothing stopping F# from providing a different presentation here.

@vzarytovskii

brianrourkeboll commented 8 months ago

There is nothing stopping F# from providing a different presentation here.

Yes, but my comment was in part about how the choices that the current design makes available to F# here don't seem ideal:

But it's entirely possible that I'm missing something or haven't thought things through enough, and that consuming this from F# will be simpler than I'm making it out to be.

HaloFour commented 8 months ago

Treat this constraint differently from the way F# treats all others.

Given anti-constraints are different from other constraints, it would feel very appropriate if F# would treat them differently than it treats constraints today.

jaredpar commented 8 months ago

Treat this constraint differently from the way F# treats all others.

This is how C# treats constraints today though: they do not propagate by default. This decision is following our existing patterns.