dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License

Improve the incomplete generic type constraint implementation #13488

Closed MrMatthewLayton closed 7 years ago

MrMatthewLayton commented 8 years ago

Constraint against multiple potential types (OR logic)

Rationale I have two classes: one performs operations on floating-point numbers, the other on integral numbers. Currently I am unable to constrain each class to only the floating-point types or integral types of my choice.

Suggestion

class FloatingOps<T> where T : (float | double | decimal)
{
    ...
}
class IntegralOps<T> where T : (sbyte | byte | short | ushort | int | uint | long | ulong)
{
    ...
}

Problem Identification Given FloatingOps<T> where T : (float | double | decimal), a problem arises in that each type implements different features; for example, double has static members Epsilon, NaN, IsNaN, etc., and decimal has MinusOne, One, FromOACurrency, etc.

Solution Restrict by constraint, the ability to access members that are not available in all constraint types.

Constraint against multiple potential types (OR logic) (part 2)

Rationale I have a generic class of T where I would like T to derive either from FooBase and IDemo, or from BarBase and IDemo.

Suggestion

class FooOrBar<T> where T : (FooBase | BarBase), IDemo
{
    ...
}

Constraint against operators

Rationale I have a generic class where I would like to use operators, which is not currently supported. This would be good for equality of T where T has value semantics

Suggestion

class OperatorOps<T> where T : ==(T a, T b), !=(T a, T b)
{
    ...
}
class GreaterOps<T> where T : >(T a, T b), >=(T a, T b)
{
     ...
}

Important note This implementation would allow the developer to constrain by specification, the arguments for the operator. Consider for example, implicit/explicit conversion and overloaded operators.

Example

class BlackOps<T> where T : ==(T a, int b), ==(T a, Foo b), ==(T a, Bar b), ... etc.
{
    ...
}

Constraint by member

Rationale I am using two classes from two different vendors. They are unrelated, but they share a member or method signature. Constraint by member would allow me to relate these classes by member where they are not related by base class or interface.

Suggestion

class A
{
    public int Value { get; set; }
    public void DoSomething() { ... }
}
class B
{
    public int Value { get; set; }
    public void DoSomethingElse() { ... }
}
class C
{
    public int Item { get; set; }
    public void DoSomething() { ... }
}
class DoesSomething<T> where T : void DoSomething()
{
    ... // T can be class A or class C in this context.
}
class ValueMachine<T> where T : int Value { get; set; }
{
    ... // T can be class A or class B in this context.
}
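
For comparison, the nearest thing that compiles today is an adapter behind an ordinary interface constraint. The sketch below is only an illustration of that workaround, not part of the proposal: IDoesSomething and DoesSomethingAdapter are hypothetical names, and A and C are given trivial method bodies so that the example builds.

```csharp
using System;

// Stand-ins for the two unrelated vendor classes (bodies are assumptions).
public class A
{
    public int Value { get; set; }
    public void DoSomething() => Value++;
}

public class C
{
    public int Item { get; set; }
    public void DoSomething() => Item++;
}

// Hypothetical interface standing in for the "void DoSomething()" member constraint.
public interface IDoesSomething
{
    void DoSomething();
}

// Hypothetical adapter: wraps any parameterless void method behind the interface.
public sealed class DoesSomethingAdapter : IDoesSomething
{
    private readonly Action _doSomething;

    public DoesSomethingAdapter(Action doSomething)
    {
        _doSomething = doSomething ?? throw new ArgumentNullException(nameof(doSomething));
    }

    public void DoSomething() => _doSomething();
}

// The generic class can now use a constraint the CLR already understands.
public class DoesSomething<T> where T : IDoesSomething
{
    public void Run(T target) => target.DoSomething();
}
```

A caller would write something like new DoesSomethingAdapter(a.DoSomething). The cost is one extra allocation and one delegate call per wrapped object, which is the boilerplate the member constraint is meant to remove.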

Constraint directives

Rationale I have a class with some very long constraint semantics. I would like to be able to define these underneath my using directives so as not to clutter my file with constraint semantics.

Example

class Convoluted<TKey, TValue, TKeySelector>
    where TKey : class, new(), IEquatable<TKey>
    where TValue: struct, IComparable<TValue>
    where TKeySelector : class, new(), ISomething, ISomethingElse
{
    ...
}

Suggestion

namespace MyAwesomeCode
{
    using System;
    using System.Linq;
    using where TKey : class, new(), IEquatable<TKey>
    using where TValue: struct, IComparable<TValue>
    using where TKeySelector : class, new(), ISomething, ISomethingElse

    class Convoluted<TKey, TValue, TKeySelector>
    {
        ...
    }
}

Problem Identification If the file contained two classes with generic parameters with the same names, then the constraints would map to both of those classes, given that they are declared higher in scope.

Solution When using constraint directives, restrict the file to single class/struct only, or, simply allow multiple class declarations in the file to share constraint directives, unless explicitly overridden on the class/struct.

pebezo commented 8 years ago

Sounds like this would be a lot easier / nicer if C# would have TypeScript's union types (https://www.typescriptlang.org/docs/handbook/advanced-types.html)

MrMatthewLayton commented 8 years ago

@pebezo You're not wrong, great idea! Have you seen The Future of C#? Have you noticed how C# is becoming very JavaScript-esque in some respects? (They even say that in the video.)

Now on that note, a few days ago I tweeted Microsoft & Anders Hejlsberg, suggesting a single unified language that encompassed everything TypeScript and C# were doing, but could target the front end and the back end simultaneously by means of the CLR or by "transpiling" to JS. This could solve a multitude of problems. Imagine, for example, a development team that doesn't depend on both C# and JavaScript/TypeScript developers; instead they use a single language for everything... wouldn't that be great?

I even gave them a name for such a language "T#" (Turing Sharp), cleverly combining C# and TS into one, but more importantly, in homage to Alan Turing, the father of modern computing.

They didn't take me seriously :-(

HaloFour commented 8 years ago

This duplicates a number of individual proposals. Specifically constraints by member or operator (which is really just a static member anyway) have been proposed. As have constraints for primitive types. I don't think that there are any proposals for union types, though.

Note that C# is kind of limited in what it can really do here. For the most part C# implements every form of generic constraint permitted by the CLR (effectively everything except enum and delegate). The CLR offers no metadata, enforcement or support for anything beyond that.

If C# were to "fake" these extra constraints, like F# does, there are two massive problems:

First, the CLR can't really enforce it. At best C# could emit some custom metadata which supporting compilers would have to recognize and enforce. Anyone using a compiler that doesn't do this would be able to compile code that would use the generic in an incorrect way. Since the CLR can't enforce the constraint this would work just fine and you wouldn't get a TypeLoadException. Depending on the IL actually used the methods might even succeed with unexpected results. Which leads to:

Second, the IL emitted for any generic type or member has to be exactly the same regardless of the possible generic type arguments. For members this is problematic as the declaring type is a part of the signature. This is fine for base class or interface constraints as the declaring type is known and the method call is handled through virtual dispatch. But for arbitrary members there is no known type through which to dispatch the call. For operators this gets even more complicated as sometimes operators map to static members and other times operators map to IL instructions. T + T means completely different things depending on whether T : int or T : string or T just happens to have an overloaded + operator.
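
To illustrate the case that already works, here is a minimal sketch of a comparison helper built on an interface constraint. The names GreaterOps, GreaterThan and Max are chosen for illustration only; the point is that CompareTo is declared on IComparable<T>, so the compiler knows which type to dispatch through and can emit one body for every T.

```csharp
using System;

public static class GreaterOps
{
    // The call goes through IComparable<T>, so the declaring type is known
    // and the same IL serves int, double, string, or any user-defined T.
    public static bool GreaterThan<T>(T left, T right) where T : IComparable<T>
    {
        return left.CompareTo(right) > 0;
    }

    // Returns the larger of the two arguments (left when they compare equal).
    public static T Max<T>(T left, T right) where T : IComparable<T>
    {
        return left.CompareTo(right) >= 0 ? left : right;
    }
}
```

What this cannot express is the operator itself: left > right inside the method still will not compile, because nothing in the constraint tells the compiler whether ">" means an IL comparison instruction or a call to a static operator method.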

Just to note, I'm not arguing against the features. I think that they would be useful. But I think that they cannot be implemented properly in C# without some serious modifications to the CLR.

ghost commented 8 years ago

So, considering that C# cannot do certain (or a lot of) things because of the CLR, then the question is when and how the CLR will evolve again. Would be interesting to see a blog or "Build 2017 session: The future of the CLR" about that.

MrMatthewLayton commented 8 years ago

@nmarcel I agree. Either the CLR needs to evolve or, in the meantime, perhaps Roslyn could pick up the slack with some clever sugar.

ghord commented 8 years ago

First, the CLR can't really enforce it. At best C# could emit some custom metadata which supporting compilers would have to recognize and enforce. Anyone using a compiler that doesn't do this would be able to compile code that would use the generic in an incorrect way. Since the CLR can't enforce the constraint this would work just fine and you wouldn't get a TypeLoadException.

This could easily be done by having the compiler emit constraint-checking code in the static constructor of the generic type:

public class MyGeneric<T> where T : myconstraint
{
    static MyGeneric()
    {
         // Check whether T fulfils myconstraint and throw TypeLoadException if it does not.
    }
}
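
A runnable sketch of that idea, under stated assumptions: the "constraint" here is taken to be "T defines a public static ==(T, T)", the check uses reflection to look for op_Equality, and RequiresEquality and IsSupported are hypothetical names. Two things differ from the outline above. An exception thrown from a static constructor reaches the caller wrapped in a TypeInitializationException, so the sketch throws InvalidOperationException and lets the runtime wrap it. And built-in types such as int compile "==" to IL instructions rather than to an op_Equality method, so a real check would need special cases for them.

```csharp
using System;
using System.Reflection;

public class RequiresEquality<T>
{
    // Runs once per closed generic type, before the first member access.
    static RequiresEquality()
    {
        // A user-defined "==" is compiled to a static method named op_Equality.
        var op = typeof(T).GetMethod(
            "op_Equality",
            BindingFlags.Public | BindingFlags.Static,
            null,
            new[] { typeof(T), typeof(T) },
            null);

        if (op == null)
        {
            throw new InvalidOperationException(
                typeof(T) + " does not define operator ==(T, T).");
        }
    }

    // Touching any static member forces the static constructor to run first.
    public static bool IsSupported()
    {
        return true;
    }
}
```

RequiresEquality<string> initializes normally, while RequiresEquality<object> fails on first use. The failure happens at run time, not at compile time.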

MrMatthewLayton commented 8 years ago

Given the limitations of the CLR in terms of the proposed features, would it be possible to implement them using, essentially Roslyn and syntactic sugar? .NET already does this in several respects

...other examples are available.

I guess the major underlying question in this respect is: Is the CLR responsible for enforcing the constraint, or could we bypass the CLR using Roslyn, syntactic sugar and perhaps some compiler generated patterns? With that in mind, let's continue...

I had a few thoughts on how the following issues could be solved:

Consider the following code example

class ValueComparer<T> : IEquatable<ValueComparer<T>>
{
    public T Value { get; private set; }

    public ValueComparer(T value)
    {
        Value = value;
    }

    public bool Equals(ValueComparer<T> other)
    {
        return Value == other.Value;
        // Operator '==' cannot be applied to operands of type 'T' and 'T'
    }
}

Notice that we can't use operators on generic type parameters, therefore, we could use a constraint to bring the operator into scope.

class ValueComparer<T> : IEquatable<ValueComparer<T>> where T : ==(T left, T right)
{
    public T Value { get; private set; }

    public ValueComparer(T value)
    {
        Value = value;
    }

    public bool Equals(ValueComparer<T> other)
    {
        return Value == other.Value;
    }
}

In this case, Roslyn would need to check that '==' is implemented on 'T', and therefore allow the use of it. But you're wondering: "Yes, but as we've discussed, the CLR can't do that!"

In this case, the constraint is simply a flag for the compiler to generate some code to allow '==' to be called. The code above would be valid where '==' is implemented on 'T', and invalid (and would subsequently raise an error) where '==' is not implemented on 'T'.

Before I forget, the reason I chose this syntax ==(T left, T right) is so that the developer can choose operator types against their constraint, for example:

So, what might we expect from Roslyn in order to implement the constraint? It's already done half of the job, and that was to check that '==' is implemented on 'T'. The second half of the problem is to generate some code to allow '==' to be called.

Here is a static factory that would be generated by the compiler, which allows delegates to be created based on the operator being called.

Consider the following example, which implements a delegate to call '==' for int.

static class OperatorFactory
{
    public static Func<T, T, bool> GetEqualityOperator<T>()
    {
        if(typeof(T) == typeof(int))
        {
            return new Func<int, int, bool>((left, right) => left == right) as Func<T, T, bool>;
        }

        return ...;
    }
}

Technically, code should never fail because the compiler has done the work in ensuring that a delegate exists for each operator call required.
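
For readers who want to experiment, the factory can be approximated today without compiler support. The sketch below is not the compiler-generated code described above: it swaps in expression trees from System.Linq.Expressions to build the delegate at run time, so a missing operator shows up as a run-time exception instead of a compile error.

```csharp
using System;
using System.Linq.Expressions;

public static class OperatorFactory
{
    // Builds the delegate (left, right) => left == right for the given T.
    // Expression.Equal throws InvalidOperationException when "==" is not
    // defined for T, for example a struct with no == overload.
    public static Func<T, T, bool> GetEqualityOperator<T>()
    {
        ParameterExpression left = Expression.Parameter(typeof(T), "left");
        ParameterExpression right = Expression.Parameter(typeof(T), "right");
        BinaryExpression body = Expression.Equal(left, right);
        return Expression.Lambda<Func<T, T, bool>>(body, left, right).Compile();
    }
}
```

Compiling an expression tree is comparatively slow, so a caller would normally store the result in a static field per closed generic type and reuse it.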

The code written by the developer would need to change slightly too. Consider TypeScript to get your head around this, where a lot of its features are superimposed on top of JavaScript, and are compiled away into pure JavaScript. Essentially the same thing would happen here, where the constraint would simply become pure C#, which is already compatible with the CLR.

class ValueComparer<T> : IEquatable<ValueComparer<T>> // where T : ==(T left, T right) Compiler removed because it's not "really" a constraint.
{
    // Compiler generated
    private static Func<T, T, bool> equalityOperator = OperatorFactory.GetEqualityOperator<T>();

    public T Value { get; private set; }

    public ValueComparer(T value)
    {
        Value = value;
    }

    public bool Equals(ValueComparer<T> other)
    {
        // Compiler removed: Operator '==' cannot be applied to operands of type 'T' and 'T'
        // return Value == other.Value;

        // Compiler generated
        return equalityOperator(Value, other.Value);
    }
}

In summary, here's what the compiler does:

In summary, yes, C# would essentially fake the constraint in this case, however without modification to the CLR, this can only be done by the compiler.
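
For reference, the closest equivalent that needs no new constraint is EqualityComparer<T>.Default, which is part of the base class library. The sketch below rewrites ValueComparer<T> that way; the null check on other is an addition of this sketch.

```csharp
using System;
using System.Collections.Generic;

public class ValueComparer<T> : IEquatable<ValueComparer<T>>
{
    public T Value { get; private set; }

    public ValueComparer(T value)
    {
        Value = value;
    }

    public bool Equals(ValueComparer<T> other)
    {
        if (other is null)
        {
            return false;
        }

        // Uses IEquatable<T>.Equals when T implements it,
        // and falls back to Object.Equals otherwise.
        return EqualityComparer<T>.Default.Equals(Value, other.Value);
    }
}
```

This calls Equals, not '==', and the two can disagree: double.NaN.Equals(double.NaN) is true while double.NaN == double.NaN is false. That gap is part of what an operator constraint would close.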

MrMatthewLayton commented 8 years ago

@ghord That's sort of what my rather long comment was getting at, except it would not need to expose any metadata to other compilers, since it's just a compiler-built factory pattern, implemented under the hood with things that C# can do already.

Although on that note, other C# compilers would not recognise the constraint, so it would be a bit like trying to write C# 6/7 code in Visual Studio 2010.

HaloFour commented 8 years ago

@nmarcel @series0ne

It was created early on and hasn't really been updated since, but check out #420. Making CLR modifications to enable new language features is certainly on their radar. It just increases the cost associated with any given language feature proposal. The CLR evolution isn't likely to happen in a vacuum, though, and I'd expect that you'd hear about it through a language adopting a feature that now requires a new version of the CLR.

@ghord

Maybe. Certainly the C# compiler could emit code to enforce the constraint at runtime, but to have a program fail at runtime due to improper use of generics is far from a good experience. The bigger issue, by far, is the IL equivalence. The C# compiler could emit some tricky code to emulate the behavior but it would be far from ideal and far from efficient. This is what F# does today.

For C#, the flagship language of the CLR, I'd rather see it done right. You're probably looking at C# 9.0 for consideration of this with or without CLR changes anyway.

jnm2 commented 8 years ago

Just imagine the dotPeek decompiled output 😆

MrMatthewLayton commented 8 years ago

This article about F#'s constraints on MSDN lists several of the constraints I would like to see in C#, so as suggested, the F# compiler already creates emulated workarounds for this. With that in mind, I'm leaning towards, "The CLR needs updating" rather than "Let's emulate these in C# too"

qrli commented 8 years ago

I'd like to add: such a feature is not necessarily limited to generics (as it is in current C#). It could be more general, like C++'s concepts or Scala's traits. It also helps solve the last issue in the OP, e.g.

interface IKeySelector : class, new(), ISomething, ISomethingElse;
class Convoluted<TKey, TValue, TKeySelector> where TKeySelector : IKeySelector
object SelectKey(IKeySelector keySelector) { ... }

Thaina commented 7 years ago

related #2146 #3255

gafter commented 7 years ago

We are now taking language feature discussion in other repositories:

Features that are under active design or development, or which are "championed" by someone on the language design team, have already been moved either as issues or as checked-in design documents. For example, the proposal in this repo "Proposal: Partial interface implementation a.k.a. Traits" (issue 16139 and a few other issues that request the same thing) are now tracked by the language team at issue 52 in https://github.com/dotnet/csharplang/issues, and there is a draft spec at https://github.com/dotnet/csharplang/blob/master/proposals/default-interface-methods.md and further discussion at issue 288 in https://github.com/dotnet/csharplang/issues. Prototyping of the compiler portion of language features is still tracked here; see, for example, https://github.com/dotnet/roslyn/tree/features/DefaultInterfaceImplementation and issue 17952.

In order to facilitate that transition, we have started closing language design discussions from the roslyn repo with a note briefly explaining why. When we are aware of an existing discussion for the feature already in the new repo, we are adding a link to that. But we're not adding new issues to the new repos for existing discussions in this repo that the language design team does not currently envision taking on. Our intent is to eventually close the language design issues in the Roslyn repo and encourage discussion in one of the new repos instead.

Our intent is not to shut down discussion on language design - you can still continue discussion on the closed issues if you want - but rather we would like to encourage people to move discussion to where we are more likely to be paying attention (the new repo), or to abandon discussions that are no longer of interest to you.

If you happen to notice that one of the closed issues has a relevant issue in the new repo, and we have not added a link to the new issue, we would appreciate you providing a link from the old to the new discussion. That way people who are still interested in the discussion can start paying attention to the new issue.

Also, we'd welcome any ideas you might have on how we could better manage the transition. Comments and discussion about closing and/or moving issues should be directed to https://github.com/dotnet/roslyn/issues/18002. Comments and discussion about this issue can take place here or on an issue in the relevant repo.

The features requested here would be addressed by type classes, which are under consideration at https://github.com/dotnet/csharplang/issues/110