dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License

Proposal: Intersection types #4586

Closed: MouseProducedGames closed 6 years ago

MouseProducedGames commented 9 years ago

Let's say you have an IQuadruped interface:

public interface IQuadruped
{
    public Leg FrontLeftLeg { get; set; }
    public Leg FrontRightLeg { get; set; }
    public Leg BackLeftLeg { get; set; }
    public Leg BackRightLeg { get; set; }
    // ...
}

and an IMammal interface:

public interface IMammal
{
    public int NumberOfHairs { get; }
    // ...
}

And, of course, some classes implementing both interfaces.

public class Dog : IQuadruped, IMammal, ... { /* ... */ }
public class Cat : IQuadruped, IMammal, ... { /* ... */ }
public class Platypus : IQuadruped, IMammal, IReallyWeird, ... { /* ... */ }
// ...

Now you want to operate on all the quadrupedal mammals in your enumerable of animals:

foreach (var animal in animals.Where(a => a is IQuadruped && a is IAnimal))
{
    IQuadruped qAnimal = (IQuadruped)animal;
    IMammal mAnimal = (IMammal)animal;
    // Code involving both interfaces here.
}

And, of course, the complexity goes up the more interfaces you want to check.

However, if, instead, you had:

foreach (var animal in animals.Select(a => a as I<IQuadruped, IAnimal>).Where(a => !object.Equals(a, null)))
{
    // animal is now of type I<IQuadruped, IAnimal>.
    // Any operation supported by either interface is supported by the combined interface.
    // And you only have to reference one variable.
}

In short, I<Interface, ...> combines interfaces into one interface, without having to create an IQuadrupedMammal interface, go back into your code, and refactor, which may not be possible or may be impractical. IQuadrupedMammalReallyWeirdFromAustralia...
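For comparison, here is a minimal sketch of how the two-cast loop can be written in current C# (7.0 or later) using one type pattern per interface. The LegCount member, the Snake class, and the Demo helper are hypothetical names added only for illustration. Note that it still needs one variable per interface, which is the limitation this proposal is about.

```csharp
using System;
using System.Collections.Generic;

public interface IQuadruped { int LegCount { get; } }   // hypothetical member
public interface IMammal { int NumberOfHairs { get; } }

public class Dog : IQuadruped, IMammal
{
    public int LegCount => 4;
    public int NumberOfHairs => 1000;
}

public class Snake { }   // implements neither interface

public static class Demo
{
    // Counts the items that implement both interfaces.
    public static int CountQuadrupedMammals(IEnumerable<object> animals)
    {
        int count = 0;
        foreach (var animal in animals)
        {
            // Two type tests; q and m refer to the same object.
            if (animal is IQuadruped q && animal is IMammal m)
            {
                if (q.LegCount == 4 && m.NumberOfHairs > 0) count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        var animals = new object[] { new Dog(), new Snake(), new Dog() };
        Console.WriteLine(CountQuadrupedMammals(animals)); // prints 2
    }
}
```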

MouseProducedGames commented 9 years ago

I used IAnimal a few times instead of IMammal. Doesn't change the point or anything, but I don't know how to go back and edit out those word errors.

Corey-M commented 9 years ago

This can be done by creating composite interfaces and applying them to the classes:

public interface IQuadrupedMammal : IQuadruped, IMammal
{
}

public class Dog : IQuadrupedMammal { /*...*/ }

foreach (var animal in animals.OfType<IQuadrupedMammal>())
{
}

Unfortunately this would require you to create such a composite for every combination of interfaces you want to test/use and apply all of the relevant composites to every qualifying class. Especially where you have overlapping composites this has the potential to get out of hand very quickly and certainly damages maintainability.

Perhaps a better alternative is to allow duck typing on interfaces, either in general or by the introduction of a new interface type: implicit interface. I have no idea whether this is possible without changes to the CLR.

aluanhaddad commented 9 years ago

I think this is a special case for a general feature: Expressing intersection types. I don't think the proposed syntax would work particularly well, as it would clash with the syntax for generics. How about

IQuadruped with IMammal with IAnimal

a la Scala or

IQuadruped & IMammal & IAnimal

a la TypeScript?

As a side note, this would be very useful in the context of the proposed pattern matching feature.

alrz commented 9 years ago

once again, conjunctive patterns would solve this.

if(obj is IQuadruped and IMammal and IAnimal) {}

although, if you want variables, you should define one per interface

foreach (var animal in animals) {
    if(animal is IQuadruped q and IMammal m and IAnimal a) {
        ...
    }
}

I think this is more practical than "type intersections", since C# doesn't allow such syntax in catch clauses either.

aluanhaddad commented 9 years ago

@alrz the problem with

you should define one per interface

e.g.

if (animal is IMammal m and IQuadruped q and ICarnivore c) { ... }

is that it is insufficient for many cases. You would frequently want to have a single variable.

For example suppose I have a method

void ProcessCarnivorousQuadruped<T> (T cq) where T: ICarnivore, IQuadruped { ... }

There is no way to call it from the if block.

At any rate I do not think that the pattern matching proposal needs to cover intersection types. Rather, I think the ability to specify intersection types, as variables, fields, etc. would naturally enrich pattern matching by implication.

alrz commented 9 years ago

The nice thing about conjunctive patterns is that since they are defined under pattern they would be used in a wider context, while you can already use generic constraints for that case:

void F<T>(IEnumerable<T> animals) where T :  IMammal, IQuadruped, ICarnivore {
    foreach(var animal in animals) {
        ...
    }
}

VB syntax is even closer to what you've suggested:

Sub F(Of T As {IMammal, IQuadruped, ICarnivore})(animals As IEnumerable(Of T))
    For Each animal In animals
        ...
    Next
End Sub 

What do type intersections do that can't be done with generic constraints?

aluanhaddad commented 9 years ago

@alrz the ability to write

void F<T>(IEnumerable<T> animals) where T :  IMammal, IQuadruped, ICarnivore {
    foreach(var animal in animals) {
        ...
    }
}

How do I pass an argument to F? e.g. how does this work?

object x = ...
switch (x)
{
    case IMammal m and IQuadruped q and ICarnivore c:
    F(???);
}

Generic constraints are wonderful for intersecting multiple types for the callee, but don't do anything for the caller.

alrz commented 9 years ago

You shall pass the x itself, which (I suppose) is of some type that implements all three interfaces.

aluanhaddad commented 9 years ago

Right, but how do you pass it? It cannot be cast, since there is again no way to refer to the type. Dynamic would work, but that is not the intention.

Also, there are other advantages to being able to write intersection types. Suppose a future version of the language evolves to support Scala style instance level mixins, this would provide a natural way of referring to them.
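The dynamic escape hatch mentioned above can be sketched roughly as follows. Casting the argument to dynamic defers overload resolution and generic type inference to run time, where the binder sees the concrete type and checks the constraints then. The Wolf and Cow classes and the TryProcess helper are hypothetical, and the sketch assumes the project references the C# runtime binder (the default for SDK-style projects).

```csharp
using System;

public interface IAnimal { }
public interface ICarnivore : IAnimal { }
public interface IQuadruped : IAnimal { }

public class Wolf : ICarnivore, IQuadruped { }   // hypothetical
public class Cow : IQuadruped { }                // hypothetical

public static class Demo
{
    public static string ProcessCarnivorousQuadruped<T>(T cq)
        where T : ICarnivore, IQuadruped
        => "processed " + typeof(T).Name;

    public static string TryProcess(IAnimal animal)
    {
        if (animal is ICarnivore && animal is IQuadruped)
        {
            // T is inferred from the runtime type of the argument.
            // If the constraints were not met, the call would fail at run time
            // rather than at compile time.
            return ProcessCarnivorousQuadruped((dynamic)animal);
        }
        return "skipped";
    }

    public static void Main()
    {
        Console.WriteLine(TryProcess(new Wolf())); // prints "processed Wolf"
        Console.WriteLine(TryProcess(new Cow()));  // prints "skipped"
    }
}
```

The cost is that the compiler no longer verifies the call, which is why it is described above as not being the intention.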

alrz commented 9 years ago

You mean it's an object? How would that get there? I would declare an overload for the method enclosing the switch, so it will be known that it implements those interfaces. If it's upcast to object, what is the point of these interfaces in the first place?

aluanhaddad commented 9 years ago

The idea is you have some general animal handler function like

void ProcessAnimals(IEnumerable<IAnimal> animals) { ... } 

So the type may not be object, but it may not be specific enough. Furthermore, and this was a key point made by @MouseProducedGames in proposing this, suppose you do not have an ICarnivorousMammalianQuadruped interface in the first place.

Even if you did, overloading the enclosing method does not work in any case because you could not pass an IAnimal to it. If you already knew the type statically to the point where you could rely on overload resolution, then using the pattern matching switch would be redundant.
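The point about overload resolution can be illustrated with a small sketch. Overloads are chosen from the static type of the argument, so a value typed as IAnimal always reaches the general overload unless a type test narrows it first. The Wolf class and the Process and Dispatch methods are hypothetical names.

```csharp
using System;

public interface IAnimal { }
public interface ICarnivore : IAnimal { }
public class Wolf : ICarnivore { }   // hypothetical

public static class Demo
{
    public static string Process(IAnimal animal) => "general";
    public static string Process(ICarnivore carnivore) => "carnivore";

    // Narrowing with a type test is what makes the specific overload reachable.
    public static string Dispatch(IAnimal animal)
    {
        if (animal is ICarnivore c)
        {
            return Process(c);
        }
        return Process(animal);
    }

    public static void Main()
    {
        IAnimal a = new Wolf();
        Console.WriteLine(Process(a));   // prints "general": the static type is IAnimal
        Console.WriteLine(Dispatch(a));  // prints "carnivore"
    }
}
```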

alrz commented 9 years ago

Well, that is not what you would get from polymorphism. "A polymorphic type is one whose operations can also be applied to values of some other types". If these operations vary from type to type, then you should use virtual dispatch or method overloads to specialize these operations for a specific type.

So, in your example, ProcessAnimals shall be overloaded for operations that cannot be done to a general animal. And the compiler would infer appropriate method from the target type.

aluanhaddad commented 9 years ago

@alrz then what is the justification for pattern matching on a type at all? Single dispatch object oriented polymorphism has certain inherent limitations. This is one reason why solutions such as pattern matching, the Visitor Pattern, and many others exist. You are basically saying that there are never valid situations where you do not know the type at compile time but you want to treat it differently depending on its runtime type.

alrz commented 9 years ago

When you're using record types in patterns, that is called pattern matching, because you're actually matching a pattern like Expr(Const(0), Const(1)) against an object, but what you're doing here is just a type test, which of course is the basis of pattern matching. By the way, F# uses a different syntax for type tests:

match x with
| :? IMammal as m
& :? IQuadruped as q
& :? ICarnivore as c -> ...

aluanhaddad commented 9 years ago

But the example does not involve a sum type that can be decomposed into different components. Expr(Const(0), Const(1)) is composition not inheritance (the fact that Const implements Expr is irrelevant in this case).

Suppose I have a hierarchy

interface IWord { string Text { get; } }
interface INoun : IWord { IVerb SubjectOf { get; } }
interface IVerb : IWord { INoun Subject { get; } }

We can agree that nouns and verbs are words, but there are additional properties that are not common between them such that they can be extracted into the base type. Nevertheless it may be valuable to have an IEnumerable<IWord>.

alrz commented 9 years ago

I have a hierarchy

Yeah, that's inheritance.

Why don't you do this:

void F(IEnumerable<IWord> words) {
    foreach(var word in words) {
        switch(word) {
            case INoun n: ... // operations to nouns
            case IVerb v: ... // operations to verbs
        }
    }
}

If these operations are exactly the same, you should do it on the base IWord.

aluanhaddad commented 9 years ago

But now, imagine I define the following

interface IPresentParticiple : INoun, IVerb { }

Of course I can define that interface, but there are other types that implement both interfaces but are not present participles and I would like to be able to treat them the same way. With generic constraints I can define a method that processes anything that implements both INoun and IVerb, but I cannot call such a method without casting a word to a specific interface. Which interface is that?

alrz commented 9 years ago

If there is a "same way", then they both should be derived from the same type. If your class/interface hierarchy was well designed, you wouldn't encounter these problems at all.

So this is very rare, but if it happens, conjunctive patterns.

ufcpp commented 9 years ago

I like this concept, but IMO, its syntax should be consistent with that of generic constraints. How about the following:

x is T t where T : IQuadruped, IMammal

aluanhaddad commented 9 years ago

@alrz In many cases, that would be symptomatic of an ill thought out hierarchy, but sometimes you have multiple axes of inheritance that cut across one another due to the semantics of the domain.

At any rate deriving such combinations from a base type, here it could be INounAndVerb, is possible, even easy, but it is unpleasant because you have to define and extend an interface for every combination. That aside, conjunctive patterns, as you propose them, do not suffice because they do not give you a single reference to the matched value.

Also the number of combinations can get out of hand quickly. Going back to the original example:

interface IAnimal { }
interface IReptile : IAnimal { }
interface IMammal : IAnimal { }
interface ICarnivore : IAnimal { }
interface IMammalLikeReptile : IReptile, IMammal { } // of paleontologic interest
interface ICarnivorousReptile : IReptile, ICarnivore { }
interface ICarnivorousMammal : IMammal, ICarnivore { }
//.. and on and on 

@ufcpp that seems quite nice.

alrz commented 9 years ago

Which ones do you want to combine then?

aluanhaddad commented 9 years ago

Quite possibly all of them, but it will evolve as functionality increases. If I add

interface ICarnivorousMammalLikeReptile : IMammalLikeReptile, ICarnivore { }

I want to be able to match the type of carnivorous reptiles against it. Which requires me to write

interface ICarnivorousMammalLikeReptile : 
    IMammalLikeReptile, 
    ICarnivorousReptile, 
    ICarnivorousMammal { }

alrz commented 9 years ago

What kind of operation would that be? That doesn't make sense at all. You are just mentioning types and combinations, not operations.

aluanhaddad commented 9 years ago

How does it not make sense? Perhaps I have some logic that takes into account aspects of reptiles in determining their likelihood of successfully catching prey. It's an example, but there are use cases. If conjunctive patterns make sense, then it makes sense for something to implement a variety of possibly disparate interfaces, so the combinations make sense by implication.

Anyway, there are languages that have this feature, for example TypeScript using syntax

T & U

and Scala using syntax

T with U

alrz commented 9 years ago

of reptiles

So the method you are looking for would be defined in the IReptile interface. Hence, you won't need any combination.

aluanhaddad commented 9 years ago

Negative, it would be a generic method

void AnalyzeReptilianHuntingBehavior<T>(T hunter) where T: IReptile, ICarnivore { ... }

And I may pass an ICarnivorousMammalLikeReptile or an ICarnivorousReptile to it by using pattern matching or type testing to determine that the actual value implements both interfaces, without needing to know which combination interface it implements. Otherwise, I have to cast, and the cast may fail, so I have to try casting to every known combination, and it will break if new combinations are added.

alrz commented 9 years ago

You are just twisting the question without knowing your hierarchy. See, carnivorousness is not even in the classification, it's a property of a general animal. That's where you have to use composition.

aluanhaddad commented 9 years ago

I just named it that for readability. ICarnivore is in the hierarchy. Why do you presume it is a property of the base type? Anyway, I fail to see the resistance to the example given that you are proposing

if (a is IReptile r and ICarnivore c) { ... } 

which implies the combination can exist. I am simply proposing a way to refer to the matched value's aggregate type implied by the success of the pattern. So

if (a is IReptile and ICarnivore rc) { ... } 

but I think this is more general than pattern matching. The point you initially brought up about generic constraints validates this. It's just from the caller's point of view.

alrz commented 9 years ago

There is no combination, it's just two individual type tests.

aluanhaddad commented 9 years ago

This is about having cross cutting classifications for types along multiple axes and the ability to express that to perform different operations depending on different combinations. My point is that if it makes sense as a constrained generic type, and if it also makes sense as a pattern, then it makes sense for there to exist some value of said type and to be able to refer to its type. I fail to see the problem.

alrz commented 9 years ago

OK, how would type intersection syntax (if it's intended to be used as a pattern) not conflict with conjunctive patterns? The former is only valid in type tests (to be more precise, just interfaces) while the latter works with all kinds of patterns (as long as they're pattern compatible).

aluanhaddad commented 9 years ago

That's a fair question. I would argue that both features have value and are complementary. As far as syntax, I have not proposed one. There could easily be a syntax for type intersection that does not conflict in any way with conjunctive patterns. I merely wrote

if (a is IReptile and ICarnivore rc) { ... } 

to illustrate that the validity of conjunctive patterns implies the semantic validity of intersection types. @ufcpp proposed a syntax that would not conflict but it has other issues. I don't think the syntax matters so long as it does not conflict.

alrz commented 9 years ago

I don't know if this is ambiguous with #5402, since no lexical grammar is mentioned in the suggestion.

I think there are already plans for these. See the #6212 comment (brilliant discussion, by the way).

aluanhaddad commented 9 years ago

I'll try to come up with a lexical grammar and propose it. May I suggest you create a separate proposal for conjunctive and/or disjunctive patterns? #5402 has a lot of ideas and implications, and I see conjunctive and disjunctive patterns as orthogonal to expression forms for statements. (Yes, the discussion is excellent.)

alrz commented 9 years ago

@aluanhaddad That wasn't part of it at first, but conversation goes on and the proposal changes several times. It's fixed now.

HaloFour commented 9 years ago

I'd think that intersection should be a type pattern unto itself rather than trying to shoe-horn it into "conjunctive patterns" with odd variable declaration.

if (x is (IDisposable && IFormattable) y) {
    Console.WriteLine(y.ToString(format, null));
    y.Dispose();
}

This would be syntactically equivalent to:

if (x is IDisposable y1 && x is IFormattable y2) {
    Console.WriteLine(y2.ToString(format, null));
    y1.Dispose();
}

The compiler would emit a local for each type, assign them all to the same instance and silently switch between the locals depending on the context of its use. I initially thought that this might be an issue, especially if y was passed to a method as ref, but since pattern variables are readonly I don't think that would be an issue.

What would likely have to be resolved is the new typing that this would introduce. To my knowledge C# has no concept of treating a variable as a set of multiple types. The closest you'd get is with generic constraints but even then the compiler/tools know the variable as the generic type, not as the separate constraint interfaces.

alrz commented 9 years ago

@HaloFour Exactly, this cannot be compared to generic "constraints"; this would be a new kind of type, I guess.

aluanhaddad commented 9 years ago

@HaloFour exactly my point. This is the motivation for intersection types as a separate language feature. As you say it would require a new kind of type construct. If they were only available as local declarations the compiler could generate a type behind the scenes which implements the intersected types with methods that delegate to the actual value, but there are a number of details that need to be worked out. I think it would be worthwhile.

lmcarreiro commented 8 years ago

What happens when two interfaces both contain a member with the same name, but with different implementations (explicit implementations)?

I think that it would be complicated, even if the implementations are the same (implicit implementation), because the member gets duplicated in the class's method table... at runtime there is no difference, CLR doesn't know if the implementations are the same or not.

HaloFour commented 8 years ago

@lmcarreiro

Personally I'd go with overload resolution in the order in which the intersected interfaces are defined:

public interface IFoo {
    void Hello();
}

public interface IBar {
    void Hello();
}

public class FooBar : IFoo, IBar {
    public void Hello() {
        Console.WriteLine("Foo!");
    }

    void IBar.Hello() {
        Console.WriteLine("Bar!");
    }
}

(IFoo && IBar) test1 = new FooBar();
(IBar && IFoo) test2 = new FooBar();

test1.Hello(); // prints Foo!
test2.Hello(); // prints Bar!

I can appreciate that there may be some confusion that IFoo && IBar and IBar && IFoo wouldn't behave identically, though.

alrz commented 8 years ago

This situation is currently reproducible,

interface A {void M();}
interface B {void M();}

void M<T>(T t) where T : A, B => t.M(); // ERROR

The only place that I've encountered in which order becomes significant is with covariants (example). In that case, the compiler can not possibly know about the ambiguity.
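For context, the ambiguity shown above can be resolved in current C# by casting to one of the interfaces at the call site, which is roughly the choice an intersection type would have to make implicitly. This is a minimal sketch: it reuses IFoo, IBar, and FooBar from the earlier comment with Hello changed to return a string so the result is observable, and the ViaFoo and ViaBar helpers are hypothetical.

```csharp
using System;

public interface IFoo { string Hello(); }
public interface IBar { string Hello(); }

public class FooBar : IFoo, IBar
{
    public string Hello() => "Foo!";   // implicit implementation, used for IFoo
    string IBar.Hello() => "Bar!";     // explicit implementation for IBar
}

public static class Demo
{
    // t.Hello() alone would be ambiguous here; the cast selects the interface.
    public static string ViaFoo<T>(T t) where T : IFoo, IBar => ((IFoo)t).Hello();
    public static string ViaBar<T>(T t) where T : IFoo, IBar => ((IBar)t).Hello();

    public static void Main()
    {
        var fb = new FooBar();
        Console.WriteLine(ViaFoo(fb)); // prints "Foo!"
        Console.WriteLine(ViaBar(fb)); // prints "Bar!"
    }
}
```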

Pzixel commented 7 years ago

It would be a great feature. For example, I'm using a code-generating method which returns a class for a specific interface:

public T GenerateMyClass<T>(){...}

But if my inner class is, for example, IDisposable, I have only two possibilities:

  1. Just implement it and return the base interface. It then has to be manually cast to IDisposable to free resources, or
  2. I have to force my users to inherit the IDisposable interface themselves by using where T : IDisposable, even when T should not be IDisposable.

With this feature it would be easy to write:

public T & IDisposable GenerateMyClass<T>(){...}

which would accomplish it without any cons.

TonyValenti commented 7 years ago

I really like the following syntax for declaring variables:

var x is IAnimal, IMammal = Something();

if(y is IAnimal, IMammal Z){
    ...
}

As noted above, this keeps the syntax similar to that of constraints.

Pzixel commented 7 years ago

@TonyValenti it's ruining current C# style so it won't be applied. We have two syntaxes: C-like, where you declare the type and then the variable name, and Pascal-like, where you specify the type after the variable name. The latter is used in languages with strong type inference, and C# 1.0 wasn't one. Thus, we will have C-style until C#'s death.

So answering your question, it's easy to just forbid explicit type specification in those cases (like it is for anonymous types). So it's quite easy:

var x = (object) Something();
var y = (IAnimal, IMammal) x;
var z = x as IAnimal, IMammal;

Pzixel commented 7 years ago

@HaloFour

The compiler would emit a local for each type, assign them all to the same instance and silently switch between the locals depending on the context of its use. I initially thought that this might be an issue, especially if y was passed to a method as ref, but since pattern variables are readonly I don't think that would be an issue.

But what if we call a method where both aspects are required? I think this feature should generate an intersection interface for every place where we use this syntax and add it to all classes which implement both of them. It's probably better to generate the interface on the fly at runtime, but it doesn't matter, because what matters is that it definitely solves the problem with, for example, IDisposableList:

if (a is (IList<int>, IDisposable) disposableList)
{
   Consume(disposableList);
}

...

public void Consume<T, TItem>(T list) where T : IList<TItem>, IDisposable
{
    var index = list.IndexOf(default(TItem));
    list.Dispose();
}

There is no way to do it with a silent switch.

HaloFour commented 7 years ago

@Pzixel

It'd be doable, but it'd be messy. I don't know that the compiler would have a choice but to resort to reflection to obtain the open generic type/method and then close it with the original type of a. Then the compiler would invoke the member dynamically. It would carry a bit of overhead.
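The reflection route described above can be sketched by hand, which gives a feel for the overhead involved. This is only an illustration of the idea, not what a compiler would actually emit. The Analyze and AnalyzeIfPossible methods and the Crocodile and Tortoise classes are hypothetical.

```csharp
using System;
using System.Reflection;

public interface IReptile { }
public interface ICarnivore { }
public class Crocodile : IReptile, ICarnivore { }   // hypothetical
public class Tortoise : IReptile { }                // hypothetical

public static class Demo
{
    public static string Analyze<T>(T hunter) where T : IReptile, ICarnivore
        => "analyzed " + typeof(T).Name;

    public static string AnalyzeIfPossible(object value)
    {
        if (!(value is IReptile && value is ICarnivore)) return "skipped";

        // Obtain the open generic method, then close it over the runtime type.
        MethodInfo open = typeof(Demo).GetMethod(nameof(Analyze));
        MethodInfo closed = open.MakeGenericMethod(value.GetType());

        // Late-bound invocation: no compile-time checking, extra runtime cost.
        return (string)closed.Invoke(null, new object[] { value });
    }

    public static void Main()
    {
        Console.WriteLine(AnalyzeIfPossible(new Crocodile())); // prints "analyzed Crocodile"
        Console.WriteLine(AnalyzeIfPossible(new Tortoise()));  // prints "skipped"
    }
}
```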

Another option would be for the compiler to emit a proxy type which does meet the constraint and to use the generic type/method with that proxy type. That would eliminate the overhead of reflection but the consuming code would be dealing with a different type which could have other unwanted consequences.
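A hand-written version of the proxy idea might look like the sketch below. It uses IDisposable and IFormattable to keep the proxy short; the DisposableFormattableProxy, Temperature, and Consume names are hypothetical. The output shows the drawback mentioned above: the generic method is closed over the proxy type, not the original type.

```csharp
using System;

// Hypothetical type implementing both interfaces.
public sealed class Temperature : IDisposable, IFormattable
{
    public bool Disposed { get; private set; }
    public void Dispose() => Disposed = true;
    public string ToString(string format, IFormatProvider formatProvider) => "21 C";
}

// Proxy that satisfies both constraints and delegates to the wrapped value.
public sealed class DisposableFormattableProxy : IDisposable, IFormattable
{
    private readonly object _target;

    public DisposableFormattableProxy(object target)
    {
        if (!(target is IDisposable) || !(target is IFormattable))
            throw new ArgumentException("Target must implement both interfaces.");
        _target = target;
    }

    public void Dispose() => ((IDisposable)_target).Dispose();

    public string ToString(string format, IFormatProvider formatProvider)
        => ((IFormattable)_target).ToString(format, formatProvider);
}

public static class Demo
{
    public static string Consume<T>(T value) where T : IDisposable, IFormattable
    {
        string text = value.ToString("G", null);
        value.Dispose();
        return typeof(T).Name + ": " + text;
    }

    public static void Main()
    {
        object x = new Temperature();
        // Prints "DisposableFormattableProxy: 21 C", so T is the proxy, not Temperature.
        Console.WriteLine(Consume(new DisposableFormattableProxy(x)));
    }
}
```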

I don't see how the compiler could ever support that correctly without the CLR understanding the concept of intersection types. Otherwise there's no way for the compiler to correctly close that generic at compile-time.

jnm2 commented 7 years ago

I'm pretty sure that this would just end up being frustrating without CLR support.

Pzixel commented 7 years ago

@HaloFour I see your point. For example, we may ask for if (a is (IFoo, IBar) foobar), and the behaviour may differ based on whether there is an IFooBar interface we can use or whether we should generate another one. I think there is a simple answer: always generate a proxy interface. It's fine with anonymous types, so why not generate anonymous interfaces? If the user wants a specific interface, they can always manually cast to IFooBar instead of (IFoo, IBar), so if they write this code, they are explicitly asking us to generate a proxy interface because they are too lazy to write IQuadrupedMammalReallyWeirdFromAustralia for every combination of interfaces required.

HaloFour commented 7 years ago

@Pzixel

I don't understand what a generated composite interface would accomplish. If the underlying type doesn't implement that interface, it still couldn't be cast to it or used as a generic type argument.

That also doesn't take into account the possibility of base types being included in the intersection.

Pzixel commented 7 years ago

@HaloFour it would address the situation where you need methods from interfaces IA and IB, but there is no common divider for them. And the underlying type can't implement an interface which does not exist yet. This is exactly what we are trying to do: allow the compiler to generate this interface for us. You can specify "I take everything that implements IFoo and IBar", but you cannot pass a variable of type "Foo and Bar". It's not fair, I think.

That also doesn't take into account the possibility of base types being included in the intersection.

That's not true. I said: always generate another interface. So if there is already an IFooBar interface, we just ignore it, because it may add additional methods, which is NOT what type intersection is.