dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
18.93k stars 4.02k forks source link

Proposal: Virtual arguments in methods #176

Closed mariusGundersen closed 8 years ago

mariusGundersen commented 9 years ago

Currently the virtual keyword can be used to mark a method as being virtual, so that calling the method on an object with the interface type will call it on the actual type:

ISomething something = GetOneImplementationOfISomething();
something.DoIt();//this will call the correct DoIt based on the actual type of something

Unfortunately the same thing cannot be done with a method that gets a parameter, for example:

private void DoIt(ISomething something){
  //this is the only one that gets called
}
private void DoIt(SomethingImplementation something){
  //this one is never called
}

ISomething something = GetOneImplementationOfISomething();
DoIt(something);

This means that extension methods can't be virtual today.

I therefore propose that the virtual keyword can be declared on one of the arguments to a method if the method isn't marked as virtual (thereby restricting it to single dispatch and not multiple dispatch). For example:

public static class Extensions
{
    public static string NameOf(virtual this ISomething something)
    {
        return "interface";
    }
    public static string NameOf(virtual this SomethingOne something)
    {
        return "one";
    }
    public static string NameOf(virtual this SomethingTwo something)
    {
        return "two";
    }
}
//...
var s = GetSomething();
s.NameOf();
//or 
Extensions.NameOf(s);

This is currently possible today using dynamic, which means that, afaict, this proposal is only syntax sugar on top what is currently possible.

lyubomyr-shaydariv commented 9 years ago

What if I accidentally import another class declaring such virtual extension methods? Let's say:

public interface IFoo {}
public interface IBar : IFoo {}

namespace A {
    // ...
    public static void Ext(virtual this IFoo foo) { ... }
    public static void Ext(virtual this IBar bar) { ... }
}

namespace B {
    // ...
    public static void Ext(virtual this IBar bar) { ... };
}

using A;
using B; // both namespaces are required
...
IFoo foo = GetFoo(); // suppose we give a Bar here
foo.Ext();

Which Ext() should be invoked? Or should I write a bridge method then to resolve A vs B conflicts?

mariusGundersen commented 9 years ago

What would happen if you did this without using virtual? I would think the same issue exists in the case of importing two extension methods with the same name on the same type from two different namespaces.

lyubomyr-shaydariv commented 9 years ago

True, in the existing non-virtual case it gives error, because the method resolution is compile-time in this case, and it's real ambiguity. However, what if, in the example above, GetFoo() returns another instance of IFoo, not IBar that's derived from IFoo? The compiler cannot be aware of the real type of the returning value, but with the above virtual extensions foo.Exit() is legit in this case, over IFoo, but not IBar. This probably should give a compiler warning, and fail in run-time if the returning value is IBar.

mariusGundersen commented 9 years ago

I would imagine that if an implementation of IFoo for which there exists no Ext method for, it would call the base, ie Ext(virtual this IFoo). That is, it would work exactly the same as if you created a virtual method in the base but didn't implement the method in the derived class.

After thinking about this for a bit I think the override keyword should be used in addition to virtual in the code example:

public static class Extensions
{
    public static string NameOf(virtual this ISomething something) { ... }
    public static string NameOf(override this SomethingOne something) { ... }
    public static string NameOf(override this SomethingTwo something) { ... }
}
HaloFour commented 9 years ago

Would this more or less be syntax candy for the following?

public static class Extensions {
    public static string NameOf(this ISomething something) {
        if (something is SomethingOne) {
            return NameOf((SomethingOne)something);
        }
        if (something is SomethingTwo) {
            return NameOf((SomethingTwo)something);
        }
        // default logic here
    }

    public static string NameOf(this SomethingOne something) { ... }

    public static string NameOf(this SomethingTwo something) { ... }
}
ilmax commented 9 years ago

Yes, dynamic dispatch is a really nice feature to have, in the meantime you could achieve same effect with the visitor pattern. I'm not sure if I like the syntax, but like the idea

mariusGundersen commented 9 years ago

Yes, it can be implemented using dynamic dispatch, like this:

public static class Extension
{
  public static string DoIt(this IFoo something){
    Extension.DoingIt(something as dynamic);
  }
  private static string DoingIt(IFoo something){
    return "fallback";
  }
  private static string DoingIt(Bar something){
    return "it is a Bar";
  }
  private static string DoingIt(Baz something){
    return "it is a Baz";
  }
}

But dynamic dispatch is much slower than a virtual call, which has almost no overhead. I don't see any reason why this can't be implemented as a virtual call, which would improve the syntax and the performance.

ilmax commented 9 years ago

As I said before, I like the feature, but not sure on syntax. About dynamic dispatch I was just referring to it as a high level concept, using dynamic keyword is a way to achieve it, but as you pointed out is slow and suffer from another major issue: it completely stop the compiler to be smart, so any static check is disabled after the dynamic keyword. To solve this, the visitor pattern is here to help you, the visitor pattern actually achieve dynamic dispatch via 2 method call, the object that should be visited "accept" the visitor and the call visit on the visitor passing itself as a parameter. This approach has some benefits, but requires a bit of code so any help from the compiler is appreciated here.

mariusGundersen commented 9 years ago

it completely stop the compiler to be smart, so any static check is disabled after the dynamic keyword.

This is an important point.

The syntax I posted is just a suggestion, I'd love to bikeshed something better.

mariusGundersen commented 9 years ago

So what would be the next steps for me to do to get this considered for C#7?

Eyas commented 9 years ago

@mariusGundersen

But dynamic dispatch is much slower than a virtual call, which has almost no overhead. I don't see any reason why this can't be implemented as a virtual call, which would improve the syntax and the performance.

Virtual calls use a virtual table that is bound an instance of a class that point to the memory location of the instructions of its methods.

To implement this as virtual methods, this implies that code in one assembly will be modifying code in other assemblies. Consider, for instance, a virtual method that operates on ICollection, IList, List, and HashSet. Under the hood, the compiler is supposed be generating extra information of the virtual tables of instances of these types, that points to the correct implementation in each case.

As others have said, this is really just syntactic sugar on top of the visitor pattern. Syntactic sugar is great in many cases, but in this case it hides ugly things. Method T Foo(virtual X bar) actually modifies type X. Leaving aside whether or not this is possible (X might be final, unclear if you want one assembly modifying another, etc.), it is also misleading.

mariusGundersen commented 9 years ago

Virtual calls use a virtual table that is bound an instance of a class that point to the memory location of the instructions of its methods.

Thanks for this explanation (and the example), it makes it much clearer to me now.

So the reason extension methods work is because they use normal call rather than callvirt, and therefore don't need to access the vtable?

So if I override a virtual method of a class in another assembly, would I not have (as you said) "one assembly modifying another"? And the compiler could check if the class used (X in your example) is final (I'm guessing you mean sealed?) and refuse the virtual extension method, since there are no subclasses for it anyways?

The only thing I don't get from your comment is what ugly things it hides or how it is misleading. Could you clarify this?

HaloFour commented 9 years ago

@mariusGundersen

You can think of the vtable as being protected, derived types are allowed to (and must) alter it in order to point to the correct address of the overridden function.

For virtual static methods you'd need a different mechanism altogether. If the "overrides" were limited to the same static class the compiler could potentially just emit a series of nested type checks:

public static class MyExtensions {
    public static string DoIt(virtual object value) {
        return "nope";
    }

    public static string DoIt(override Foo value) {
        return "It's a foo!";
    }

    public static string DoIt(override Bar value) {
        return "It's a bar!";
    }
}
// translates to
public static class MyExtensions {
    public static string DoIt(object value) {
        if (value is Foo) return DoIt((Foo)value);
        if (value is Bar) return DoIt((Bar)value);
        return "nope";
    }

    public static string DoIt(Foo value) {
        return "It's a foo!";
    }

    public static string DoIt(Bar value) {
        return "It's a bar!";
    }
}

The emitted type checks would have to observe type hierarchies so that the most derived type is checked first. Interfaces throw a series wrench into that equation as you can't really consider them to be more or less specific than any particular base class (short of maybe trying to use reflection at runtime to try to determine how far back the inheritance chain that the interface was first implemented).

To expand that beyond a single static class you'd need a vtable-like mechanism to be defined within the static class where type/implementation pairs could be registered. Then the question is how do those implementations get registered and how do you deal with scenarios like duplicate/ambiguous registrations. The caller isn't responsible for making that determination since the caller isn't the maintainer of that "vtable", it would simply call that initial "virtual" static method and hope for the best.

Eyas commented 9 years ago

So the reason extension methods work is because they use normal call rather than callvirt, and therefore don't need to access the vtable?

I believe so. I actually didn't look at the generated IL (probably a good idea to do so).

So if I override a virtual method of a class in another assembly, would I not have (as you said) "one assembly modifying another"?

Nope. A "virtual table" is bound to each instance/class. By marking a method virtual in the superclass, you know that you need to look up its entry in the instance vtable. The value of the pointer, however, only depends on the subclass you wrote.

If the superclass method wasn't marked as virtual, then you wouldn't be able to make an overriding method dispatch properly. Your only option would be using the new keyword and method hiding, but this will have the same flaws as the function you described in your original post; you can still call the "hidden" method if the subclass is upcast.

mariusGundersen commented 9 years ago

If the "overrides" were limited to the same static class

Yes, that is my assumption

The emitted type checks would have to observe type hierarchies so that the most derived type is checked first. Interfaces throw a series wrench into that equation as you can't really consider them to be more or less specific than any particular base class (short of maybe trying to use reflection at runtime to try to determine how far back the inheritance chain that the interface was first implemented).

That's a very good point. This is why C# doesn't have multiple inheritance, right? But this could be an issue with default methods in interfaces (as proposed for C#7), and the compiler would need to detect it and enforce an override method in implementations. For example,

interface IA{ string Method(); }
interface IB : IA { string Method(){ return "B"; } }
interface IC : IA { string Method(){ return "C"; } }
class D: IB, IC {} //compiler error: ambigous implementation of Method

The same solution would be needed for virtual extension methods

HaloFour commented 9 years ago

@mariusGundersen

Sort of. That limitation does exist in Java and with default methods there now exists the possibility that there would be implementation collisions which do result in an error at compile time. But the CLR supports explicit implementation and that issue largely doesn't exist. Interfaces also don't really form a proper hierarchy like classes do, a child interface can't "override" a method, only shadow it. IA::Method, IB::Method and IC::Method are three separate methods and a class implementing all three interfaces (which would happen implicitly just by implementing IB and IC) would have three virtual slots that would each have to be assigned. Because of this it's quite possible that default interface methods on the CLR wouldn't have the same issue. IB::Method and IC::Method couldn't provide an implementation for IA::Method, they could only do so for their own shadowed method. It would be up to the compiler how to resolve IA::Method. It's possible just for the sake of explicitly that omitting the implementation would be forbidden since IA::Method doesn't provide a default implementation and that the developer would be required to either implement Method thus satisfying all three slots, or implement IA::Method explicitly thus satisfying just the one required slot.

// could be a compile-time error, or could potentially assign IA::Method to IB::Method
// based on the order of implemented interfaces
class D : IB, IC { }

class E : IB, IC {
    // legal, implements all three slots
    public string Method() {
        return "E";
    }
}

class F : IB, IC {
    // legal, only implements IA::Method, other two slots assigned to default implementations
    string IA.Method() {
        return "F";
    }
}

The C# compiler, on recognizing a single public method that matches the signature, would assign all three of those slots to the same method. As long as the signature matches the CLR would allow those slots to be assigned to any method of any name and accessibility. C# limits this to explicit implementation where the method is implicitly private and the method name contains the name of the interface, but in VB.NET you're free to use the Implements keyword pretty much at will. For example the following is perfectly legal:

Public Class Foo
    Implements IDisposable

    Protected Sub Close() Implements IDisposable.Dispose
        ' dispose here
    End Sub
End Class

In either case it is the compiler, at compile time, which makes the determination for that specific class how the slots in the vtable are to be assigned.

gafter commented 9 years ago

@mariusGundersen

So what would be the next steps for me to do to get this considered for C#7?

Write a language specification and translation scheme that does not require a CLR change. If any of us believe it would be worthwhile, we'll ask you to provide a prototype in a Roslyn fork. Alternatively, if you believe a CLR change is appropriate, a specification for the language and CLR changes, and also offer to provide resources to make those changes in the top few VM implementations.

mariusGundersen commented 8 years ago

Having looked at this video on (potential) C#7 features, I'm thinking that this feature would be no more complex or ambiguous than pattern matching. In fact, it would probably be a simplification of pattern matching, and could be implemented as a translation scheme that uses pattern matching. I'm therefore going to wait a bit and see how pattern matching will be implemented in the language so I can reuse some of the pattern matching features without having to solve all those problems myself

gafter commented 8 years ago

@mariusGundersen you can start you work as a fork of the features/patterns branch, where pattern matching is implemented.

gafter commented 8 years ago

@mariusGundersen This looks like the language feature "multimethods", which I believe is much more complex than pattern-matching. It probably requires CLR support to be done well. We don't have any plans to do something like this.