dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License

C# Design Notes - catch up edition, Feb 29, 2016 (deconstruction and immutable object creation) #9330

Closed MadsTorgersen closed 8 years ago

MadsTorgersen commented 8 years ago

C# Language Design Notes Feb 29, 2016

Catch up edition (deconstruction and immutable object creation)

Over the past couple of months various design activities took place that weren't documented in design notes. The following is a summary of the state of design regarding positional deconstruction, with-expressions and object initializers for immutable types.

Philosophy

We agree on the following design tenets:

Positional deconstruction, with-expressions and object initializers are separable features, enabled by the presence of certain API patterns on types that can be expressed manually, as well as generated by other language features such as records.

API Patterns

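API patterns for a language feature facilitate two things:

  • Provide actual APIs to call at runtime when the language feature is used
  • Inform the compiler at compile time about how to generate code for the feature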

It turns out the biggest design challenges are around the second part. Specifically, all these API patterns turn out to need to bridge between positional and name-based expressions of the members of types. How each API pattern does that is a central question of its design.

Assume the following running example:

public class Person
{
  public string FirstName { get; }
  public string LastName { get; }

  public Person(string firstName, string lastName)
  {
    FirstName = firstName;
    LastName = lastName;
  }
}

In the following we'll consider extending and changing this type to expose various API patterns as we examine the individual language features.

Here's an example of using the three language features:

var p = new Person { FirstName = "Mickey", LastName = "Mouse" }; // object initializer
if (p is Person("Mickey", *)) // positional deconstruction
{
  return p with { FirstName = "Minney" }; // with-expression
}

Semantically this corresponds to something like this:

var p = new Person("Mickey", "Mouse"); // constructor call
if (p.FirstName == "Mickey") // property access
{
  return new Person("Minney", p.LastName); // constructor call
}

Notice how the new features that use property names correspond to API calls using positional parameters, whereas the feature that uses positions corresponds to member access by name!

Object initializers for immutable objects

(See e.g. #229)

This feature allows an object initializer for which assignable properties are not found, to fall back to a constructor call taking the properties' new values as arguments.

new Person { FirstName = "Mickey", LastName = "Mouse" }

becomes

new Person("Mickey", "Mouse")

The question then is: how does the compiler decide to pass the given FirstName as the first argument? Somehow it needs clues from the Person type as to which properties correspond to which constructor parameters. These clues cannot just be the constructor body: we need this to work across assemblies, so the clues must be evident from metadata.

Here are some options:

1: The type or constructor explicitly includes metadata for this purpose, e.g. in the form of attributes.

2: The names of the constructor parameters must match exactly the names of the corresponding properties.

The former is unattractive because it requires the type's author to write those attributes. It requires the type to be explicitly edited for the purpose.

The latter is better in that it doesn't require extra API elements. However, API design guidelines stipulate that public properties start with uppercase, and parameters start with lower case. This pattern would break that, and for the same reason is highly unlikely to apply to any existing types.

This leads us to:

3: The names of the constructor parameters must match the names of the corresponding properties, modulo case!

This would allow a large number of existing types to just work (including the example above), but at the cost of introducing case insensitivity to this part of the C# language.
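To make option 3 concrete, here is a minimal sketch of the lookup it implies, done with reflection at run time. The real feature would be resolved by the compiler from metadata; `InitializerSketch` and `Create` are hypothetical names invented for this illustration and are not part of any proposal.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Stand-in for: new Person { LastName = "Mouse", FirstName = "Mickey" }
// The values are deliberately given out of constructor order.
var values = new Dictionary<string, object?>
{
    ["LastName"] = "Mouse",
    ["FirstName"] = "Mickey",
};
var p = (Person)InitializerSketch.Create(typeof(Person), values);
Console.WriteLine($"{p.FirstName} {p.LastName}");

public static class InitializerSketch
{
    // Hypothetical helper: finds a public constructor whose parameter names
    // match the supplied property names modulo case, then invokes it.
    public static object Create(Type type, IReadOnlyDictionary<string, object?> propertyValues)
    {
        foreach (ConstructorInfo ctor in type.GetConstructors())
        {
            ParameterInfo[] parameters = ctor.GetParameters();
            if (parameters.Length != propertyValues.Count) continue;

            var args = new object?[parameters.Length];
            bool allMatched = true;
            for (int i = 0; i < parameters.Length; i++)
            {
                // Option 3: firstName matches FirstName when case is ignored.
                string? key = propertyValues.Keys.FirstOrDefault(k =>
                    string.Equals(k, parameters[i].Name, StringComparison.OrdinalIgnoreCase));
                if (key is null) { allMatched = false; break; }
                args[i] = propertyValues[key];
            }
            if (allMatched) return ctor.Invoke(args);
        }
        throw new MissingMethodException(type.Name, ".ctor");
    }
}

public class Person
{
    public string FirstName { get; }
    public string LastName { get; }

    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }
}
```

Switching the comparison to `StringComparison.Ordinal` turns this into option 2, under which the running `Person` example no longer matches.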

With-expressions

(see e.g. #5172)

With-expressions are similar to object initializers, except that they provide a source object from which to copy all the properties that aren't specified. Thus it seems reasonable to use a similar strategy for compilation: call a constructor, this time filling in missing properties by accessing those on the source object.

Thus the same strategies as above would apply to establish the connection between properties and constructor parameters.

p with { FirstName = "Minney" }

becomes

new Person("Minney", p.LastName)

However, there's a hitch: if the runtime source object is actually of a derived type with more properties than are known from its static type, it would typically be expected that those are copied over too. In that case, the static type is also likely to be abstract (most base types are), so it wouldn't offer a callable constructor.

For this situation there needs to be a way that an abstract base class can offer "with-ability" that correctly copies over members of derived types. The best way we can think of is to offer a virtual With method, as follows:

public abstract class Person
{
  ...
  public abstract Person With(string firstName, string lastName);
}

In the presence of such a With method we would generate a with expression to call that instead of the constructor:

p.With("Minney", p.LastName)

We can decide whether to make with-expressions require a With method, or fall back to constructor calls in its absence.

If we require a With method, that makes for less interoperability with existing types. However, it gives us new opportunities for how to provide the position/name mapping metadata through the declaration of that With method: For instance, we could introduce a new kind of default parameter that explicitly wires the parameter to a property:

  public abstract Person With(string firstName = this.FirstName, string lastName = this.LastName);

To explicitly facilitate interop with an existing type, a mandatory With method could be allowed to be provided as an extension method. It is unclear how that would work with the default parameter approach, though.
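The `= this.FirstName` default is proposed syntax; C# requires default parameter values to be compile-time constants. A rough approximation that compiles today, offered only as a sketch, uses `null` as a stand-in for "not specified". The `Student` type and its `Grade` property are assumptions added for illustration, and the approach cannot distinguish an omitted argument from an explicit `null`.

```csharp
#nullable enable
using System;

Person p = new Student("Mickey", "Mouse", 3);
// Roughly what `p with { FirstName = "Minney" }` could lower to.
Person q = p.With(firstName: "Minney");
Console.WriteLine($"{q.FirstName} {q.LastName} ({q.GetType().Name})");

public abstract class Person
{
    public string FirstName { get; }
    public string LastName { get; }

    protected Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    // null means "keep the current value"; this emulates the proposed
    // `string firstName = this.FirstName` default.
    public abstract Person With(string? firstName = null, string? lastName = null);
}

// Hypothetical derived type, used to show that the virtual call keeps its members.
public sealed class Student : Person
{
    public int Grade { get; }

    public Student(string firstName, string lastName, int grade)
        : base(firstName, lastName)
    {
        Grade = grade;
    }

    public override Person With(string? firstName = null, string? lastName = null)
        => new Student(firstName ?? FirstName, lastName ?? LastName, Grade);
}
```

Because the call is dispatched virtually, the result is still a `Student` with its `Grade` intact even though the static type of `p` is `Person`.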

Positional deconstruction

(see e.g. #206)

This feature allows a positional syntax for extracting the property values from an object, for instance in the context of pattern matching, but potentially also elsewhere.

Ideally, a positional deconstruction would simply generate an access of each member whose value is obtained:

p is Person("Mickey", *)

becomes

p.FirstName == "Mickey"

Again, this requires the compiler's understanding of how positions correspond to property names. Again, the same strategies as for object initializers are possible. See e.g. #8415.

Additionally, just as in with-expressions, one might wish to override the default behavior, or provide it if names don't match. Again, an explicit method could be used:

public abstract class Person
{
  ...
  public abstract void GetValues(out string firstName, out string lastName);
}

There are several options as to the shape of such a method. Instead of out-parameters, it might return a tuple. This has pros and cons: there could be only one tuple-returning GetValues method, because there would be no parameters to distinguish signatures. This may be a good or a bad thing.
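As a sketch of the tuple-returning shape, assuming the method keeps the name GetValues and the running `Person` example: since C# does not allow overloads that differ only in return type, a parameterless method like this can exist only once per type.

```csharp
using System;

var p = new Person("Mickey", "Mouse");
// The caller takes the tuple apart positionally.
var (first, last) = p.GetValues();
Console.WriteLine($"{first} / {last}");

public class Person
{
    public string FirstName { get; }
    public string LastName { get; }

    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    // No parameters, so a second GetValues returning a different tuple
    // would be rejected as a duplicate member.
    public (string FirstName, string LastName) GetValues() => (FirstName, LastName);
}
```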

Just as with the With method, we can decide whether deconstruction should require a GetValues method, or should fall back to metadata or to name matching against the constructor's parameter names.

If the GetValues method is used, the compiler doesn't need to resolve between positions and properties: the deconstruction as well as the method are already positional. We'd generate the code as follows:

string __p1;
string __p2;
p.GetValues(out __p1, out __p2);
...
__p1 == "Mickey"

Somewhat less elegant for sure, and possibly less efficient, since the LastName is obtained for no reason. However, this is compiler generated code that no one has to look at, and it can probably be optimized, so this may not be a big issue.
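The expansion above can be written out by hand as a small runnable sketch. `PatternSketch.IsMickey` is a hypothetical stand-in for the code the compiler might generate for `o is Person("Mickey", *)`; the out-parameter shape of GetValues follows the notes, everything else is an assumption.

```csharp
using System;

Console.WriteLine(PatternSketch.IsMickey(new Person("Mickey", "Mouse")));
Console.WriteLine(PatternSketch.IsMickey("not a person"));

public class Person
{
    public string FirstName { get; }
    public string LastName { get; }

    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    // Positional API: the order of the out parameters defines the positions.
    public void GetValues(out string firstName, out string lastName)
    {
        firstName = FirstName;
        lastName = LastName;
    }
}

public static class PatternSketch
{
    public static bool IsMickey(object o)
    {
        // Type test first, as in `o is Person(...)`.
        if (!(o is Person person)) return false;

        // Both values are fetched even though the wildcard ignores the second.
        person.GetValues(out string first, out _);
        return first == "Mickey";
    }
}
```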

Summary

For each of these three features we are grappling with the position-to-property match. Our options:

  1. Require specific metadata
  2. Match property and parameter names, possibly in a case insensitive way
  3. For deconstruction and with-expressions, allow or require specific methods (GetValues and With respectively) to implement their behavior, and possibly have special syntax in With methods to provide the name-to-position matching.

We continue to work on this.

HaloFour commented 8 years ago

:+1: for design notes!

MgSam commented 8 years ago

I second HaloFour's sentiment.

I'll echo my thoughts about positional decomposition from the other thread: I think it's a step in the wrong direction. It makes the code harder to read and understand without using Intellisense. With Autocomplete, the cost of having to type property names is tiny, so why avoid it?

Are record types still on the table for C# 7.0? I notice your Person example doesn't use them. Those are by far the more interesting and useful feature to me.

chrisaut commented 8 years ago

I just don't understand what this gives us:

if (p is Person("Mickey", *)) // positional deconstruction

over specifying the property names, other than saving a few keystrokes, at the, IMO, huge cost of much less readability by introducing what just seems like compiler magic where every time I have to look up the type's ctor (or GetValues method) to understand what the code is doing. Why can this not be something like

if (p is Person { FirstName == "Mickey"}) 

or even clearer IMO

if (p is Person where FirstName == "Mickey") 

The object initializer trying to find matching ctor parameter names seems a bit strange too. Ok, getter only properties cannot be written to outside a ctor, but as far as I understand that is not really enforced at the clr level (eg. reflection can write to readonly fields), so why not just solve the problem this way (emit code that just does it anyways)? If that's not possible, why can we not change the clr (yes I know it's a high bar, but this is C#, the poster child language for the CLR, it seems we need to find all these workarounds instead of changing/evolving things if needed).

I guess I'm just not too hot on magic transformations that depends on argument positions and naming conventions for core language features.

PS: Sorry if this sounds like a bit of a rant, I'm sure you guys are working hard, and I love C# and most of the things going on with it.

ErikSchierboom commented 8 years ago

First, let me say that I love all three suggested features. However, as @chrisaut, I also feel this needs some polishing. The positional deconstruction is a bit hard to read IMHO. I feel that this is due to the fact that positional deconstruction is of course, based on the positions of the parameters. However, that is not something that I keep in my head most of the time. Why not make things more explicit by having syntax that refers to the properties themselves?

The example provided by @chrisaut that uses the where keyword is my favorite:

if (p is Person where FirstName == "Mickey") 

I feel that this better represents what is happening, namely that you first check if p is a Person and then subsequently check if the FirstName is equal to "Mickey". If we look at the positional deconstruction example (if (p is Person("Mickey", *))), this is far less apparent to me.

Another benefit of the where syntax is that it enables all types of boolean expressions to be used. For example, we could do this to match persons whose FirstName starts with an "M":

if (p is Person where FirstName.StartsWith("M"))

The positional destructuring doesn't naturally allow this as far as I can tell. It also doesn't easily allow and or or statements, which are also possible in the where syntax:

if (p is Person where FirstName.StartsWith("M") && FirstName.EndsWith("y"))

if (p is Person where FirstName == "Mickey" || FirstName == "Alice")

My final argument in favor of the where syntax (or something similar) is that to me it feels more C#-ish. While this is hard to define, I feel that the positional structuring is not explicit enough, whereas the where syntax kinda looks like the exception filtering meets LINQ.

dadhi commented 8 years ago

How is the proposed deconstruction different from this?

if ((p as Person)?.FirstName == "Mickey")

maloo commented 8 years ago

Please don't do positional deconstruction. C# has always been easy to read and reason about. Like defining ref/out. The "initializer syntax" is so much easier to understand and requires no extra magic. Or at least show an example where positional deconstruction would be a better option.

omariom commented 8 years ago

p is Person("Mickey", *)

That will generate a lot of traffic to StackOverflow :)

omariom commented 8 years ago

I like if (p is Person where FirstName == "Mickey") but it will confuse LINQ queries.

What about this?

Person p;
object obj;

if (p is Person && p.FirstName == "Mickey")

if (obj is Student s && s.FirstName == "Mickey")

ErikSchierboom commented 8 years ago

@omariom Not bad at all!

chrisaut commented 8 years ago

if (obj is Student s && s.FirstName == "Mickey")

@omariom I like this one the most, it's very C#ish and crystal clear what it does IMO.

BTW I don't think the where would confuse LINQ queries, where already is a contextual keyword (only a keyword if already inside an open Linq expression). I didn't even think about Linq when I proposed it, I thought of the where used in Generics (eg. Class<T> where T : struct)

jcdickinson commented 8 years ago

@omariom when do we get this && operator? It seems very useful!

On a serious note your expression syntax example is a specialization of #254, albeit an interesting and useful one.

HaloFour commented 8 years ago

@chrisaut

Such a syntax is already proposed (#206) and would work on any existing type:

if (obj is Person { FirstName is "Mickey" }) { ... }

In simpler cases you could combine the type pattern with other conditions:

if (obj is Person p && p.FirstName == "Mickey") { ... }

The positional construction, matching and deconstruction is a feature of records. The purpose of these additional proposals is to bring some of that functionality to existing types.

nirvinm commented 8 years ago

This seems to have more clarity.

if (obj is Person with FirstName == "Mickey") {
}

But this proposal doesn't bring much readability. new Obj() is better than a 'with' expression. Anyway please don't bring in special 'With' or other methods like Python does. It is awful.

alrz commented 8 years ago

@HaloFour I think you're referring to let syntax, because there is no such guard for the is operator and a logical AND would do.

HaloFour commented 8 years ago

@alrz Ah you're right, the proposal does specifically mention when as a part of switch, match and let. So yes, normal Boolean operators would apply instead. I'll update my comment.

alrz commented 8 years ago

I don't understand why people are freaking out over the syntax in its most basic form; have you guys ever heard of the word "pattern"? And suggestions involving where and with are all ambiguous. :confused:

orthoxerox commented 8 years ago

I agree with the rest of the peanut gallery that silent automatic decomposition based on constructors is a bad idea. I'd rather use property patterns or implement an extension method to deconstruct existing classes. New classes could get a primary constructor with restricted semantics for those cases when records aren't complex enough.

dsaf commented 8 years ago

The best way we can think of is to offer a virtual With method, as follows:

public abstract class Person { ... public abstract Person With(string firstName, string lastName); ... public void Person GetValues(out string firstName, out string lastName); ...

Shouldn't these be new operator declarations instead? I don't know how to explain it, but it feels wrong tying language constructs to type members without a more explicit contract. Somehow the IEnumerable<T> - select and Task<T> - async and IDisposable - using relationships are more obvious...

HaloFour commented 8 years ago

@dsaf

As operators or as extension methods they couldn't be virtual and it's largely necessary for them to be virtual so that derived types can correctly copy over the properties that aren't specified in the signature:

public class Person {
    public string FirstName { get; }
    public string LastName { get; }

    public Person(string firstName, string lastName) {
        this.FirstName = firstName;
        this.LastName = lastName;
    }

    public virtual Person With(string firstName, string lastName) {
        return new Person(firstName, lastName);
    }
}

public class Student : Person {
    public int Grade { get; }

    public Student(string firstName, string lastName, int grade) : base(firstName, lastName) {
        this.Grade = grade;
    }

    public override Student With(string firstName, string lastName) {
        return With(firstName, lastName, this.Grade);
    }

    public virtual Student With(string firstName, string lastName, int grade) {
        return new Student(firstName, lastName, grade);
    }
}

...

Person person = new Student("Foo", "Bar", 98);

Person person2 = person with { LastName = "Baz" };
Debug.Assert(person2 is Student { FirstName is "Foo", LastName is "Baz", Grade is 98 });

omariom commented 8 years ago

@chrisaut

I don't think the where would confuse LINQ queries, where already is a contextual keyword (only a keyword if already inside an open Linq expression).

I meant this case:

from p in persons
where p is Student s 
where s.FirstName == "Mickey"
select p;

Even if it doesn't confuse the compiler it will confuse me )

KathleenDollard commented 8 years ago

Could the team explain the perceived value of positional decomposition?

Initial knee-jerk response - For #$@#$ sake don't do that!

We can fine tune syntax, but just write that puppy out with our archetypal point example. Any two ints or bools, much less three or four. OMG!

The rest is coming along very nicely :)

dsaf commented 8 years ago

@HaloFour Understood, thanks!

JiriZidek commented 8 years ago

1) Object initializer for immutables - maybe optional parameters in the constructor (similarly to Attribute declarations) are enough? var p = new Person(LastName: "Taylor"); In general I guess that an implicit constructor taking each public property can be generated using the very same parameter names, so it would be enough just to indicate in the class declaration that such a constructor should exist - why not just the name of the class without parentheses? public Person; compiles as public Person(string FirstName, string LastName) { this.FirstName=FirstName; this.LastName=LastName; } and when these params are optional, it is no big step to allow an initializer to call such a constructor.

2) p is Person("Mickey",*) OR p is Person where FirstName == "Mickey" - we know that p IS Person, so why not just use p?.FirstName=="Mickey"? I do not see any benefit in deconstruction just for equality. Instead of this I would appreciate methods returning multiple values - like: var (n,x) = GetMinMax(int[] row). It solves some real pain...

3) p with Person("Minnie",*) - I would suggest a simpler syntax - maybe I just dislike the word "with" from VB: var r = p { FirstName="Minnie" };

That's my opinion. Jiri

gafter commented 8 years ago

This "summary" exposes a debate that we've been having in the LDM: whether or not we should use name-matching (across distinct APIs) to drive language features, and in particular to get positional behavior. This summary describes the situation with the assumption that we should. The other side is that we should not.

API patterns for a language feature facilitate two things:

  • Provide actual APIs to call at runtime when the language feature is used
  • Inform the compiler at compile time about how to generate code for the feature

That is not the traditional role of API patterns in C#. It has always been the former (API elements that the compiler invokes). I think it would not be advisable for us to have an "API pattern" where the compiler finds the API pattern and then does something other than invoke it. All of the difficulties of inferring the latter from the former can be avoided if we just don't do that.

Object initializers for immutable objects

Not sure why we're discussing this. It has been low on the list of things we're likely to consider for C# 7 for some time. Is it really that much of an advantage to allow people to type more characters for an alternative way to invoke a constructor, especially with all of the semantic issues that are thereby self-created?

Positional deconstruction

Ideally, a positional deconstruction would simply generate an access of each member whose value is obtained:

That is not ideal. Properties do not have "positions". The ideal is an API pattern that is invoked by the compiler.

HaloFour commented 8 years ago

@gafter

I tend to agree. If the compiler is going to allow record shorthand for non-record types I'd prefer it be through the same conventions established for record types. If a typical C# record generates a With method and/or a GetValues method which are used for "withers" or deconstruction respectively then I could see those features being enabled for CLR types that expose those same methods (or through extension methods in scope). Otherwise I don't think it's worth it.

alrz commented 8 years ago

Couldn't agree more :arrow_up: :+1:

just don't do that.

As for "object initializers for immutable objects" I don't think that it will be of any use in the presence of record types, considering that you still need to declare the constructor. Also, if the class requirements go beyond those of a record, it wouldn't make sense to provide the same syntax for deconstruction out of the box.

isaacabraham commented 8 years ago

Positional deconstruction for tuples, yes. For records / classes with named properties - no.

jakeswenson commented 8 years ago

I love all of this. Would you be able to use named arguments in positional deconstruction? (named deconstruction?)

p is Person(firstName: "Mickey", lastName: *)

HaloFour commented 8 years ago

@jakeswenson

Why wouldn't property patterns be sufficient for that purpose?

p is Person { FirstName is "Mickey" }

paulomorgado commented 8 years ago

@HaloFour, operators would indeed kill the chance of having virtual methods. But extension methods wouldn't. Pretty much like LINQ - operators can translate to either instance methods or extension methods.

@MadsTorgersen, @gafter, What's the use case of this?

public abstract Person With(string firstName = this.FirstName, string lastName = this.LastName);

Isn't this opening a can of worms?

Would this (or something like this) be valid:

if (p is Person(lastName: "Mouse", firstName: "Mickey")) { ... }
if (p is Person(firstName: "Mickey", lastName: *)) { ... }
if (p is Person(firstName: "Mickey")) { ... }
if (p is Person(FirstName: "Mickey", LastName: *)) { ... }
if (p is Person(FirstName: "Mickey")) { ... }

HaloFour commented 8 years ago

@paulomorgado

Extension methods have the exact same problem that operators do, the dispatch is determined at compile time and not at run time based on the actual type of the instance. In my example above the variable is Person, not Student, so any extension method it would resolve would be for this Person, not this Student.

paulomorgado commented 8 years ago

@HaloFour, you can put the extension methods in any extension class you want to and virtual methods would still be possible.

The exact same thing happens with LINQ. Not all enumerables are alike.

HaloFour commented 8 years ago

@paulomorgado You could, but that only works based on the type of the variable since that's the only thing that the compiler knows. This is very different from virtual methods where the dispatch is determined at runtime based on the exact type:

public static class PersonExtensions {
    public static string Test(this Person person) {
        return "Person";
    }

    public static string Test(this Student person) {
        return "Student";
    }
}
...
Person person = new Student();
// compiler resolves to extension method of Person
string result = person.Test();
Debug.Assert(result == "Person");

vs.

public class Person {
    public virtual string Test() {
        return "Person";
    }
}

public class Student : Person {
    public override string Test() {
        return "Student";
    }
}
...
Person person = new Student();
// compiler virtually calls Person.Test which is dispatched to Student.Test at runtime
string result = person.Test();
Debug.Assert(result == "Student");

If Person or an extension method for Person performs the "wither" the Grade property will most certainly be lost and the return type would be a Person, not a Student. That's why they would need to be virtual instance methods, and in the case of abstract types also abstract.

mythz commented 8 years ago

Positional deconstruction

I struggle to see how this feature justifies the complexity and new concepts it would add to C#. In just 1-line we're introducing is against a constructor, using a constructor without new, and the never-before-seen wildcard in C#:

if (p is Person("Mickey", *)) // positional deconstruction

What are all these new concepts saving us from writing, this?

if ((p as Person)?.FirstName == "Mickey") {
}

The positional form is even less readable than the version above, as you'll need to either carry the constructor definition in your head or routinely consult the class definition to find out what the condition is matching against.

Note: just because something is more terse doesn't make it more readable.

If we also wanted multiple conditions we add a static function and get readable, typed syntax today:

using static Test;
public class Test {
    public static bool match<T>(T o, Func<T, bool> predicate) => o != null && predicate(o);
}

if (match(p as Person, p => p.FirstName == "Mickey" && p.LastName == "Mouse")) {
}

With no added complexity cost to language or tooling. Tho would be nice if the compiler could optimize away the method call stack + lambda overhead for us :)

Object initializers for immutable objects

Whilst not as bad as positional deconstruction, I'm not seeing the benefits from immutable object initializers either:

var p = new Person { FirstName = "Mickey", LastName = "Mouse" }; // object initializer

From what I can tell it's just an alternative to:

var p = new Person(firstName:"Mickey", lastName:"Mouse");

var p = new Person("Mickey", "Mouse");

But worse, as it relies on non-equivalent name matching, reminiscent of how Java bean properties map to getter/setter methods by naming convention - another ugly corner of the language.

The main friction I have with immutable objects are that they're harder to genericize using reflection, (e.g. serializers/mappers), which this feature isn't helping with.

With-expressions

Not in love with the syntax but fills a need that would require a bit of machinery without language support. My preference would probably be to go with a syntax operator variety.

p { FirstName = "Minney" }
p + { FirstName = "Minney" }
p .+ { FirstName = "Minney" }
p <- { FirstName = "Minney" }

Language Complexity

Whilst on the topic of adding unnecessary language features using wildcards, I'll leave a reminder of the dangers of adding language features, with the benefit of hindsight on adding wildcards to Java generics:

“I am completely and totally humbled. Laid low. I realize now that I am simply not smart at all. I made the mistake of thinking that I could understand generics. I simply cannot. I just can't. This is really depressing. It is the first time that I've ever not been able to understand something related to computers, in any domain, anywhere, period.”

a feature which led Joshua Bloch (one of the designers of Java platform) to conclude:

"We simply cannot afford another wildcards"

MadsTorgersen commented 8 years ago

@mythz To be clear, these are features that are in support of pattern matching, and of immutable objects that we expect to have more and more of with the help of the records feature. I realize I posted them a bit out of context here, and I should have made that more clear. Over the coming days there'll be notes on those other features, to help evaluate in context.

You're seeing the sausage get made. Your feedback and that of others helps us make the right decisions in the end - just as in C# 6.

Funny you bring up Wildcards. I bear my part of the responsibility for that very feature, though in my defense I was an academic at the time. ;-) They are a perfect example of optimizing for expressiveness instead of usability.

HaloFour commented 8 years ago

@mythz

The wildcards proposed for pattern matching are nothing like the wildcards in Java, the latter involving generics and variance. And even Java's wildcards aren't that bad, they express the variance in a pretty clever way and because you can express them at the variable and parameter level you have much more freedom than you do in .NET. Java's real problem is that the generic type inference is backwards (compared to C#) which makes generic resolution in general a massive string of four-letter-words and the compiler error messages are ridiculously cryptic.

gafter commented 8 years ago

a feature which led Joshua Bloch (one of the designers of Java platform) to conclude: "We simply cannot afford another wildcards"

Josh's comments were regarding the addition of closures/lambda expressions to Java, which he opposed starting in 2006-2007 when I proposed them (he continued to oppose them even as a member of the JSR, preferring something based on inner classes). But lambdas seem to have worked out brilliantly for Java.

As it turns out, both @MadsTorgersen and I (@gafter) can claim some responsibility for Java's wildcards, with me as the compiler engineer at Sun driving the language implementation (for all of the Java 3, 4, and 5 changes) and Mads leading a team of academics assisting with the specification and implementation. We would have liked declaration-site variance, but driven by Josh Bloch's requirement to retrofit generics onto his existing collection APIs without rewriting any of those APIs, we were able to find no other language solution. It would have been better for Java to ignore those requirements and follow C#'s model of declaration-site variance (in and out on type parameters of interfaces), which would have resulted in a new set of generic collection APIs, as it did for C#. Josh was well aware of the shape of generics at every step of the way, and played no small part in their design. Of course, hindsight is 20/20.

As has been pointed out, these "wildcards" have nothing to do with the kind of wildcards that appear in Java's type system, and they are not part of the type system in any sense.

gafter commented 8 years ago

@HaloFour wrote

Java's real problem is that the generic type inference is backwards (compared to C#) which makes generic resolution in general a massive string of four-letter-words and the compiler error messages are ridiculously cryptic.

You're welcome.

We won't do that.

paulomorgado commented 8 years ago

@HaloFour, using extension methods does not forbid the use of virtual methods. Again: LINQ.

Inside the extension method you can also have logic to handle the correct types at runtime:

public static class PersonExtensions 
{
    public static string Test(this Person person)
    {
        if (person is Student)
        {
            return "Student";
        }

        return "Person";
    }
}

(I'm not going into pattern matching here. Just using what we can get our hands on now).

If you are into it, you can always go with late binding:

public static class PersonExtensions 
{
    public static string Test(this Person person)
    {
        return TestInternal(person);
    }

    private static string TestInternal(dynamic person)
    {
        return TestInternalImpl(person);
    }

    private static string TestInternalImpl(Student person)
    {
        return "Student";
    }

    private static string TestInternalImpl(object person)
    {
        return "Person";
    }
}

Come to think of it, you might do the same with an operator. But I think the method invocation (which gives you the option of being an instance or extension method) is more versatile.

DavidArno commented 8 years ago

Of the three features, the With-expressions are the obvious biggest gain, as they simplify the creation of new immutable references/values. However, regarding,

... there's a hitch: if the runtime source object is actually of a derived type with more properties than are known from its static type, ...

the ideas behind With methods seem highly inelegant. Another option would simply be to restrict this feature to sealed types. Whilst this would limit its use in existing types, it would fit well with records (assuming they will be syntactic sugar over structs and sealed classes) and would remove the whole issue of derived types.

I really like the idea of extending object initialisers to allow constructor parameters to be specified with the same syntax as properties. However, rather than ideas around ignoring case in this instance, a simple style convention change could suffice: parameter names for immutable types should match the read-only properties exactly for both this feature and with-expressions to work. This could also then address @mythz's comments about immutable types and serialization.

The positional deconstruction feature makes most sense for tuples. For records and traditional types, less so. As others have commented, this could lead to less readable code all too often.

DavidArno commented 8 years ago

I'm pleased to read @MadsTorgersen's comments that more design notes, around other language features, are coming soon. It will help to put the ideas in this note into perspective. Plus, as many of us are keen to know what's happening with tuples, discriminated unions, records and pattern matching, this will help. I'm particularly concerned by my reading of @gafter's remark in #9375

Related features possibly to be added at the same time as pattern-matching:

5154 expression-based switch ("match")

Which I read as pattern-matching expressions are still only being considered, whereas expression statements (the bloated switch statement) will definitely happen. Hopefully, this is a misreading of his words, but other design notes should clarify that.

HaloFour commented 8 years ago

@paulomorgado

Neither of those strategies is a virtual method. They are type-based dispatch, and both are completely locked to the types explicitly built into the implementation. It would be impossible to add derived types in other assemblies and to "override" those extension methods, regardless of the dispatch strategy.

This is a problem with "wither" methods and non-sealed types in general. It would be way too easy to accidentally call a base version which ends up "decapitating" the result to an instance of the base class.

https://github.com/dotnet/roslyn/issues/5172#issuecomment-147881628
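A minimal sketch of the "decapitation" problem described above, assuming a hand-written, non-virtual wither. The `Person` and `Student` shapes and the `WithName` and `Rename` names are hypothetical and are not taken from the proposal.

```csharp
// Hypothetical types and a hand-written "wither"; the names are illustrative only.
public class Person
{
    public string Name { get; }
    public Person(string name) { Name = name; }

    // Non-virtual wither: always builds a Person, whatever the runtime type is.
    public Person WithName(string name) => new Person(name);
}

public class Student : Person
{
    public string School { get; }
    public Student(string name, string school) : base(name) { School = school; }
}

public static class WitherDemo
{
    // The static type is Person, so the base wither is the one that gets called.
    public static Person Rename(Person p, string name) => p.WithName(name);
}
```

Passing a `Student` to `Rename` yields a plain `Person`, and the `School` value is silently lost. Making `WithName` virtual and overriding it in `Student` avoids that for types that cooperate, but an extension method gives a derived type in another assembly nothing to override.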

lachbaer commented 8 years ago

Question: will the immutable objects also be possible with structs? (a bit like Java's Valhalla project?)

MadsTorgersen commented 8 years ago

@gafter and others: I did not intend for this summary to imply that we agree to do this with name matching. That is an open debate. For both with-expressions and positional deconstruction I listed the alternatives we discussed. I apologize if you feel I didn't give fair treatment to that side.

I realized after posting (and upon seeing some of the negative comments on using name matching) that I did not give due space to an option we discussed earlier, involving variants of the builder pattern. I will write that up shortly.

MadsTorgersen commented 8 years ago

I wrote up #9411 as an alternative way to implement these patterns, based on using tuples and the builder pattern. I wrote it as a proposal rather than design notes, because we haven't had the same degree of discussion of it on the language design team yet.

Would love your input.

MadsTorgersen commented 8 years ago

@DavidArno Nothing is set in stone. Our current plan of record is that pattern matching will be integrated into existing language constructs (is-expressions and switch statements) as well as a new expression form which we call a match expression.

As we experiment with things, those plans may change. We may choose to add patterns in more places (for instance, let-statements have been proposed), or we may trickle them in over multiple releases (e.g. adding match expressions later).

The way this stuff works is that we're in design mode till we ship. We ship things we have high confidence in, and drop things we don't, sometimes last minute. We publish our design notes so that we can get feedback along the way. This means that you get to see a bunch of intermediate states of design; a lot of decisions that end up being reversed later: you see the sausage getting made. Hopefully that's entertaining and useful. :-)

Make sense?

Mads

CyrusNajmabadi commented 8 years ago

What are all these new concepts saving us from writing, this?

@mythz I think in the small examples that people have been providing, there is only marginal value. The real benefit comes when you need to do much more "matching" in a single expression. A canonical example of that would be where you're trying to actually match a complex tree structure. Patterns already make this cleaner. Furthermore, if we want to both match and be able to then use the sub-pieces that have been matched, then patterns are a huge boon.

Take a red-black tree as an example, and let's say you're trying to match against this form:

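[image: a black node x with left subtree a and a red right child y, which has left subtree b and a red right child z with subtrees c and d] which we want to transform to: [image: a red node y with black children x (subtrees a, b) and z (subtrees c, d)]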

Trying to do so would be pretty unpleasant in an imperative fashion. But with a pattern i can write something like so:

tree is Node {
  Color is Black,
  Val is var x,
  Left is var a,
  Right is Node {
    Color is Red,
    Val is var y,
    Left is var b,
    Right is Node {
      Color is Red,
      Val is var z,
      Left is var c,
      Right is var d
    }
  }
}

If i had to do all that decomposition myself it would be extremely tedious and onerous.

Now, i personally even find the above to be rather heavyweight. If i'm in my own domain, and i know that a node is a quad of a "color, value, left node, right node" then i might even prefer just stating this more succinctly as:

tree is Node(Black, var x,
    var a,
    Node(Red, var y,
        var b,
        Node(Red, var z, var c, var d)))

I can now easily see what shape this matches against. Specifically a "black" top node, with a right Red child which itself has a right Red child. The constituent pieces i need are now captured in variables (with the right types) that i can now use. It's super clean and much easier to read than the existing idiomatic C#.

In this particular example, once i matched my pattern i would actually want to create a new node like so: new Node(Red, y, new Node(Black, x, a, b), new Node(Black, z, c, d))

Having to write this sort of code imperatively in C# is much worse. I would have to instead write:

if ((tree as Node)?.Color == Black &&
    (((Node)tree).Right as Node)?.Color == Red &&
    (((Node)((Node)tree).Right).Right as Node)?.Color == Red) {
    var topNode = (Node)tree;
    var rightChild = (Node)topNode.Right;
    var rightGrandChild = (Node)rightChild.Right;
    return new Node(Red, rightChild.Val, new Node(Black, topNode.Val, topNode.Left, rightChild.Left), new Node(Black, rightGrandChild.Val, rightGrandChild.Left, rightGrandChild.Right));
}

Or, if i wanted to avoid all those casts in the expression, i'd have to do things like:

if ((tree as Node)?.Color == Black) {
  var topNode = (Node)tree;
  if ((topNode.Right as Node)?.Color == Red) {
    var rightChild = (Node)topNode.Right;
    if ((rightChild.Right as Node)?.Color == Red) {
       var rightGrandChild = (Node)rightChild.Right;
       // ... build the new node from topNode, rightChild and rightGrandChild
    }
  }
}
And all of that code is just for one case that i'm matching against. In practice there would be very many more.

And thank goodness i at least have ?. for C# 6.0 to help me out here. Having to write this without that feature is even more painful :)
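For comparison, here is a runnable sketch of the single case above using the positional patterns that later shipped in C# 8, which rely on a `Deconstruct` method. The `Node` type, its `int` value, and the `Rebalance.RotateRightRightCase` name are assumptions made for illustration; the thread does not define them, and the syntax under discussion here is not guaranteed to match.

```csharp
public enum Color { Red, Black }

// Hypothetical Node type for illustration; the thread does not define one.
public sealed class Node
{
    public Color Color { get; }
    public int Val { get; }
    public Node Left { get; }
    public Node Right { get; }

    public Node(Color color, int val, Node left, Node right)
    {
        Color = color; Val = val; Left = left; Right = right;
    }

    // Enables positional patterns such as `tree is Node(Color.Black, var x, ...)`.
    public void Deconstruct(out Color color, out int val, out Node left, out Node right)
    {
        color = Color; val = Val; left = Left; right = Right;
    }
}

public static class Rebalance
{
    // Handles only the one shape discussed above:
    // black root, red right child, red right grandchild.
    public static Node RotateRightRightCase(Node tree)
    {
        if (tree is Node(Color.Black, var x,
                var a,
                Node(Color.Red, var y,
                    var b,
                    Node(Color.Red, var z, var c, var d))))
        {
            return new Node(Color.Red, y,
                new Node(Color.Black, x, a, b),
                new Node(Color.Black, z, c, d));
        }
        return tree; // shape did not match; leave the tree unchanged
    }
}
```

Note that `var` patterns also match `null`, so missing subtrees are captured as `null`, while a nested `Node(...)` pattern fails on `null` and the whole match falls through.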

CyrusNajmabadi commented 8 years ago

@MgSam > With Autocomplete, the cost of having to type property names is tiny, so why avoid it?

I'm not certain i personally agree with that. Take the possible C# 7.0 examples i listed above. Having to write the name increases the size of the pattern from 84 to 191 chars. That's more than twice the size, and makes it nearly impossible to fit cleanly in a single line (or even on a few lines).

Now, i am not arguing that positional matching makes sense in all cases. Nor would i ever insist that you must use, or that you should not use named pattern matching.

However, positional matching can be very valuable in certain pieces of code as determined by the author of that code. If this is my code, and i know what the shape of a node is, then making me specify all the names makes the code very heavyweight compared to the version where i eschew it.

I'd like to make what i think is an appropriate analogy here. Deconstruction is the reverse of construction. When i construct something, there is no requirement that i provide names for things. For example, today, in my red-black library, i would definitely write:

new Node(Red, someVal, oneNode, otherNode)

In that code i never specify the names of those parameters. Previously you mentioned:

It makes the code harder to read and understand without using Intellisense

And that's true. But we still live with it because having to type this

new Node(color: Red, value: someVal, left: oneNode, right: otherNode)

at every construction site would be excessive. We certainly allow people to be that expressive when constructing things. But we do not require it, despite that meaning there is less information in the code for you to tell what's going on.

As deconstruction is the reverse of construction, i feel the same holds true. If you want to deconstruct using names, by all means do so! If you feel it makes your code cleaner and easier to read, then by all means go right ahead.

However, like with construction, if you feel like such explicitness is unnecessary and heavyweight, you can also avoid it if you want. in other words, feel free to write:

x is Node { Color is Red, Val is var val, Left is var left, Right is var right } or x is Node(Red, var val, var left, var right)

Whichever feels best to you given what you're doing. Personally i think the latter form is much nicer for this domain. But i wouldn't insist that everyone would have to use it for all domains.

mythz commented 8 years ago

@CyrusNajmabadi Not sure if there is a "compiler writer bias" with this feature, but I can't recall many times in the last decade where I've thought there was enough friction that would justify needing and learning a new "foreign" language feature for it.

The complex match expression looks fragile (and IMO not particularly readable) where I'd expect it would be hard to work out why an arbitrary data graph doesn't match a complex nested match expression. I'd also not expect debugging or tooling/IntelliSense to be able to provide as much assistance as it could with multiple statements.

IMO the minimum syntactical complexity we could add to make this more pleasant would be to use an alias (mentioned earlier in this thread), so instead of:

if ((tree as Node)?.Color == Black) {
  var topNode = (Node)tree;
  if ((topNode.Right as Node)?.Color == Red) {
    var rightChild = (Node)topNode.Right;
    if ((rightChild.Right as Node)?.Color == Red) {
       var rightGrandChild = (Node)rightChild.Right;
       // ... build the new node from topNode, rightChild and rightGrandChild
    }
  }
}

We'd be able to do:

if (tree as Node top && top.Color == Black && top.Right as Node right && right.Color == Red)
{
    if (right.Right as Node rightChild && rightChild.Color == Red) {
        if (rightChild.Right as Node rightGrandChild) { ... }
    }
}

Which I believe is a "more natural fit" for C# that would also be better to benefit from existing tooling. I also believe this is more readable as it more closely matches how the written code would be interpreted in our brain as we're skimming through the source code.
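A form close to this alias idea later shipped in C# 7 as the `is` type pattern, spelled `x is T name` rather than `x as T name`. The sketch below assumes a hypothetical `Node` whose children are typed as `object`, so that the type tests are meaningful; `AliasDemo.IsBlackRedRed` is an illustrative name only.

```csharp
public enum Color { Red, Black }

// Hypothetical Node type for illustration; Left and Right are typed as object
// so that the type tests below are meaningful.
public sealed class Node
{
    public Color Color { get; set; }
    public object Left { get; set; }
    public object Right { get; set; }
}

public static class AliasDemo
{
    public static bool IsBlackRedRed(object tree)
    {
        // `x is T name` tests the type and introduces `name` in one step.
        return tree is Node top && top.Color == Color.Black
            && top.Right is Node right && right.Color == Color.Red
            && right.Right is Node rightChild && rightChild.Color == Color.Red;
    }
}
```

Each variable introduced by `is` is definitely assigned on the right-hand side of the `&&` that follows it, which is what lets the checks chain without separate casts.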

CyrusNajmabadi commented 8 years ago

but I can't recall many times in the last decade where I've thought there was enough friction that would justify needing a new "foreign" language feature.

There was never much friction with null checking either. And yet, the addition of ?. is still wonderful for just smoothing out all those little bumps :)

The complex match expression looks fragile

Really? Why?

We'd be able to do ... Which I believe is a "more natural fit" for C#

That approach is nearly 3x longer than the simple positional form. While i don't believe language features should be decided solely based on their terseness, i do believe that we should strive for simpler ways to express code that does require so much cruft to be built up. Consider the simple case of "anonymous methods" and "lambdas". We could have lived with anonymous methods. After all, it was just a few more characters over a lambda. However, at the end of the day, those extra characters just end up smeared all over your code (an effect i call "peanut butter syntax" :)).

In this case, i find your code much harder to match (no pun intended) against the actual tree i care about. And i think i can qualify why this is. In order to understand your expression, i have to both read left to right and simultaneously see how names are introduced when later checks are done. In other words, you need to introduce named temporaries solely for matching even though they're never specified in the image, nor are they needed at all later on. This itself adds cognitive burden. In the pattern approach i've given i only name things that i actually want to use later on (namely, "a, b, c, d, x, y, z"). Nothing else needs a name, needs to be put in scope, or needs to be understood.

That allows my code to far more closely match the data i'm trying to work against. In other words, like linq, i get to think about the "what" not the "how". I get to simply express what i'd like to match against, and what i need to use. I very much see patterns as an analog to linq. With linq i got to move away from the "how" of working with collections. The "how" of iteration and temporaries, and building up results. "Patterns" are the same for me. I get to finally move away from the "how" of checking if something fits what i'm looking for and the "how" of pulling out all the state i need. And i can think about "what" i'm doing.

Finally, wrt positional deconstruction, consider the analogy i made with construction. All the arguments made against positional deconstruction could have been made against positional construction as well. Would we prefer C# if it mandated that you write parameter names every time you construct an object? IMO, no. It would be paying a cost over every line of code you write that would provide value in only some places.

I'm a big believer in giving the language tools to enable both. I don't think either of named or positional is superior to the other. I think there are places where both work great. But i do think when you only have one and not the other, you lose out big on having the tools to do the right thing for any piece of code you may end up writing.