dotnet / csharplang

The official repo for the design of the C# programming language

Champion "and, or, and not patterns" (VS 16.8, .NET 5) #1350

Open gafter opened 6 years ago

gafter commented 6 years ago

The idea is to add three new pattern forms

  1. pattern and pattern
  2. pattern or pattern
  3. not pattern

The latter two would not be permitted to define pattern variables in the subpatterns (as they would never be definitely assigned).

We'd have to decide on syntax and precedence, and whether some mechanism such as parentheses could be used for grouping patterns to override precedence.

Examples:

switch (o)
{
    case 1 or 2:
    case Point(0, 0) or null:
    case Point(var x, var y) and var p:
}

alrz commented 6 years ago

The latter two would not be permitted to define pattern variables in the subpatterns

For or patterns, as mentioned in https://github.com/dotnet/csharplang/issues/118 we can allow "identical types" for overlapping pattern variables,

switch (o)
{
    // x must be of the same type in both patterns

    case Point(0, var x):
    case Point(var x, 0):
        break;

    case Point(0, var x) or Point(var x, 0):
        break;
}

Alternatively we could use the "most specific common type" for overlapping variables,

switch (o) {
  case A x:
  case B x:
    CommonBase c = x;
    break;
}

but just identical types would still be useful most of the time, as demonstrated by F#.

HaloFour commented 6 years ago

I'm with @alrz, I think that allowing the operands of the or patterns to introduce pattern variables would be a powerful feature with the requirement that the pattern variables must have the same type and must be assigned by both sides.

bondsbw commented 6 years ago

An alternative to "most specific common type" is the union type A | B.

MgSam commented 6 years ago

I vote for using &&, ||, ! and not introducing yet more operators that do almost the exact same thing.

HaloFour commented 6 years ago

@MgSam

IIRC that would introduce ambiguities when patterns are used as Boolean expressions.

https://github.com/dotnet/roslyn/issues/6235

ufcpp commented 6 years ago

Are intersection and union types too difficult to implement?

yaakov-h commented 6 years ago

Since the first two join two separate patterns, am I correct in understanding that the following would be valid?:

if (o is IDictionary d and IEnumerable e) {
    // object.ReferenceEquals(d, e) == true
    // (assuming no boxing)
}


if (o is IDictionary and IEnumerable e) {
    // e is IEnumerable only, there is no IDictionary variable created
}

gafter commented 6 years ago

@ufcpp How would intersection and union types help with case 1 or 2:?

gafter commented 6 years ago

@yaakov-h Regarding your first example, yes. The second example doesn't make sense as IDictionary is not a pattern.
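
For reference, a minimal sketch of how both examples behave with the and syntax as it eventually shipped in C# 9 (the release named in this issue's title). The Describe helper and the choice of ICollection are hypothetical, for illustration only. The bare-type form in the second test relies on C# 9 type patterns, which did not exist when the comment above was written.

```csharp
using System;
using System.Collections;

static class AndPatternSketch
{
    // Hypothetical helper, for illustration only.
    public static string Describe(object o)
    {
        // Both declaration patterns test the same object,
        // so d and e refer to the same instance.
        if (o is IDictionary d and IEnumerable e)
            return object.ReferenceEquals(d, e) ? "same instance" : "different instances";

        // C# 9 type pattern on the left: no variable is created for ICollection.
        if (o is ICollection and IEnumerable e2)
            return "collection";

        return "no match";
    }

    static void Main()
    {
        Console.WriteLine(Describe(new Hashtable()));
        Console.WriteLine(Describe(new ArrayList()));
        Console.WriteLine(Describe(42));
    }
}
```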

Unknown6656 commented 6 years ago

Also consider implementing xor patterns (or ^ if one does not like the word xor). It would of course -as with or- not be permitted to define pattern variables.

jnm2 commented 6 years ago

Also consider implementing xor patterns

Why?

MgSam commented 6 years ago

@HaloFour What's better? A feature that's ambiguous for the developer or one that's ambiguous for the compiler?

If they can't implement it with the existing operators, it's not worth doing. This will just cause endless confusion if added like this. You'll get people trying to write:

if(condition and condition || condition)

More generally, C# team, please stop adding brand new keywords for minor use cases. The in disaster should be enough of a lesson for the reason why.

gafter commented 6 years ago

@MgSam What's worse is one that is ambiguous for both the developer and the compiler.

bondsbw commented 6 years ago

@MgSam I wouldn't call patterns a "minor use case".

if(condition and condition || condition)

This is not a pattern, or anything else that is or would be accepted by the compiler.

gafter commented 6 years ago

@MgSam Your suggestion gives two different contradictory interpretations for this code, one of which is accepted today.

if (myBool is true || false)
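
To make the two readings concrete, here is a small sketch assuming the or keyword as it later shipped in C# 9. The method names are made up for illustration. The compiler may warn that the second pattern always matches.

```csharp
using System;

static class IsPrecedenceSketch
{
    // Existing meaning: `is` binds tighter than `||`,
    // so this parses as (b is true) || false.
    public static bool WithOperator(bool b) => b is true || false;

    // Pattern combinator: b is matched against the pattern `true or false`,
    // which every bool value satisfies.
    public static bool WithPattern(bool b) => b is true or false;

    static void Main()
    {
        foreach (var b in new[] { true, false })
            Console.WriteLine($"{b}: operator={WithOperator(b)}, pattern={WithPattern(b)}");
    }
}
```

The two forms disagree for false, which is why reusing || for pattern composition would change the meaning of code that compiles today.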
Unknown6656 commented 6 years ago

@jnm2

Why?

~First to be consistent with or and and. Second: You often will find a case where you are semantically asking, whether an object X is either A or B, e.g. when checking for numeric non-continuous ranges, discriminated unions. ...~

Forget it - xor is a bad idea.

Richiban commented 6 years ago

With and-patterns and or-patterns we will finally have a way of comparing a variable against a given list of multiple values, just as every newbie programmer wants to do in any language :grin:


if(x is 1 or 2 or 3)
{
    // Hooray!
}

alrz commented 6 years ago

what if we add this in the LHS of the is operator 🤔

if (x and y is not null)

GeraudFabien commented 6 years ago

@Richiban A newbie can already do this with an IEnumerable (here I use an array): (new[] { 1, 2, 3 }).Contains(x)

bondsbw commented 6 years ago

@alrz

if (x and y are not null)

FTFY. :wink:

jnm2 commented 6 years ago
if (neither x nor y are null)

😁

Thaina commented 6 years ago

This is too funny please staph

Unknown6656 commented 6 years ago

@jnm2: Looks suspiciously like VB.NET...... (.__.)

If Neither x Nor y Are Nothing Then
    ' Just kiddin' - although it would be funny to see
    ' whether the VB compiler accepts Shakespeare's works as valid code
End If

jnm2 commented 6 years ago

(x, y) = default; still doesn't seem to work. 😢

alrz commented 6 years ago

@jnm2 We'll have to wait for 8.0 (https://github.com/dotnet/csharplang/issues/1394)

geniusFunk commented 6 years ago

All of this can be done with some creativity. Newbs should be learning how to do things like this with what the language already has.

jskeet commented 6 years ago

As no-one's mentioned this yet: this is particularly important for switch expressions. Consider this existing code:

public int Foo(int input)
{
    switch (input)
    {
        case 0:
        case 1:
            return SomeMethod1() + SomeMethod2();
        case 2:
            return -1;
        default:
            return input;
    }
}

This looks like it should be refactored to an expression-bodied method with a switch expression, but until the feature of this issue is implemented, that can't be cleanly done without either a guard clause or code duplication:

// Duplication approach
public int Foo(int input) => input switch
    {
        0 => SomeMethod1() + SomeMethod2(),
        1 => SomeMethod1() + SomeMethod2(),
        2 => -1,
        _ => input
    };

// Guard clause approach (which I suspect will be less efficient)
public int Foo(int input) => input switch
    {
        _ when input == 0 || input == 1 => SomeMethod1() + SomeMethod2(),
        2 => -1,
        _ => input
    };

If this feature is adopted, or something similar, we get the much more pleasant:

public int Foo(int input) => input switch
    {
        0 or 1 => SomeMethod1() + SomeMethod2(),
        2 => -1,
        _ => input
    };

Neme12 commented 6 years ago

It looks like this was briefly discussed in LDM: https://github.com/dotnet/csharplang/blob/master/meetings/2018/LDM-2018-03-19.md#and-or-and-not-patterns-1350

If this is outside the scope for C# 8.0, will we get the in pattern instead or some other way of matching multiple patterns inside a switch expression? If not, that could be a significant limitation forcing us to keep using the switch statement in many cases.

Truerror commented 6 years ago

Is this really important? What are the benefits offered by this compared to the standard operators (||, &&, and !)?

jskeet commented 6 years ago

@Lifeburner: Those operators don't work to compose patterns, and couldn't do so without more work, as per Neal's if (myBool is true || false) example. Fundamentally this is composing patterns rather than values, which I think is an important distinction.

Neme12 commented 6 years ago

@Lifeburner I don't understand. How would you replace something like

x switch
{
    1 or 2 => "hello",
    _ => "world"
}

with ||?

Neme12 commented 6 years ago

Are you suggesting or should just be ||? That would not only be ambiguous with respect to syntax, but there's also an important distinction in the semantics.

switch (false)
{
    case true or false:
        // Yay, this executes because false is indeed true or false!
        break;
}

switch (false)
{
    case true || false:
        // Nope! false does not equal true || false (which evaluates to true)
        break;
}

Neme12 commented 6 years ago

What about the ambiguity with not: not identifier is currently a declaration pattern with not as the type. This would have to be semantically disambiguated, which is a little unfortunate because we'd like to have the distinction in the syntax. Or would it be OK to break compat here?

Neme12 commented 6 years ago

There would be similar issues with recursive patterns: not { Property: 0 } is this a not pattern with a recursive pattern inside? Or a recursive pattern with type not?

gafter commented 6 years ago

@Neme12 The compiler can detect the presence of a type named not and either report an error or adjust its behavior.

Neme12 commented 6 years ago

@gafter Yes but this cannot happen during parsing I assume? So not identifier would need to have the same syntax as a declaration pattern (in order to preserve compatibility), which will be hard to deal with in many cases.

gafter commented 6 years ago

@Neme12 Yes, it would be a breaking change for programs that use not as a type name in a pattern. We take small breaking changes frequently, and this one doesn't sound particularly bad.

AustinBryan commented 6 years ago

if (neither x nor y are null, yet z is 4 and b is a string, however b is still null)
{
}

Much better 😂

AustinBryan commented 6 years ago

In all seriousness though, I think the new keywords are worth the breaking change, because how many people are really using those words as names anyway? And I'd rather the compiler be able to tell which or means Boolean or and which one means pattern-matching or, without having to use parentheses that could make things look more complicated and ugly, and hide what's supposed to be going on.

And programming languages haven't cared too much about ambiguity for the developer in the past. There's = and ==, and some languages ===. I got wicked confused over & and && when I first started. Also, if you're worried about more keywords, you could do more operators:

switch (o)
{
    case 1 ||| 2:
    case Point(0, 0) ||| null:
    case Point(var x, var y) &&& var p:
}

Though the new keywords are cleaner.

jnm2 commented 6 years ago

Will parentheses be able to be used on the patterns?

if (x is not (Foo _ or Bar _)) { }

if (x is (not Foo _) or Bar _) { }

gafter commented 6 years ago

@jnm2 We'd have to decide whether some mechanism such as parentheses could be used for grouping patterns to override precedence.
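
For readers arriving later: as the feature eventually shipped in C# 9, parentheses are allowed around patterns, and not binds tighter than and, which binds tighter than or. A minimal sketch of the difference, using placeholder Foo and Bar classes and hypothetical method names (it also uses C# 9 type patterns, so no discards are needed):

```csharp
using System;

class Foo { }
class Bar { }

static class GroupingSketch
{
    // Parenthesized: matches anything that is neither a Foo nor a Bar.
    public static bool NeitherFooNorBar(object x) => x is not (Foo or Bar);

    // Unparenthesized: `not` binds tighter than `or`,
    // so this is read as (not Foo) or Bar.
    public static bool NotFooOrBar(object x) => x is not Foo or Bar;

    static void Main()
    {
        object[] samples = { new Foo(), new Bar(), "text" };
        foreach (var x in samples)
            Console.WriteLine($"{x.GetType().Name}: {NeitherFooNorBar(x)} / {NotFooOrBar(x)}");
    }
}
```

The two forms differ for a Bar instance: the parenthesized pattern rejects it, while the unparenthesized one accepts it.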

alrz commented 6 years ago

This has been mentioned elsewhere, just bringing this up in this relevant thread.

Since identifiers are resolved as constant patterns except for the top-level pattern in the is operator, with the introduction of these patterns we'll end up with some papercuts,

if (x is Foo)
if (x is not Foo _) 

if (x is Foo)
if (x is Foo _ or Bar _)

Could we say we fallback to a type if the constant lookup failed?

if (x is Foo or Bar)
if (x is not Foo)

We can already parse types as expressions (e.g. in nameof) so it seems like the parser could handle this just fine.

gafter commented 6 years ago

@alrz Seems plausible.

yaakov-h commented 6 years ago

class C
{
    bool M(object o)
    {
        if (o is not Foo)
            return true;
        else
            return false;
    }
}

class not
{
}

What would happen here? Does not become contextual?

alrz commented 6 years ago

@yaakov-h

Note that the ambiguity exists even with current rules, for example if there's a constant Foo in scope. There's a way around it just like how we handle nameof (parse as-is, bind as a declaration pattern); however, if we want to also preserve the shape of the syntax tree at this point, it gets a little more involved and I don't think it would be worth it.

ericsampson commented 5 years ago

There are many existing issues that could be closed if this is implemented; would it be possible to start cleaning them up somehow? I can help if that would be useful.

gafter commented 5 years ago

There are many existing issues that could be closed if this is implemented; would it be possible to start cleaning them up somehow? I can help if that would be useful.

Those issues could not be closed if this is not implemented. So... since the future is unclear, how are you suggesting we proceed?

ericsampson commented 5 years ago

There are many existing issues that could be closed if this is implemented; would it be possible to start cleaning them up somehow? I can help if that would be useful.

Those issues could not be closed if this is not implemented. So... since the future is unclear, how are you suggesting we proceed?

Is there a way to link them as 'would-be-resolved-by' or 'superseded by' this proposal? IIRC Bugzilla has functionality like this to help dedup issues, but I'm not sure if you folks are using this kind of tag/linkage. Thanks!

Thaina commented 5 years ago

Maybe we should change the syntax for not into not(TYPE), like typeof?

HaloFour commented 5 years ago

@Thaina

Why? That would unnecessarily limit the not pattern to type switch. There are many kinds of patterns and this would allow negating any of them.
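
A short sketch of that point, assuming the not pattern as it later shipped in C# 9 (relational patterns such as >= 1 are also C# 9). The helper names are hypothetical. None of these negations involves a type test.

```csharp
using System;

static class NotPatternSketch
{
    // Negating a constant pattern.
    public static bool IsNotNull(object o) => o is not null;

    // Negating a combination of a constant pattern and a property pattern.
    public static bool HasText(string s) => s is not (null or { Length: 0 });

    // Negating a range built from relational patterns.
    public static bool IsOutsideOneToTen(int n) => n is not (>= 1 and <= 10);

    static void Main()
    {
        Console.WriteLine(IsNotNull("x"));
        Console.WriteLine(HasText(""));
        Console.WriteLine(IsOutsideOneToTen(11));
    }
}
```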