dotnet / csharplang

The official repo for the design of the C# programming language

Champion "and, or, and not patterns" (VS 16.8, .NET 5) #1350

Open gafter opened 6 years ago

gafter commented 6 years ago

The idea is to add three new pattern forms

  1. pattern and pattern
  2. pattern or pattern
  3. not pattern

The latter two would not be permitted to define pattern variables in the subpatterns (as they would never be definitely assigned).

We'd have to decide on syntax and precedence, and whether some mechanism such as parentheses could be used for grouping patterns to override precedence.

Examples:

switch (o)
{
    case 1 or 2:
    case Point(0, 0) or null:
    case Point(var x, var y) and var p:
}
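For reference, a minimal sketch of how the three forms read in C# 9, the release named in the issue title. The Point record and the Classify method are hypothetical names made up for illustration, and the example assumes a compiler and runtime that support records:

```csharp
using System;

// Hypothetical type for illustration; a positional record gets a Deconstruct method.
public record Point(int X, int Y);

public static class PatternForms
{
    // Classifies a value using the or, and, and not combinators.
    public static string Classify(object o) => o switch
    {
        1 or 2 => "one or two",                                  // pattern or pattern
        Point(0, 0) or null => "origin or null",
        Point(var x, var y) and var p =>                         // pattern and pattern
            $"point, x + y = {x + y}, p.X = {p.X}",
        not string => "not a string",                            // not pattern
        _ => "a string"
    };

    public static void Main()
    {
        Console.WriteLine(Classify(2));               // one or two
        Console.WriteLine(Classify(null));            // origin or null
        Console.WriteLine(Classify(new Point(3, 4))); // point, x + y = 7, p.X = 3
        Console.WriteLine(Classify(1.5));             // not a string
        Console.WriteLine(Classify("hi"));            // a string
    }
}
```

Note that only the and arm declares variables; the or and not arms declare none, in line with the restriction described above.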

Thaina commented 5 years ago

@HaloFour My point is that the syntax should require parentheses to make it unambiguous. The operand could be any valid pattern, but it would need parentheses for not.

HaloFour commented 5 years ago

@Thaina

Given recursive positional patterns and support for womples I don't see how the parenthesis would eliminate that ambiguity.

TKharaishvili commented 5 years ago

Ok so here's my genius idea as to why this is not necessary (just kidding with the self-praise). We don't have this in C#:

if (10 < x < 100)

We write:

if (10 < x && x < 100)

And that's just fine. Following the same logic we can just write:

if (x is 1 || x is 2) ...

if (o is IDictionary d && o is IEnumerable e) ...

or in the context of switch:

switch (o)
{
    case 1:
    case 2:
    {
        //deal with it when 1 or 2
        break;
    }
    case Point(var x, var y) when o is var p:
    {
        //deal with the "and" case
        break;
    }
}

Ok so that last one is a bit more verbose than the others, but I think this "and" thing is going to be rare anyway.

gafter commented 5 years ago

@TKharaishvili Can you please show how that would work in a switch expression, for example to make a case that handles all letters in a range? In my proposal I would write

ch switch { >= 'a' and <= 'z' => ch - 'a', >= 'A' and <= 'Z' => ch - 'A', _ => -1 }

bondsbw commented 5 years ago

@gafter I've always supported this proposal, but that last example is making me rethink it...

HaloFour commented 5 years ago

It helps to include some whitespace there:

var res = ch switch {
    >= 'a' and <= 'z' => ch - 'a',
    >= 'A' and <= 'Z' => ch - 'A',
    _ => -1
};

It is unfortunate that the lambda operator is so similar to existing comparison operators. This specific example also almost makes me wish the case keyword was involved.
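A minimal runnable sketch of the same switch expression, assuming a C# 9 compiler. The class and method names are hypothetical; the arms are the ones shown above:

```csharp
using System;

public static class LetterRanges
{
    // Returns the zero-based position of an ASCII letter in the alphabet, or -1.
    public static int LetterIndex(char ch) => ch switch
    {
        >= 'a' and <= 'z' => ch - 'a',   // relational patterns joined by "and"
        >= 'A' and <= 'Z' => ch - 'A',
        _ => -1
    };

    public static void Main()
    {
        Console.WriteLine(LetterIndex('c')); // 2
        Console.WriteLine(LetterIndex('Z')); // 25
        Console.WriteLine(LetterIndex('5')); // -1
    }
}
```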

TKharaishvili commented 5 years ago

@gafter not sure 'cause I haven't played around with this expression style switch yet. Sorry.

alrz commented 5 years ago

It is unfortunate that the lambda operator is so similar to existing comparison operators. This specific example also almost makes me wish the case keyword was involved.

I think parenthesized patterns could help with the readability of things,

(>= 'a' and <= 'z') => ch - 'a',

But it'd be nice to be able to switch between postfix/prefix forms (Haskell supports this in expressions)

 ('a' >= and <= 'z') => ch - 'a',

Although, I believe we're better off with an inclusive range pattern for these common cases,

case 1: case 2: case 3: break;
// simplifies to
case 1...3: break;

If you want more control over clusivity, or you just want a single comparison, you could fall back to using relational patterns.

HaloFour commented 5 years ago

@alrz Yes, having parenthesized patterns and prefix as well as postfix relational operators does go a long way to improve the readability of that expression.

GGG-KILLER commented 5 years ago

About @alrz's expression (('a' >= and <= 'z') => ch - 'a',), is that interpreted as (ch >= 'a' and ch <= 'z') or ('a' >= ch and ch <= 'z')?

The reason I'm asking this is because I don't know if others automatically corrected in their head or I have the wrong impression about how this will work but his use of >= instead of <= when swapping the sides of the comparison changes the meaning of it and no one seems to have pointed it out (yet?).

Thaina commented 5 years ago

@alrz What you intended should be written as ('a' <= and <= 'z') => ch - 'a'

But I don't think we should support this. It would make the parsing process much more complicated.

gulshan commented 5 years ago

I have a question. With all the effort that has gone, and is still going, into pattern matching, how much is it being used in real codebases? I think switch expressions will increase the usage, though.

jnm2 commented 5 years ago

@gulshan I think of the codebases I contribute to as real codebases. =)

gafter commented 5 years ago

We are using pattern-matching more and more in Roslyn. The most commonly seen is s is null, which replaces (object)s == null which is what we used to write to avoid the user-defined operator ==. We are using the new recursive patterns now too, and it really simplifies a lot of situations.
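A small sketch of why s is null and s == null can differ. The Weird class is a deliberately contrived, hypothetical type whose operator == always returns true; it is not taken from Roslyn:

```csharp
using System;

// Hypothetical type with a misleading user-defined equality operator.
public sealed class Weird
{
    public static bool operator ==(Weird a, Weird b) => true;
    public static bool operator !=(Weird a, Weird b) => false;
    public override bool Equals(object obj) => true;
    public override int GetHashCode() => 0;
}

public static class NullChecks
{
    // Calls the user-defined operator ==.
    public static bool ViaOperator(Weird s) => s == null;

    // Casting to object forces a reference comparison.
    public static bool ViaObjectCast(Weird s) => (object)s == null;

    // The constant pattern never calls a user-defined operator.
    public static bool ViaPattern(Weird s) => s is null;

    public static void Main()
    {
        var s = new Weird();
        Console.WriteLine(ViaOperator(s));   // True (the operator lies)
        Console.WriteLine(ViaObjectCast(s)); // False
        Console.WriteLine(ViaPattern(s));    // False
    }
}
```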

HaloFour commented 5 years ago

@gulshan

The pattern matching support that shipped in C# 7.0 was very limited and mostly replaces existing null reference checks and the use of as for type switching. I think recursive patterns will dramatically increase the number of cases where pattern matching will significantly reduce code, and I also think that switch expressions will help to drive adoption as well.

I'm very excited as to how much work is going into pattern matching in C# and I hope that it doesn't slow after C# 8.0 is released. I very much look forward to further support for active patterns, discriminated unions, AND/OR patterns, list patterns, range patterns, relational patterns and parenthesized patterns.

TonyValenti commented 5 years ago

It would be awesome if, when this gets implemented, I could write the following:

if(a or b is 1 or 2){

}

HaloFour commented 5 years ago

@TonyValenti

I'd imagine that they could only be used within the pattern part of the expression. Otherwise it could explode into a Cartesian product of possible comparisons, especially in a switch statement/expression:

var result = (a or b or c or d or e) switch {
    1 or 2 or 3 => "foo",
    4 or 5 or 6 => "bar",
    7 or 8 or 9 => "baz"
};

Assuming that were possible, what would the result be if a was 9 and b was 1?

TonyValenti commented 5 years ago

@HaloFour - I'm thinking about things in terms of sets, unions, and intersections.

For example, a or b is 1 or 2 would essentially be equivalent to ({a} u {b}) n ({1} u {2}) != { }

Using the logic above, I would expect switch to pick the first match. So in that case, the answer would be "foo".
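For comparison, the "any of these values matches" reading can be spelled out with an or pattern on each operand, without any combinator on the left of is. This is only a sketch of that expansion in C# 9, with a hypothetical helper name, not part of the proposal:

```csharp
using System;

public static class SetStyleMatch
{
    // Hypothetical helper: true when at least one of a, b matches the pattern "1 or 2".
    public static bool AnyIsOneOrTwo(int a, int b) =>
        (a is 1 or 2) || (b is 1 or 2);

    public static void Main()
    {
        Console.WriteLine(AnyIsOneOrTwo(9, 1)); // True
        Console.WriteLine(AnyIsOneOrTwo(9, 7)); // False
    }
}
```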

u7pro commented 5 years ago

To catch specific exceptions and wrap them in a functional one, we can write code like this

catch (X e)
{
  throw new MyFunctionalException("functional message", e);
}
catch (Y e2)
{
  throw new MyFunctionalException("functional message", e2);
}

Starting with C# 6, we can write it like this :

catch (Exception e) when (e is X || e is Y)
{
  throw new MyFunctionalException("functional message", e);
}

using or pattern, it can be written like this :

catch (Exception e is X or Y)
{
  throw new MyFunctionalException("functional message", e);
}

with a concrete example from NetworkStream : https://github.com/microsoft/referencesource/blob/3b1eaf5203992df69de44c783a3eda37d3d4cd10/System/net/System/Net/Sockets/NetworkStream.cs#L862

            try {
                int bytesTransferred = chkStreamSocket.EndReceive(asyncResult);
                return bytesTransferred;
            }
            catch (Exception exception) {
                if (exception is ThreadAbortException || exception is StackOverflowException || exception is OutOfMemoryException) {                                       
                    throw;
                }

                //
                // some sort of error occured on the socket call,
                // set the SocketException as InnerException and throw
                //
                throw new IOException(SR.GetString(SR.net_io_readfailure, exception.Message), exception);
            }

using is not + or pattern, it can be rewritten like this :

            try {
                int bytesTransferred = chkStreamSocket.EndReceive(asyncResult);
                return bytesTransferred;
            }
            catch (Exception exception is not ThreadAbortException or StackOverflowException or OutOfMemoryException) {
                //
                // some sort of error occured on the socket call,
                // set the SocketException as InnerException and throw
                //
                throw new IOException(SR.GetString(SR.net_io_readfailure, exception.Message), exception);
            }
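A sketch of how this can be expressed with the existing catch ... when filter plus the combinators, assuming C# 9 semantics. One caveat: in C# 9, not binds more tightly than or, so negating a group of types needs parentheses. The class and method names are hypothetical:

```csharp
using System;

public static class CatchFilters
{
    // True for every exception except the two listed types.
    public static bool ShouldWrap(Exception e) =>
        e is not (StackOverflowException or OutOfMemoryException);

    // Without parentheses this parses as (not StackOverflowException) or OutOfMemoryException,
    // so it is true for everything except StackOverflowException.
    public static bool Unparenthesized(Exception e) =>
        e is not StackOverflowException or OutOfMemoryException;

    // Uses an or pattern inside an exception filter.
    public static string Run(Action action)
    {
        try
        {
            action();
            return "ok";
        }
        catch (Exception e) when (e is ArgumentException or InvalidOperationException)
        {
            return "wrapped: " + e.GetType().Name;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ShouldWrap(new OutOfMemoryException()));      // False
        Console.WriteLine(Unparenthesized(new OutOfMemoryException())); // True
        Console.WriteLine(Run(() => throw new ArgumentException("x"))); // wrapped: ArgumentException
    }
}
```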

gafter commented 5 years ago

catch (Exception e is X or Y)

No. X is not a pattern if X is a type. Perhaps you mean

catch (Exception e) when e is X _ or Y _

Thaina commented 5 years ago

catch (Exception e is X x or Y y)

Seems like this is not possible.

We can't know which of x or y is initialized. This gets into the realm of union types:

catch (Exception e is (X | Y) xy)

bondsbw commented 5 years ago

@gafter To Thaina's point... based on your example, would discards be allowed in such contexts where variables would not be definitely assigned?

edit: wildcard -> discard

gafter commented 5 years ago

A discard can always be used in a pattern instead of an identifier to name a pattern variable.

bondsbw commented 5 years ago

@gafter My question is more about the "definitely assigned" aspect. Would discards be allowed for or patterns, given that identifiers are not allowed there?

(Since discards aren't assignments and therefore definite assignment is irrelevant?)

alrz commented 5 years ago

Since o !is C c is accepted today I think it would be useful to allow variables in a top-level not pattern to cover the use case for a negative is expression.

if (e is not C c) return;
// equivalent to
if (!(e is C c)) return;
// still illegal as originally proposed
if (e is (C a, not C b))

Thoughts?
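A minimal sketch of the top-level case, as it behaves in C# 9: a variable declared under a top-level not pattern is definitely assigned on the path where the is expression is false. The helper name is hypothetical:

```csharp
using System;

public static class NegatedDeclaration
{
    // Returns the string length, or -1 when e is not a string.
    public static int LengthOrMinusOne(object e)
    {
        if (e is not string s) return -1;
        // s is definitely assigned here because the negated test failed.
        return s.Length;
    }

    public static void Main()
    {
        Console.WriteLine(LengthOrMinusOne("abc")); // 3
        Console.WriteLine(LengthOrMinusOne(42));    // -1
        Console.WriteLine(LengthOrMinusOne(null));  // -1
    }
}
```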

u7pro commented 5 years ago

catch (Exception e is X or Y)

No. X is not a pattern if X is a type.

@gafter : It's trivial that X here is an exception type, not a pattern.

Perhaps you mean

catch (Exception e) when e is X _ or Y _

Using discards makes it noisier. If we could write it like this, it would be more readable:

catch (Exception e) when e is X or Y

or even more naturally, if it can be written like this :

catch (Exception e is X or Y)

Using variables x and y as mentioned by @Thaina will lead to unassigned variables, so only discards should be allowed, and in that case we could imagine dropping them without ambiguity, no?

gafter commented 5 years ago

@u7pro There is no proposal (not this proposal or any other that I am aware of) to make the is-type operator accept an or in the type, nor is there any proposal to extend the syntax of the catch parameter. If you want to make such proposals, please create new issues.

u7pro commented 5 years ago

Ok @gafter, thanks for your comments. I thought I could take advantage of the introduction of the or, and, and not operators in patterns to extend other areas like the is operator and the catch syntax

but, as you suggested, I will later create a new issue to propose it

ErikHumphrey commented 5 years ago

The in disaster should be enough of a lesson for the reason why.

@MgSam Context for this "in disaster"?

GeraudFabien commented 5 years ago

@MgSam I would like to know too. I've never read or seen a drawback for "in", and I use it a lot...

stop adding brand new keywords for minor use cases

If you're talking about the new 7.2 feature, 'in' was not really a new keyword...

Joe4evr commented 5 years ago

I've never read or seen a drawback for "in", and I use it a lot...

For one thing:

using in outside of its intended place (being large value types) actually generates a lot more overhead, often leading to worse performance, and in the worst case corrupting your memory.

jnm2 commented 5 years ago

Nothing an analyzer wouldn't fix, right?

GeraudFabien commented 5 years ago

Corrupting your memory

It's not the first time a compiler has had a bug. If we had to remove every language feature that was more complex than intended, we wouldn't even have operators like "==" today.

worse performance

Most features that let you gain performance make you lose performance in some cases, if not most cases. For instance, try to make a string like this: string myString = new StringBuilder("myString").ToString(); I know it's dumb to do that. But you also have MY_CONST == myVar, myVar == MY_CONST, and myVar.Equals(MY_CONST). The three may not have the same result and performance, even if from reading all three you would expect the same result...

It's also true of nearly every feature I can think of (foreach [vs for], for [vs while], while [vs goto], == [vs .Equals(), ...], ...). I do not agree that this feature should be done. I think it's useless and may lead to code that is complex to understand in most cases, because it doesn't look like a C# operator.

But I do not think the problems with 'in' have anything to do with this thread.

adamjstone commented 4 years ago

if (neither x nor y are null)

at which hour (x and y existeth)

declard commented 4 years ago

Pattern matching is about structural decomposition of values themselves, e.g. if a list was built as Cons(1,Cons(2,Cons(3,Nil))) then it can match a pattern Cons(_,Cons(_,Cons(_,_))), and propositional things like equality, ordering and inequality have nothing to do with patterns at all. They are computations, not patterns, and hence should be described as guard expressions like when, which already exists in the language.

Adding pattern combinators would be worthwhile if patterns were first-class objects in the language, so you could build a pattern like

pattern Px(int v) = IFace1 { Field1 = v };
pattern Py(string v) = IFace2 { Field2 = v };
pattern P(int x, string y) = Px(x) and Py(y);

from more primitive blocks using such combinators and then feed it, for example, to a switch statement:

class MyClass : IFace1, IFace2 { ... }
...
switch (new MyClass())
{
  pattern P(x, y) => x + y;
}

circles-arrows commented 4 years ago

Since testing against null and against not null is a very common use case, how about also allowing a short hand for if (e is not null) ... which would read if (e not null) ...

Happypig375 commented 4 years ago

@circles-arrows What about #882: if !(e is null) ...

theunrepentantgeek commented 4 years ago

@circles-arrows we already have short syntax for that:

if (e is object) 
{
}

IIRC, @jaredpar has been evangelizing this exact idiom. (Apologies if I've mis-attributed this, Jared).

333fred commented 4 years ago

IIRC, @jaredpar has been evangelizing this exact idiom. (Apologies if I've mis-attributed this, Jared).

Don't worry, you have not.

circles-arrows commented 4 years ago

@Happypig375 It's not that I'm against if !(e is null), if (e is not null), if (e not is null) or if (e is object), although all of them have something that makes them slightly less than perfect. Anyway, I just liked the simplicity of the "not" pattern being the opposite of "is".

@theunrepentantgeek Ah I didn't realize that if (e is object) ... would also be true if e is for example an int or struct. Thanks for pointing that out!

So here are my reservations: With "is object", I would not expect it to be true for a primitive or struct, and therefore I find it confusing at first glance. Which is why it needed to be pointed out by theunrepentantgeek, even though I had seen that code before. I never realized I was looking at another way of writing "is not null".

With "not is" or "is not" it seems like we're introducing an alias for ! and "!is" or "! is" seems more logical then, but also seems quite unreadable... Also in different SQL dialects I can never remember which way around it works (not is or is not) and if the other way around it also would have worked.

Then for if !(e is null), that will not work with pattern matching unless you put parentheses around the pattern, right? Also, compiler/parser-wise it's an odd exception for "if" statements not to need parentheses around the condition in this case. Or would I be able to also write if e is null ... now?

So then for if (e not null)... I'm not a native English speaker, but I can imagine that for someone who is, this sounds as horrible as it sounds elegant to me. If that's the case, just forget I ever mentioned it...

ANYWAY, I hope these way too many thoughts helped someone, because I'm still as confused as before about what would be the best design choice here...

Happypig375 commented 4 years ago

@circles-arrows With property patterns, there is also if (e is {}) for testing non-nullness.
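A short sketch comparing the three non-null tests mentioned in this exchange, assuming C# 9 for the not form. The method names are hypothetical. All three give the same answer, including for boxed value types:

```csharp
using System;

public static class NonNullTests
{
    public static bool ViaIsObject(object e) => e is object;    // type pattern
    public static bool ViaProperty(object e) => e is { };       // empty property pattern
    public static bool ViaNotNull(object e) => e is not null;   // not pattern (C# 9)

    public static void Main()
    {
        object boxedInt = 42;
        Console.WriteLine(ViaIsObject(boxedInt)); // True
        Console.WriteLine(ViaProperty(boxedInt)); // True
        Console.WriteLine(ViaNotNull(boxedInt));  // True

        Console.WriteLine(ViaIsObject(null));     // False
        Console.WriteLine(ViaProperty(null));     // False
        Console.WriteLine(ViaNotNull(null));      // False
    }
}
```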

hez2010 commented 4 years ago

If you want to introduce union and intersection types later, these patterns will become valid: TA | TB, TA & TB. So why not use | and & instead of or and and?

x switch
{
    TA or TB => ...
}

is exactly the same as

x switch
{
    TA | TB => ...
}

CyrusNajmabadi commented 4 years ago

@hez2010 because | and & have meaning for expressions, and expressions can be contained in patterns, thus creating an ambiguity.
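A small sketch of the existing meaning that | already has inside a case label, assuming C# 9 for the or arm. The method name is hypothetical. Here 1 | 2 is an ordinary constant expression that evaluates to 3, which is different from the pattern 1 or 2:

```csharp
using System;

public static class BarVersusOr
{
    public static string Describe(int x)
    {
        switch (x)
        {
            case 1 | 2:            // bitwise OR of constants: matches only 3
                return "three";
            case 1 or 2:           // or pattern: matches 1 and also 2
                return "one or two";
            default:
                return "other";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(3)); // three
        Console.WriteLine(Describe(1)); // one or two
        Console.WriteLine(Describe(2)); // one or two
        Console.WriteLine(Describe(4)); // other
    }
}
```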

Eirenarch commented 4 years ago

Sorry I didn't read all the comments. Am I right to assume that this if implemented would cover

if (x is not SomeType y)
{
    throw new SomeException();
}

//use y here

Just today I was quite annoyed that I can't do this and had to do the ugly if (!(x is SomeType y)). I reverted to using the as operator which made the code cleaner but the style checker complained that I should use pattern matching. I can change the rule but then the IDE won't tell me about all the other places I can use pattern matching and actually want to. I really really wish I had the ability to use not in pattern matching.

333fred commented 4 years ago

That's covered by https://github.com/dotnet/csharplang/issues/3369

Dimension4 commented 4 years ago

What about &&& and ||| for intersections and unions?

333fred commented 4 years ago

Between introducing yet another & operator and using and, the latter at least has less opportunity for confusion.

Dimension4 commented 4 years ago

Apologies, I didn't expect anyone would take my comment seriously. So for the record: I think A &&& B is terrible syntax.

sgf commented 4 years ago

easy way is:

switch (o)
{
    [1 || 2]:
    [Point(0, 0) || null]:
    [Point(var x, var y) && var p]:
}

With [.. || ..], the || would not short-circuit.

alrz commented 4 years ago

I think if we really wanted to go with character operators, the obvious choices were ∨, ∧, and ¬.

x is 1 ∨ 2
x is > 0 ∧ < 20
x is ¬ null

No one would EVER complain about these.