gvanrossum / patma

Pattern Matching
1.03k stars 65 forks source link

Rules for or-patterns and guards #130

Closed stereobutter closed 4 years ago

stereobutter commented 4 years ago

After stumbling upon this blog entry I wonder if the following behavior (example copied from the blog) is intentional or just an oversight?

def doesEitherCatHaveAnEvenAge(pet1, pet2):
  match (pet1, pet2):
    case (Cat(age=age), _) | (_, Cat(age=age)) if age % 2 == 0:
      return True
    case _:
      return False

print(doesEitherCatHaveAnEvenAge(Cat(2), Cat(4)))
# prints True

print(doesEitherCatHaveAnEvenAge(Cat(2), Cat(5)))
# prints True

print(doesEitherCatHaveAnEvenAge(Cat(5), Cat(4)))
# prints False (?????)
dmoisset commented 4 years ago

The or pattern matches left to right. The guard is done after the whole pattern succeeded. That pattern always matches the first cat. Your function computes "does the first cat have an even age?"

stereobutter commented 4 years ago

So for the whole thing to work as intended by the author it would have to be spelled:

def doesEitherCatHaveAnEvenAge(pet1, pet2):
  match (pet1, pet2):
    case (Cat(age=age), _)  if age % 2 == 0:
      return True
    case (_,  Cat(age=age)) if age % 2 == 0:
      return True
    case _:
      return False

?

gvanrossum commented 4 years ago

Yes. But it’s not a great example of using match.

brandtbucher commented 4 years ago

Yeah. It's important to remember that, in our model, |-patterns can be arbitrarily nested inside of other patterns:

case Foo(1 | 2, x) | Bar(3 | 4, x) if x.baz(): ....

The behavior you describe only feels natural when they are limited to denoting top-level alternatives, which is much less flexible:

case Foo(1, x) | Foo(2, x) | Bar(3, x) | Bar(4, x) if x.baz(): ....

Otherwise, the backtracking can get nightmarish very easily.

stereobutter commented 4 years ago

@brandtbucher You are of course absolutely correct that backtracking becomes very cumbersome fairly quickly. I just wonder how/if this non-backtracking behaviour should be documented. For me this is similar to def func(x=[])... in that once you know why the behaviour is the way it is things are obvious but to the uninitiated the behaviour is baffling. An ambiguity warning like in ocaml would solve this issue but I don’t know if this is possible with the information the parser has?

thautwarm commented 4 years ago

The kind of use of or-patterns and guards is very common in some fields. A famous example is type unification or type checking, when we unify or check a pair of types, the pair order is insignificant, and we also want to avoid duplicating the code.

For instance, following code can be simplified with pattern matching

https://github.com/RemuLang/hmf/blob/97444195f2659ec7f0b5e64a1678d5b024434a3c/hmf/infer.py#L167-L176

into

match t1, t2:
    case …:
        …
    case ty2 := Var(ref=Bound(ref2)), ty1 | ty1, ty2 := Var(ref=Bound(ref2))
         if ref2.id is not special_var_id:
        occurs_check_adjust_level(ref1l2.id, ref2.level, ty1)
        ty2.ref = Link(ty1)

The guard here adds a special inference rule for some specific variables, for instance, in Haskell there is a frequently used variable $, whose type inference is specially treated to avoid lossing polymorphisms in its regular uses.

gvanrossum commented 4 years ago

The non-backtracking behavior is very clear from specification and always has been, and I don't see how you could have interpreted it otherwise. Everywhere guards are brought up it is said they are a test applied after the pattern succeeds. An the semantics of patterns themselves is very clearly first-match succeeds.

Taine, I'm not going to try to understand your example, but I hope you meant that it doesn't need backtracking. Also note you have a syntax error -- the guard cannot start on a new line.

thautwarm commented 4 years ago

Taine, I'm not going to try to understand your example, but I hope you meant that it doesn't need backtracking.

Yes. I'm actually explaining "this behavior is natural".

I think maybe people sometimes misunderstand case (Cat(age=age), _) | (_, Cat(age=age)) if age % 2 == 0 as case ((Cat(age=age), _) if age % 2 == 0) | ((_, Cat(age=age)) if age % 2 == 0)(though we don't have this syntax).

Also note you have a syntax error -- the guard cannot start on a new line.

Well, just to avoid being too long.

@SaschaSchlemmer

An ambiguity warning like in ocaml would solve this issue but I don’t know if this is possible with the information the parser has?

This is possible, actually not difficult at all, the reference implementation already tracked commonly captured variables.

However, FYI, F# does not emit this warning. I think F#'s concerns apply better to other industrial languages.

Incidentally, last month I just suppressed OCaml warning 57 in my codebase, because every one thinks it redundant and counterintuitive..

stereobutter commented 4 years ago

The non-backtracking behavior is very clear from specification and always has been, and I don't see how you could have interpreted it otherwise. Everywhere guards are brought up it is said they are a test applied after the pattern succeeds. An the semantics of patterns themselves is very clearly first-match succeeds.

This is precisely what I meant when I likened the misunderstanding to def func(x=[]): .... The behavior for both phenomena is well defined and easy to understand once explained in the context of the offending example, but not necessarily obvious on first glance just from reading the specification. So instead of users having to go through the process of actively making the same mistake and learning what the specification actually means a warning might be good UX.

gvanrossum commented 4 years ago

Are you seriously proposing a runtime (or compile time, which is the same thing in Python) warning? If so can you write up a precise proposal for when you'd like that warning to appear? Many users despise warnings so it would be annoying if warnings started popping up even if the user has understood the semantics and is counting on them. I suppose in the end-user docs that will eventually be written we could add a note that there is no backtracking once the guard is reached. (Also, which language backtracks if the guard fails?)

stereobutter commented 4 years ago

Sorry for not being clear on what I meant with warning. I didn't mean a warning like a deprecation warning that is issued by the interpreter but rather wondered if this would be hard for a linter (that maybe uses the same parser) to detect and issue a warning.

stereobutter commented 4 years ago

Also Mathematica does backtrack*

f[{x_, _} | {_, x_}]:= True /; Mod[x,2] == 0
f[{_, _}]:= False

f[{1, 2}] (* True *)
f[{2, 1}] (* True *)
f[{1, 1}] (* False *)
Quick syntax translation: mathematica python
function definition f[x_]:= def f(x): ...
function call foo(bar) foo[bar]
list/tuple {1, 2} (1, 2)
named pattern x_ case x ...
guard ... /; expression case pattern if expression

*to be precise the Condition operator /; is defined as

patt/;test is a pattern which matches only if the evaluation of test yields True.