Open gvanrossum opened 4 years ago
I am also fine with this, and think it actually has merit to point out obvious mistakes.
And to be fair: some syntax errors cannot be sensibly expressed by a context-free language, anyway. Take the rule that no two parameters in a function can share the same name, for instance. So, we do not necessarily have to put that into the grammar.
I'm okay with this as well. People ignore warnings, and I cannot imagine this ever being "correct" (except perhaps in poorly-generated code).
If we ever add an else clause, an irrefutable case must not be followed by any else clause. #147
an irrefutable case must not be followed by any else clause
Generally, I agree with this. But I would be careful with the wording. Irrefutable cases can actually take quite complicated forms, making it impossible for the compiler to decide whether they really are irrefutable or not. So, in short: the rule as it stands cannot be checked.
Perhaps it would be enough to talk about the irrefutable case and define somewhere that it refers to the case <NAME>
...?
Hm, in my naïveté I think irrefutability is easily checked recursively.
That’s it. Presence of a guard makes a case non-irrefutable.
What am I missing?
I was primarily thinking of the last case you mention:
Presence of a guard makes a case non-irrefutable.
This is not necessarily the case semantically as in, e.g., case x if 2 < 3:
. It has a guard, however, it will always succeed, hence it is semantically irrefutable, although it might be impossible for the compiler to discover this for all possible conditions in a guard.
The second objection to the list would be class patterns. case object():
is also irrefutable, and if we allow the isinstance
to be overwritten, any class C
might end up being irrefutable.
In the context with OR, I would think of a silly example like case (mutable() | immutable()):
, which I would claim is also irrefutable.
Long story short: what we need is syntactic irrefutability (irrefutability that can be decided purely based on the syntax), which is naturally a subset of semantic irrefutability (everything that is 'logically' irrefutable). Obviously the compiler can only really check the former.
Personally, I would not have a problem with defining what we mean by syntactic irrefutability and then state that any such case must be the last branch in a match statement (including the potential else:
). But come to think of it, I am a bit afraid that it would add another set of rules to the PEP, which is already perceived as having too many rules... So, the question is: is it really worth the trouble? Or do we just go with the irrefutable case (where the <NAME>
in my post was meant to include the wildcard, of course)?
EDIT: We could also just say that the compiler will reject any case branch (again, including else
) that follows a case for which it can prove that it is irrefutable. In other words: the compiler will complain about anything following syntactic irrefutability, but leave it open to the exact extent of syntactic irrefutability. The only condition is that a simple name (or wildcard) is in this set.
What speaks against this, of course, is that if the compiler gets better in detecting such cases we loose backward-compatibility. A program that runs with Python 3.10.0 might not run with 3.10.1 because the latter rejects an unreachable case branch. If pattern matching is used in libraries, this might lead to quite a mess. Hence not a good idea!
I think it is worth it defining which patterns are syntactically irrefutable (basically my definition) and stating that any guard-less case whose pattern is syntactically irrefutable must not be followed by another case
or else
.
Compilers that are smarter than that can issue SyntaxWarnings.
Note that case object():
may not be irrefutable if a global variable object
exists or if builtins.object
has been replaced.
I am not worried that adding this rule will be considered as a problem (and if it is, we can offer only to declare case NAME:
to have this requirement).
Having an or-pattern where one in the left is irrefutable is a good candidate for errors too. For example in:
match value:
case None: ...
case 0: ...
case MAX_VALUE|MIN_VALUE: ... # I intended to use these names as constants here
Agreed.
Hm, in my naïveté I think irrefutability is easily checked recursively.
- a single capture pattern is irrefutable
- a wildcard is irrefutable
- an OR is irrefutable if any of its sub-patterns is
- a walrus is irrefutable if the RHS is
That’s it. Presence of a guard makes a case non-irrefutable.
Does this sound good, then? If so, I'll update the spec and implementation.
Yup, still sounds good to me. Let's do it.
See language in https://github.com/python/peps/pull/1649 -- please review.
(Landed.)
Just got back from a trip. Sorry for the delayed review... but this looks good to me anyways! My one lingering question about patterns like case _ | _
was addressed in the linked PR: as expected, this too is a syntax error.
Labeling as fully pepped -- this is now incorporated in PEP 634. (And also implemented, thanks Brandt!)
There were some concerns over the typical footgun of writing
(with the intention that
FOO
andBAR
are global constants, not capture patterns).The CPython implementation issues a
SyntaxWarning
for this. Maybe we should just make that aSyntaxError
. (Even though it's hard to express even in the PEG grammar.) Mark Shannon also wants this.I am fine with this.