Closed rmorshea closed 3 years ago
There is a note in the stubs:
This needs to be amended about it being potentially None
in case of strings. Overloads and literals should be able to narrow this down. PR welcome.
@rmorshea Are you working on it? I can submit a patch if you want
@srittau When I am trying to replicate it since return type of re.match
is Optional[Match[AnyStr]]
, mypy complains about calling .group
on None
is there any way to handle this scenario?
@lladhibhutall This should use overloads, e.g. the return type should depend on whether an argument is given, it is 0, another int or a str.
Note it can return None for (non-zero) numbered groups as well: assert re.match("(x)?y", "y").group(1) is None
. This change, while correct, might hurt ergonomics in the happy case, pretty much all uses of numbered groups I've seen assume if there's a match their groups will be not None.
IMO having a false positive (i.e. the return type is Optional[str]
for required match groups) is better than a false negative, but if the maintainers want to keep as-is to avoid inconvenience I'd understand that as well.
typeshed generally prefers false negatives over false positives (https://github.com/python/typeshed/blob/master/CONTRIBUTING.md#what-to-include), but maybe Sebastian has an opinion on whether to make Optional the return type of zero, one or both of named and (non-zero) numbered groups.
It looks like the change was made, and then reverted in https://github.com/python/typeshed/pull/3190
Ok, seems like false-positive caused problems. Thanks @hauntsaninja.
There's also some hope a plugin could improve results here, e.g.: https://github.com/python/mypy/pull/7803
This is quite needed for mypyc
users. Can this issue be re-opened?
Reopening. I believe we should strive towards correctness here and we can ensure this at least somewhat using overloads and literals.
An ideal solution for mypy users here is a plugin (https://github.com/python/mypy/pull/7803).
The no argument case and the Literal[0]
case should return str
; this should be uncontroversial. The multiple argument case should return a tuple of whatever we decide the single argument case should return.
However everything else, which comprises most of the usage of group
, can potentially return None. If you assume that non-optional capture groups are more common than optional capture groups, this is potentially a lot of false positives.
As discussed in https://github.com/mypyc/mypyc/issues/676, one option here is to mark the return type as Union[Any, str]
. This would sidestep the false positive problem, and since mypyc was mentioned, fix things for mypyc users. This option looks increasingly good to me!
Anecdotally, I literally wrote code today that would trigger a false positive if we instead went with Optional[str]
. But a quick scan of https://grep.app/search?q=.group%28&filter[lang][0]=Python seems to indicate at least half of all usage of group
would get flagged incorrectly.
Basedmypy works correctly in these cases:
import re
s: str
if m := re.match("a(b)?(c)", s):
reveal_type(m.groups()) # (str | None, str)
All usages of
Match.group
appear to return strings however, for optional named groups it can return None: