Akuli opened this issue 2 years ago
How common is this use case? Have you seen more than just this one case?
Here are two other places in typeshed where we have NoReturn overloads that kinda-sorta-half work, where this feature would be useful:

- builtins.pow (first overload): https://github.com/python/typeshed/blob/fabd8428572dcc20b7ae7919e16d9b4c87cf3da2/stdlib/builtins.pyi#L1330
- datetime.date.__sub__ (second overload): https://github.com/python/typeshed/blob/fabd8428572dcc20b7ae7919e16d9b4c87cf3da2/stdlib/datetime.pyi#L72

I'm sure there are yet more examples, but these two are just off the top of my head.
A couple more examples where unreachable code warnings don't do a good job of explaining what's wrong:

- wave.Wave_read.getmark and a few other wave methods unconditionally raise an error.
- mmap's __delitem__ always raises TypeError: mmap doesn't support item deletion.

Thanks for the additional examples. That's really helpful.
If we were to add support for this, I'd prefer not to model it as a return type annotation for a couple of reasons.
First, I want to avoid adding more type annotations that accept string literals as type arguments but do not expect the string to be treated as a quoted type expression. Both Literal and Annotated already break this rule, and type checkers like pyright need to include a bunch of complex and fragile special-case handling for these cases. Let's not add more of these unless there's a really compelling reason to do so.
Second, the return type seems like it should be independent of any special error message that is emitted. In some cases, NoReturn is the correct return type because that overload will always generate an exception. In that case, code after the call should be marked unreachable. Other times, a different return type may be appropriate, but it's still desirable to emit an error message.
I think a decorator is a better option for this particular use case. There have been some good discussions around a standardized decorator for @deprecated. This feels like a very similar use case, and I think we could address both use cases with the same solution/pattern.
Here's a rough idea of what that might look like.
from typing import Literal, NoReturn, overload, deprecated, runtime_error
# Use of "runtime_error" decorator, always emits error diagnostic
@overload
@runtime_error("Calling pow() with a mod of 0 will result in a runtime exception")
def pow(base: int, exp: int, mod: Literal[0]) -> NoReturn: ...
@overload
def pow(base: int, exp: int, mod: int) -> int: ...
@overload
def pow(base: int, exp: _PositiveInteger, mod: None = ...) -> int: ...
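For what it's worth, the runtime behavior that the first overload models can be checked directly in plain CPython, no typing machinery involved:

```python
# Calling the builtin pow() with a modulus of 0 raises at runtime, which is
# why NoReturn is an accurate return type for that overload.
try:
    pow(2, 3, 0)
except ValueError as exc:
    print("pow raised:", exc)
```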
# Use of "deprecated" decorator; this can be presented differently if
# desired (e.g. crossed-out text at the call site)
@overload
def foo(x: int) -> int: ...
@overload
@deprecated("Option value of 'old' is deprecated")
def foo(x: int, *, option: Literal["old"]) -> int: ...
@overload
def foo(x: int, *, option: str) -> int: ...
Thoughts?
I like the idea of using a decorator, and also adding a similar @deprecated decorator while we're at it. We could combine the two into a single decorator, but separating them means that users can disable deprecation warnings without also having to suppress errors for type=bool and the like. IDEs and language servers can also mark the function call as deprecated instead of a generic "warning" or "error".
I don't think @runtime_error is a good name for this. For example, passing type=bool to argparse will not actually error at runtime; it will silently eat up the next argument passed on the command line and always return True to the rest of your code. If we don't want to go with @typing_error, shortening it to just @error wouldn't be great either, because it's too likely to conflict with a symbol defined in a stub file. How about @type_checker_error?
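The type=bool pitfall mentioned above is easy to demonstrate: argparse applies the type callable to the raw argument string, and bool() on any non-empty string is True.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--flag", type=bool)

# bool("False") is True, so the option is effectively always truthy
# whenever any value is supplied on the command line.
args = parser.parse_args(["--flag", "False"])
print(args.flag)  # True
```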
I also like the idea of a decorator. How about @always_errors?
asyncio's loop kwarg removal in 3.10 would apply here too.
And yet another use case: if foo: where foo is an Iterator. Iterators are required to have __iter__ and __next__, but usually their boolean value always evaluates to true, so if foo: is pointless.
class Iterator:
    ...
    @always_error("not all iterators can be checked for emptiness with their boolean value, consider using Sequence instead")
    def __bool__(self) -> bool: ...
Same goes for Iterable, as it makes no guarantees about boolean value. Arguably Iterator[Foo] | None would be a bit of a false positive, but I think it's quite rare compared to wrongly annotating non-None values as Iterable.
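The truthiness claim is easy to verify: iterators don't define __bool__ or __len__, so they fall back to object's default truthiness, which is always true.

```python
empty = iter([])      # an iterator over an empty list, already exhausted
print(bool(empty))    # True -- so "if empty:" tells you nothing
```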
I will start preparing a PEP to propose the @typing_error() decorator (as well as typing support for deprecations). If you're interested in giving early feedback, let me know.
Firstly, a bit of bikeshedding: I think @type_error makes more sense than @typing_error. A type error happens when you misuse a type, whereas I would understand "typing error" to refer to erroneous type hints.
Secondly: I think it would be nice to additionally have a @potential_type_error for cases where a Literal type is required but we cannot enforce it in the type stubs because the exact value may be unknown, for example:
def open_db(mode: str):
    return open('example.db', mode)  # potential type error: cannot statically determine that the mode is valid
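For context, the runtime failure that such a warning would guard against looks like this (CPython validates the mode string before touching the filesystem):

```python
# An invalid mode string makes open() raise ValueError immediately,
# before any file is created or opened.
try:
    open("example.db", "definitely-not-a-mode")
except ValueError as exc:
    print("open raised:", exc)
```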
It would be great if we could annotate overloads such as this one as follows:
# Fallback if mode is not specified
@overload
@potential_type_error('cannot statically determine that the mode is valid')
def open(
    file: _OpenFile,
    mode: str,
    buffering: int = ...,
    encoding: str | None = ...,
    errors: str | None = ...,
    newline: str | None = ...,
    closefd: bool = ...,
    opener: _Opener | None = ...,
) -> IO[Any]: ...
The difference between @type_error and @potential_type_error would be that the former should always be reported by static type checkers, while the latter should only be reported if the static type checker is run in strict mode.
@JelleZijlstra I'd be happy to provide early feedback for the PEP.
I want to propose an idea that is somewhat different from what is discussed here; both can exist in parallel. It consists of two changes: treating calls to functions typed Callable[..., Never] as a type error, and a parameterized (generic) Never.

Compared to the proposed @typing_error decorator, this would not flag illegal function calls that are wrapped in an appropriate try-except block. Even in a static context this may be useful, for example when writing a unit test that checks that the appropriate error is raised.
It is more correct from a type-theory standpoint. According to the documentation:

"The bottom type, a type that has no members."

That is, as Wikipedia states, in a sound type system the bottom type is uninhabited, meaning that no instances of Never can exist. However, calling a function of type Callable[..., X] always returns an instance of X, so calling a Callable[..., Never] would create an instance of Never; hence a type error should be raised when such a function is called.
This matters since instances of Never might be created implicitly in other ways, for example by exhausting a Union:
The main disadvantage is that the proposed @typing_error decorator can be used to pass specific messages directly to the type checker (see https://github.com/python/typing/issues/1043#issuecomment-1020533096). However, there is no reason why both cannot be used simultaneously. See below for an example that combines both ideas.
Calling functions that return Never/NoReturn is entirely valid; it's just that you should expect that execution no longer continues after the function. For example, sys.exit() is NoReturn, and so is any other function that unconditionally raises. So is something that contains an infinite loop (say, an event loop) that would never end. What would the generic type be in that case? There's no exception raised at all.
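To make the sys.exit() point concrete, here is a small sketch: the function below is accurately typed NoReturn, yet calling it is perfectly normal code.

```python
import sys
from typing import NoReturn


def bail(message: str) -> NoReturn:
    # Unconditionally raises SystemExit, so NoReturn is accurate --
    # and flagging calls to it would be wrong.
    sys.exit(message)


try:
    bail("shutting down")
except SystemExit as exc:
    print("exit code/message:", exc.code)
```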
@TeamSpen210 For a genuine infinite loop, I'd imagine simply non-generic Never, which could be identified with BaseException, or Never[KeyboardInterrupt | SystemExit], as the loop will only terminate on KeyboardInterrupt, SystemExit and the like. While the type checker flagging calls to such an event loop might be annoying at first, it could also act as a reminder to wrap the event loop in a try-except or at least try-finally block for cleanup.
My point is that such calls are entirely valid code, where you don't necessarily want to catch the exception at all. Type checkers flagging the calls would simply be wrong. When you use sys.exit(), you don't want to be catching the exception; that defeats the entire point. Actually, thinking about it, your proposal is just checked exceptions? The same reasoning would make it useful to apply the same behaviour to, say, open(), to ensure you have a try-except for FileNotFoundError. It's not related to the return type.
@TeamSpen210 You're right, it would make more sense if only generic Never were flagged.
Some ways to call certain functions are just wrong, and are known to be a bug. For example, passing type=bool to argparse's add_argument method doesn't work. To make type checkers warn about this, typeshed can currently do something like this:
When add_argument is used wrongly, you would then get errors:
This approach has a couple of problems: the resulting "unreachable code" message doesn't explain what's actually wrong, and some type checkers only report it with options like --strict.

The solution I would like: if you declare the return type as
TypingError["foo bar"], the type checker will display the error message "foo bar". This is similar to how #error works in the C preprocessor.

Stub:
Python file: