Closed matthiaskramm closed 9 years ago
I'm not a fan of listing exceptions functions can throw, especially here in Python, where it's easier to ask forgiveness than permission.
Of course, on the other hand, this is only for stubs, so that might not really matter.
I'm not a fan of checking exceptions either -- IIUC the idea of checked exceptions has been tried extensively in Java and there are a lot of usability problems with it. I'm not sure what use a checker could make of exception declaration in a stub -- certainly I wouldn't want it to start telling me that I'm not catching these! That said, if there is a use case, the first syntax (a bunch of raise
statements in the method body) seems better than adding a raises
method to typing
.
Some things explicitly work with exceptions -- e.g., StopIteration, in which case they're almost a special return value and it would be an error if the function can't raise the exception. Conversely, it's probably a mistake to catch an exception that can't happen (although this fits into the "mostly harmless" category of mistake; but it could be an indication that the logic of a called function has changed and the caller didn't realize).
In my past life with Java, I've had mixed feelings about exception checking. It's saved me from some mistakes more than it's been annoying. Maybe the checking of exceptions could be controlled by some notion of "unchecked exceptions"?
My main issue is the fact that Python uses exceptions so heavily. They are everywhere. A notion of unchecked exceptions is also very fragile. For instance, take this piece of code:
@raises(ValueError)
def x(f: Callable[something]):
return f(args)
def y(args):
with do_not_check_exceptions:
raise IOError()
x(y) # BOOM!!!!
At that point, exception checking becomes useless and easy to break.
Let me draw a line in the sand. The PEP will not support any form of exception checking. The only thing possibly under discussion here is whether there is some other use of stubs (maybe an IDE suggesting a raise or try/except) that might benefit from declaring exceptions. But so far everything brought up has just been about the relative advantages of checked exceptions, and that issue is closed. We won't do it.
Why would you prohibit me from doing something that I find useful? I have no problem with the default behavior being to not do exception checking, but why prohibit it?
The PEP doesn't mandate any particular behavior from a type checker, so I'm not prohibiting you from doing something you find useful. Whether it is actually useful may well depend on the codebase you are checking. I just don't want to have to put anything in the PEP that would seem to make checked exceptions part of the signature of a function. Maybe as a compromise we can just say in the PEP that a conformant type checker should not interpret the body of functions in stubs, and you can have a non-conformant option that interprets raise statements in stub function bodies.
In terms of use cases for exceptions in stubs: There are two possible paths through the below code, which a type inferencer or static analyzer can build a CFG from if it knows that dict.__getitem__
can throw KeyError.
try:
x = i
y = a[x]
except KeyError:
y = x
f(x, y)
Without having exception information for at least the Python builtins and maybe the standard library, static checkers have to approximate the CFG of the above, thus leading to inferior results.
I'd be happy with the compromise of just making the body of stub functions "free for all", though.
But if you change the example slightly you'd still have a mystery on your hand:
try:
x = foo()
y = a[x]
except KeyError:
y = x
Unless you also have a way of declaring whether foo() (which I assumed is defined by the app) may raise KeyError or not, the type checker won't know whether x is defined in the except clause. So you are still arguing for checked exceptions. Please give it up.
The way our type inferencer works, it'll just analyze foo() too, to see what it throws.
You really don't need to declare exceptions until you hit C code, when all of a sudden you do.
I never cared much for the exception checking nightmare Java imposed on its community, but I do care about correct control flow graphs. Having some notion of what builtin/stdlib types, in terms of exceptions, can or cannot do, helps with that. That's why we currently still allow them in the stubs of typeshed (https://github.com/JukkaL/typeshed/blob/master/python3/time.py)
So have you already decided on which exceptions you consider "checked" and which "unchecked"? Because pretty much everything can raise MemoryError (obviously) but also e.g. TypeError or AttributeError (if somehow an incorrept type creeps past your checker :-). Anyway, you have my blessing for multiple raise statements in stubs.
I'd like some way for the programmer to decide what's "checked" and "unchecked". But don't forget that we're inferring types, so we just show the result, and don't require catching all the exceptions like in Java.
As to where the boundary should be between "checked" and "unchecked" ... I think that where an exception is used as an extra value (e.g., I want something like NaN in floats, except for ints), then it makes sense as part of the type annotation. But that means KeyError and IndexError (and LookupError) become "checked" and that'll just add a lot of noise. However, a user-defined WeirdValueError probably should show up in the type annotation for a function.
IOError is an interesting boundary case ... it's the kind of thing that C programmers typically don't bother checking and thereby produce non-robust programs because IO problems are relatively common ... but Python at least crashes if IOErrors aren't checked for. Again, it might be useful to expose IOError in the type annotation so that someone who wants to build a robust program can see where IOError might show up and needs to be handled by something kinder than a rude IOError message from the runtime.
I have found Java's "checked" exceptions to be useful when writing robust code; and I can see why they annoy many people. But we already can see that exceptions are useful in the "stubs", and I'd like to at least allow the possibility of exceptions in non-stub type annotations so that we can see whether they are useful or not.
In a type checker, TypeError and AttributeError (and ValueError) play a special role, since in a way, they're the things you're hunting for.
As for other built-in exceptions: Grepping a large codebase (a few million lines) for "except (...)Error", and ordering the results, from less frequent to more frequent, yields roughly what you'd expect: MemoryError < AssertionError < KeyboardInterrupt < IndexError < ImportError < IOError < KeyError (selected items, full list at http://www.quiss.org/files/exceptions_by_frequency.txt) In terms of path logic, things get "interesting" roughly at ImportError.
But in typeshed, we'll still annotate all known exceptions in the builtins and leave it to type checkers to decide which ones they care about.
It's always possible to extend the annotation system in the future. But PEP 484 is already under a lot of time pressure and in order to release at least something in Python 3.5 beta 1 we need to limit the set of features. Also please read up on the meaning of provisional in PEP 411.
I think @matthiaskramm is fine with leaving this out of the PEP. We can come up with conventions on typeshed that allow (but do not require) stubs to include exception annotations (using raise FooError
, for example), but these can be ignored by tools that don't care about them, and these involve no new syntax.
So can we close this issue as won't fix?
Few comments for readers which come here at later times from search/links:
Having some notion of what builtin/stdlib types, in terms of exceptions, can or cannot do, helps with that. That's why we currently still allow them in the stubs of typeshed (https://github.com/JukkaL/typeshed/blob/master/python3/time.py)
The link above now refers to https://github.com/python/typeshed/blob/master/stdlib/2and3/time.pyi . You won't find any raise's there, or in its history browsable via github, because the file was renamed along the way. Doing so requires real git log --follow
, and leads to
https://github.com/python/typeshed/commit/337abed05a35f0c5b67c02d42bca0af1e2adc76e .
The commit by @matthiaskramm is done from @google.com email address, which should clarify what is meant by:
The way our type inferencer works ...
Indeed, he (at this time) is listed as no.2 contributor to https://github.com/google/pytype/graphs/contributors .
The nature of the changes in that commit is illustrated by:
-def gmtime(*args, **kwargs) -> tuple:
- raise OSError()
+def gmtime(secs: Union[float, None] = None) -> struct_time: ...
I.e., there were no argument annotations, but were exception annotations, and this commit added arg annotations at the expense of losing raise annotations. As of today's typeshed repo, there're no raise annotations in it.
Again, the discussion in this ticket is along the lines of "nobody against exception annotations, but neither they're a subject of currently intended typing usage". The "currently intended typing usage" is from 2015 however. It would be very said if all this work didn't lead from mere linters to real code generation optimizers. Actually, it quite naturally already does with efforts like https://github.com/mypyc/mypyc, though even that is so far keeps being disguised as being an ends to linting needs again ("Currently our focus is on making mypy faster through compilation.")
Overall, I hope people interested in efficient code generation (which requires typing annotations, including those beyond needed for mere linting tools) won't be kept pushed away from Python into other languages, like it was before.
Oh, and 2 cents from me on the actual syntax:
def f():
raise IOError()
raise EOFError()
Those parens look like noise to me (in type annotations, again), better to keep it simple:
def f():
raise IOError
raise EOFError
(That's of course valid Python code, the semantics being is that "raise" statement automatically instantiates an exception object, if it's passed an exception type.)
I also find interesting that the time.py annotations prior to the commit listed above had:
-def strftime(a: str, *args, **kwargs) -> unicode:
- raise MemoryError()
Annotating for MemoryError would seem to be useless - it's archetypal kind of unchecked exception: anything which can allocate, can throw MemoryError. And in CPython, (almost) anything can allocate indeed. But some functions/methods are still would, consider for example a function which just returns l[0]
of a non-empty list. And is some Python implementations, e.g. MicroPython, no-allocation property is important, and some functions are purposely designed to never allocate.
Marking functions which may allocate is however somehow backwards, because majority of functions do that. What's interesting is annotating those few which purposely were made no-alloc.
The same is true for exceptions too - annotating individual exceptions raised is tedious is only so much useful even for optimized (native) code generation. However, a few functions which purposely made no-throw are much more interesting to annotate, because they allow to remove some exception handling overheads when dealing with them (and removing these overheads might be the reason they were made no-throw in the first place).
Is there a corresponding discussion on the typeshed tracker? We would be interested in doing static analysis on exceptions. We can definitely declare them outside of stubs but stubs seem like the natural place to do this.
typeshed does not support features that are not officially supported by (at least candidate) PEPs, Python documentation, or some other form of official document. It's not the right place to discuss this.
Checking exceptions is very useful for public interfaces. Being unaware of what exceptions may be thrown, it's impossible to "ask for forgiveness" ... because you simply don't know what to catch. And many exceptions (ValueError) are so generic, that it's impossible to know whether the error you want to catch is being caught. Support for typing is a nice tool.
Food for thought (syntax-wise): now that PEP 593 is accepted it could be possible to use Annotated
with an alias to encode this.
Adding to what @earonesty said: Many functions in libraries (I am looking at you, requests
) can raise a whole lot of different exceptions and these are often not documented. I often find myself using the REPL to try to replicate all possible circumstances that I can come up with, just to figure out what exceptions the function can raise.
This is fine for little hacky scripts or when working on scientific code or something. But it sucks when writing production code that needs to be reliable.
Also, @gvanrossum, I can come up with a few possible scenarios that could use this information that are not mandatory type checking. For example, a type checker could implement a function where you supply the type checker with the code base and a reference to a function, and the type checker then tells you all possible exceptions that could be raised by that function. Here it does not matter if an exception in the output is useful (e.g. something like a TimeoutException) or not (like a MemoryError) because this information is only presented to the programmer, so they can filter out the nonsense from the real information.
Also, adding optional type checking to a type checker does not hurt any user who does not want it, because they can just not enable it.
Pylint provides an interesting solution for anyone that still requires the exception handling.
The missing-raises-doc is a pylint message for all no documented exception raised. It's a nice solution to catch exceptions using the linter.
@juansedo does it check for functions that are raised indirectly, by a function used inside the function?
When a function raises an exception explicitly that's already pretty self documenting.
Should methods in stubs be allowed to declare that they can throw exceptions?
When Jukka and I discussed this, we came up with the syntax ideas
and