Closed gvanrossum closed 9 years ago
Do you want responses here or in JukkaL/mypy#604 ?
My view is that a return type is required everywhere except for `__init__` (because Python will raise a TypeError if anything other than None is returned).
And the fact that the question of None vs Any is even raised indicates that it's not "intuitively" obvious to programmers what the right kind of default is.
(Please use backticks around dunder methods since otherwise GitHub interprets them as boldface.)
Since the mypy issue is now closed let's discuss this here. The TypeError from `__init__` is new in Python 3.5. There are actually plenty of other dunder methods whose return value isn't used, e.g. `__setitem__`, `__setattr__` etc., and I wouldn't be surprised if over time those would also start raising TypeError (especially if linters start warning about this :-).
However, I am actually quite undecided about what should be done here. @JukkaL has argued (in JukkaL/mypy#604) that None is a very common return value, esp. for argument-less methods. But the rule that a method is only type-checked if it has at least one annotation is pretty awkward when combined with a default rule for the return annotation, since it would seem that
def foo(self, arg: int):
    return arg+1
would be type-checked, while
def foo(self):
    return 42
would not be type-checked.
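The "at least one annotation" rule described above can be approximated at runtime by inspecting `__annotations__`. This is only a sketch of the idea: `is_checked` and the `Example` class are hypothetical names used for illustration, not part of mypy or PEP 484.

```python
class Example:
    def annotated(self, arg: int):
        return arg + 1

    def bare(self):
        return 42


def is_checked(func) -> bool:
    """Hypothetical helper: True if func carries at least one annotation."""
    return bool(getattr(func, "__annotations__", {}))


print(is_checked(Example.annotated))  # True: 'arg' is annotated
print(is_checked(Example.bare))       # False: no annotations at all
```

Under that rule the two methods are treated differently even though both have an obvious `int` return value, which is the awkwardness raised here.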
Perhaps it would not be so bad if we made a straightforward rule that you either have to have no annotations at all, or annotate all arguments (except for the first argument, if it's an instance or class method) and the return value. That's pretty much the only rule that doesn't have awkward edge cases, even if it requires you to annotate `__init__` with `-> None`. In the future we can try more lenient rules (and type checkers can of course implement whatever they want).
My Python 2.7.5 gave me this error message from `__init__` returning a value:
TypeError: __init__() should return None, not 'int'
As far as I know this doesn't have any 3.5 stuff backported to it.
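The runtime behavior mentioned here is easy to reproduce. A minimal sketch, assuming a Python 3 interpreter; the class name `Broken` is made up for the example, and the exact wording of the message may vary between versions:

```python
class Broken:
    def __init__(self):
        return 42  # anything other than None is rejected when instantiating


message = None
try:
    Broken()
except TypeError as exc:
    # The interpreter raises TypeError after __init__ returns a non-None value.
    message = str(exc)

print(message)
```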
Anyway, I might be blind, but I didn't see a section in PEP 484 that gave the rules for when type checking happens or not (in fact, it seems like a "non-goal" according to the PEP).
Although I don't really care because our type inferencer would take
def foo(self):
    return 42
and turn it into
def foo(self) -> int:
    return 42
It would also turn the other example into
def foo(self, arg: int) -> int:
    return arg+1
Ah, it's new in 3.5 but also in 2.7 somehow. I can't keep up. :-)
The only explicit non-goals currently mentioned in the PEP are runtime checks or changing Python to require type annotations.
The PEP is still a draft, and specifying what code should be considered by the type checker is definitely on my agenda.
While you're right that simple toy examples like you show can easily be solved by type inferencing, this is not generally true for real-world code. I'm sure you know that many developers tend to just fix what their toolchain tells them to fix. I would like to start out with a simple rule that gives developers confidence that they can start adding annotations to their code gradually, and they won't be sucked into having to correctly annotate everything, only to find they've gone down the rabbit hole and can't dig themselves out. (I've been there several times with C++ const-correctness.)
I'm working on a workflow design for a type inferencer. The general idea is that the programmer would start with no annotations and the inferencer would create annotations for everything (even the trivial ones you described) and add them to the source code. There's more to it than that, of course, because the programmer might need to adjust the annotations (e.g., change `Union[int,float,complex]` to `int`) and rerun the inferencer; but the intent is to not get sucked into a rabbit hole and to be able to turn off inferencing wherever not wanted (by `@no_type_check` or by explicitly putting `Any` into the annotation).
Sounds like we're in violent agreement. But what about `-> Any` vs. `-> None`?
I would avoid the question of `-> Any` vs `-> None` by requiring a return type everywhere except for a few built-in exceptions such as `__init__`. Especially for beginners, the default behavior of Python returning `None` when there's no `return` statement causes confusion; and adding a default behavior to type annotations potentially makes the mental model more complicated.
If most type annotations are generated by automation, the slight extra verbosity won't require the programmer doing any extra work. (Assuming the type inferencer does a good job.)
I want to require an explicit `-> None` even for `__init__` and similar, because otherwise you get the following situation:
class C:
    def __init__(self, a: int):
        # Does get checked
Now remove the arg:
class C:
    def __init__(self):
        # Does not get checked, because there are no annotations at all
If we had the `-> None` in the first example, dropping the arg would not have stopped type checking.
(@pludemann: I understand you don't care because you'd check them anyway -- but that's non-conformant behavior that you can document separately.)
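The effect of the explicit `-> None` can be seen by comparing the `__annotations__` of two otherwise identical constructors. This is only an illustration of the argument above; the class names are invented for the example.

```python
class WithReturn:
    def __init__(self) -> None:
        pass


class WithoutReturn:
    def __init__(self):
        pass


# The explicit return annotation keeps the function "annotated",
# so a rule based on "at least one annotation" still applies to it.
print(WithReturn.__init__.__annotations__)     # {'return': None}
print(WithoutReturn.__init__.__annotations__)  # {}
```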
Simple. Consistent. Good.
Jukka, Mark, and I believe that the default should be `-> Any`. It feels consistent with arguments without type annotations.
OK. I probably wouldn't have brought this up if there hadn't been a bug in mypy (now fixed) regarding the return type of `__init__`. Leaving this open to remind us to add language to the PEP spelling this out clearly.
@vlasovskikh How is this "consistent with arguments without type annotations"? If there's no type annotation at all, I can see the argument; but if there's some annotation, then I don't think it applies.
Python's default `return` is `None`, and that seems a more reasonable default for type annotation return type; but I prefer Guido's proposal of always requiring a type.
And if there's a partial annotation of args, the default arg should be `object` (which can do almost nothing) rather than `Any` (which can do anything). That is, these are equivalent:
def foo(x) -> None
def foo(x: object) -> None
but if there's no signature at all for `foo`, then it's the same as
def foo(x: Any) -> Any
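The practical difference between the two candidate defaults can be sketched as follows. The function names are hypothetical, and the comments describe how a static checker is expected to treat each case; at runtime the annotations have no effect.

```python
from typing import Any


def passthrough(x: object) -> object:
    # With object, only operations every object supports (repr, hash, ==, ...)
    # are accepted by a checker; simply passing the value through is fine.
    return x


def shout(x: Any) -> Any:
    # With Any, a checker accepts any attribute access, so a wrong argument
    # is only discovered at runtime.
    return x.upper()


print(passthrough(3))
print(shout("abc"))  # ABC

try:
    shout(3)  # accepted statically under Any, fails at runtime
except AttributeError as exc:
    print("runtime failure:", exc)
```

Had `x` in `shout` been annotated as `object`, a checker would be expected to flag `x.upper()` instead of letting the error reach runtime.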
@pludemann I mean consistent in a way that an argument without a type annotation has type `Any`.
I'm fine with requiring an explicit return type if there is at least one argument with an annotation, as the implicit `Any` can mask errors and is just a little confusing (explicit is better than implicit).
I don't like `object` as the default type for an argument, since it's almost always the wrong type. I'd prefer either the current `Any` or always requiring a type for all arguments if any has an annotation (`self` and `cls` would be exceptions, of course). I've noticed that it's easy to accidentally leave out annotations for some arguments.
@vlasovskikh The "compatibility" rules for args and return-value are different. This will no doubt confuse people, but I don't see an alternative beyond requiring invariance for args and return types, and we don't want that.
@JukkaL I want the type checker to find the most errors, so `Any` is a bad default. If the programmer wants `Any`, they can specify `Any`. If an argument is pass-through, `object` is fine because the function doesn't require any further attributes beyond the few that `object` provides (`repr`, `hash`, etc.). If the function calls something else that requires more attributes, then `object` is the wrong type and a different type (probably not `Any`) should be supplied by the programmer.
@pludemann The reason why I don't like `object` is that it would usually result in confusing error messages such as `'object' has no attribute foo` (if an annotation is omitted by accident) unless a type checker has some elaborate special casing for this case. Also, `object` seems pretty arbitrary as the default. Elsewhere we tend to use `Any` as the fallback type. Again, I'm fine with requiring an annotation for every argument (i.e. either everything or nothing must be annotated in a single function signature).
@JukkaL But `'object' has no attribute foo` is correct.
@matthiaskramm and I independently came up with `object` as the default; but that's from the point-of-view of a type inferencer.
If both `object` and `Any` are confusing as defaults, then we should probably disallow any defaults (including for the return type). Unless no type annotation is given at all for a function, in which case it's the same as `def foo(x: Any, y: Any, ...) -> Any`.
Sounds like there are slightly different objectives here. PEP 484 and mypy are specifically focused on Gradual Typing (a.k.a. the "keep Raymond Hettinger happy" program :-). However Peter/Google is interested in doing the best possible job of inferring all types (or maybe finding all type bugs, or something like that).
I feel that this explains why the different groups prefer different defaults. However in practice I doubt it will matter much -- we should recommend that users write explicit hints for all arguments and for the return type anyway (or omit them entirely). When the user writes a partially annotated function, Peter wants to make their life as painful as possible (make the defaults so that the developer is compelled to add the type hints by the error messages due to the default assumption), while Jukka wants mypy to shut up about areas of the code where the user has given no specific guidance.
I don't want to leave this entirely up to the tool, but I do think it would be reasonable if Google's type checker emitted warnings about functions with incomplete annotations while mypy would assume a default of `Any`.
In the end I think it's up to @markshannon to decide, but my money is on gradual typing, with defaults to `Any`.
Why must type hints be all or nothing, even at the function level?
For example, if part "number" is a string, rather than a number, I may wish to annotate that -- and the very fact that such annotations are rare will cause it to stand out for readers.
def __init__(self, partno:str, name, desc="")
vs
def __init__(self, partno:str, name:str, desc:str="") -> None
which will cause me to try gliding over and ignoring the signature when I'm skimming, because it just became too noisy. If a type checker really can't infer the return type from most of my functions, then I don't really have much reason to trust its judgment anyhow -- but I don't like being told that adding the occasional hint is actually wrong.
FWIW, I'm influenced by my memories of Common Lisp. Pretty much anything could be typed, and it might help the compiler. Pretty much nothing was actually typed in practice, unless there was a reason to do it -- so just knowing that someone had bothered to provide explicit type information was often more valuable than the type information itself.
Are you saying that if you see
class Part:
def __init__(self, partno:str, name, desc="")
then it should be interpreted as
class Part:
def __init__(self, partno:str, name:Any, desc:str="") -> NoneType
or
class Part:
def __init__(self, partno:str, name:object, desc:str="") -> NoneType
?
The question is more interesting if the method is something other than `__init__`, because `__init__` is required to return `None`. So, let's try:
def reset(self, partno:str, name, desc="")
def get_name(self, partno)
Would you want these interpreted as
def reset(self, partno:str, name:Any, desc:str="") -> Any
def get_name(self, partno:Any) -> Any
?
If not, how (other than using a type inferencer on the source code) would you get appropriate "defaults"?
Actually, for SHOULD, my interpretation of:
class Part:
def __init__(self, partno:str, name, desc="")
is only that (1) partno is a string. (2) The function definition is not invalid.
A type checker is free to complain that the definition is incomplete, but nothing stronger than a local style guide should suggest that the incompleteness is actually wrong.
A better type checker MAY make additional inferences (with varying degrees of confidence), such as that desc should be a string, or even that name should be a string based on how it is actually used in the function. But failing to do so is not a spec violation; it is simply an inferior quality of implementation.
I think we're scraping the bottom of the barrel here. Assuming there are any annotations at all the default for un-annotated arguments or return value should be Any. Type checkers may try to point out bugs but they will have to be very conservative to avoid false positives. IMO false positives are a bigger problem than false negatives -- if the type checker is wrong too often users get annoyed and simply turn it off. If it misses some bugs, well, type checking can't replace testing (see Gary Bernhardt's keynote at PyCon).
JimJJewett -- I think you missed my point.
The type checker or inferencer must interpret a missing annotation (it can also complain that an annotation is missing, of course). The possibilities seem to be:
- `Any` (most permissive)
- `object` (least permissive)
- `None` (for return type only) (even less permissive than `object`)
Or are you suggesting some fourth possibility?
There are good reasons for all of these, so I would prefer that the type annotations be explicit. If they're not explicit, I don't like the interpretation being "implementation defined" because interpreting an annotation would then require knowing what tool it was intended to be used with.
[Note: if an annotation is missing entirely, that's different. I'm only talking about partial annotations.]
If people don't want explicit types everywhere, then the consensus (with my dissent) seems to be that the default interpretation should be `Any`. Is that also what people want for the return type? (Perhaps the return type default should be `None` because it's so easy to return `None` by mistake in a program and returning `Any` doesn't seem terribly useful; and also because a bare `return` in Python means `return None`.)
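The Python behavior behind this argument, including the "return `None` by mistake" case, can be shown in a few lines. The function names are made up for the example.

```python
def no_return():
    pass


def bare_return():
    return


def forgot_return(x):
    x + 1  # the result is computed but never returned


print(no_return() is None)        # True
print(bare_return() is None)      # True
print(forgot_return(1) is None)   # True
```

With a default return type of `Any`, a checker would have no basis to complain about callers that use the result of `forgot_return`; with a default of `None`, such uses could be flagged.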
I agree with Guido's concerns about false positives. That's why we're trying some experiments to see what seems best. Unfortunately, it'll be a while (probably months) before our tools will be good enough for running some experiments.
The 4th possibility is "unspecified".
It differs from "Any" only in that a checker is allowed (and may be able) to infer something on its own.
To me, the key point is that if someone writes:
def reset(self, partno:str, name, desc="")
then neither:
def reset(self, partno:str, name:Any, desc:str="") -> Any
nor:
def reset(self, partno:str, name:Any, desc:Any="") -> Any
is any real improvement, and they are worse for a human reader. So I don't want a language standard saying that the first line is invalid and pushing people towards one of the noisier spellings. I would feel less strongly if that all-or-nothing requirement was at least explicitly limited to stub/interface files.
Where does the PEP say that the first line is invalid?
> Where does the PEP say that the first line is invalid?
Currently lines 105-109:
"""
A checked function should have annotations for all its arguments and
its return type, with the exception that the ``self`` argument of a
method should not be annotated; it is assumed to have the type of the
containing class. Notably, the return type of ``__init__`` should be
annotated with ``-> None``.
"""
Thanks, I've updated that to merely recommend this.
Also, I wish to close this issue but I currently can't find where the PEP explicitly states that the default annotation is Any.
So, if the return type isn't specified, it means `Any`? I slightly prefer the return type defaulting to `None` (just as `return` without an expression is equivalent to `return None`).
The paragraph still doesn't say that unannotated arguments are treated as `Any`. If we don't require annotation of all arguments and the return type, then we should say what the defaults are because the checker or inferencer needs to interpret them somehow (and I don't want to introduce an `Unspecified` type ... `Any` should be enough). A type checker or inferencer is, of course, free to try to figure out a more specific type than `Any`.
Correct, an omitted return type means we're not type checking that, and we're not claiming anything about the return type. I'm updating the PEP (in this repo) to say
It is recommended but not required that checked functions have
annotations for all arguments and the return type. For a checked
function, the default annotation for arguments and for the return type
is ``Any``. An exception is that the first argument of instance and
class methods should not be annotated; it is assumed to have the type
of the containing class for instance methods, and ``type`` for class
methods. Note that the return type of ``__init__`` ought to be
annotated with ``-> None`` (there is no exception for ``__init__``).
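The default described in this wording can be sketched with `inspect.signature`. The helper `effective_annotations` and its `is_method` flag are hypothetical, introduced only to illustrate the rule; this is not how mypy or any other checker is implemented, and it does not test whether the function counts as "checked".

```python
import inspect
from typing import Any


def effective_annotations(func, is_method=False):
    """Hypothetical helper: fill in Any wherever an annotation is missing."""
    sig = inspect.signature(func)
    params = list(sig.parameters.values())
    if is_method and params:
        params = params[1:]  # the first argument (self/cls) is not annotated
    result = {}
    for p in params:
        missing = p.annotation is inspect.Parameter.empty
        result[p.name] = Any if missing else p.annotation
    no_return = sig.return_annotation is inspect.Signature.empty
    result["return"] = Any if no_return else sig.return_annotation
    return result


class Part:
    def reset(self, partno: str, name, desc=""):
        pass


print(effective_annotations(Part.reset, is_method=True))
```

For `Part.reset`, `partno` keeps `str` while `name`, `desc`, and the return type all come out as `Any`, matching the partially annotated example discussed earlier in the thread.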
See https://github.com/JukkaL/mypy/issues/604