python / typing

Python static typing home. Hosts the documentation and a user help forum.
https://typing.readthedocs.io/

Decide whether omitted return annotation means ->Any or ->None #65

Closed gvanrossum closed 9 years ago

gvanrossum commented 9 years ago

See https://github.com/JukkaL/mypy/issues/604

pludemann commented 9 years ago

Do you want responses here or in JukkaL/mypy#604 ?

My view is that a return type is required everywhere except for __init__ (because Python will throw a TypeError if anything other than None is returned). And the fact that the question of None vs Any is even raised indicates that it's not "intuitively" obvious to programmers what the right kind of default is.

gvanrossum commented 9 years ago

(Please use backticks around dunder methods since otherwise GitHub interprets them as boldface.)

Since the mypy issue is now closed let's discuss this here. The TypeError from __init__ is new in Python 3.5. There are actually plenty of other dunder methods whose return value isn't used, e.g. __setitem__, __setattr__ etc., and I wouldn't be surprised if over time those would also start raising TypeError (especially if linters start warning about this :-).

However, I am actually quite undecided about what should be done here. @JukkaL has argued (in JukkaL/mypy#604) that None is a very common return value, esp. for argument-less methods. But the rule that a method is only type-checked if it has at least one annotation is pretty awkward when combined with a default rule for the return annotation, since it would seem that

def foo(self, arg: int):
    return arg+1

would be type-checked, while

def foo(self):
    return 42

would not be type-checked.

Perhaps it would not be so bad if we made a straightforward rule that you either have to have no annotations at all, or annotate all arguments (except for the first argument, if it's an instance or class method) and the return value. That's pretty much the only rule that doesn't have awkward edge cases, even if it requires you to annotate __init__ with -> None. In the future we can try more lenient rules (and type checkers can of course implement whatever they want).
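As a rough illustration of the "annotate everything or nothing" rule floated above, the sketch below classifies a function by how completely its signature is annotated. The helper name `annotation_status` and its three labels are hypothetical and belong to neither PEP 484 nor mypy. It also assumes that an exempt first parameter is literally named `self` or `cls`.

```python
import inspect


def annotation_status(func) -> str:
    """Classify a signature as 'none', 'partial', or 'complete' (hypothetical helper)."""
    sig = inspect.signature(func)
    params = list(sig.parameters.values())
    # Assumption: a leading `self`/`cls` parameter is exempt from the rule.
    if params and params[0].name in ("self", "cls"):
        params = params[1:]
    # One flag per remaining parameter, plus one for the return annotation.
    flags = [p.annotation is not inspect.Parameter.empty for p in params]
    flags.append(sig.return_annotation is not inspect.Signature.empty)
    if not any(flags):
        return "none"
    return "complete" if all(flags) else "partial"


def foo_partial(self, arg: int):
    return arg + 1


def foo_bare(self):
    return 42


def foo_full(self, arg: int) -> int:
    return arg + 1


for f in (foo_partial, foo_bare, foo_full):
    print(f.__name__, annotation_status(f))
```

Under the proposed rule only the "none" and "complete" cases would be acceptable; a tool could report the "partial" case however it sees fit.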

pludemann commented 9 years ago

My Python 2.7.5 gave me this error message from __init__ returning a value:

    TypeError: __init__() should return None, not 'int'

As far as I know this doesn't have any 3.5 stuff backported to it.
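For readers who want to reproduce the behaviour discussed here, this is a minimal sketch with made-up class names (`BadInit`, `Box`). It shows that a non-None return from `__init__` is rejected when the class is instantiated, while a return value from `__setitem__` is silently discarded.

```python
class BadInit:
    def __init__(self):
        return 42  # anything other than None is rejected at instantiation time


class Box:
    def __init__(self):
        self.data = {}

    def __setitem__(self, key, value):
        self.data[key] = value
        return "ignored"  # the interpreter discards this value


try:
    BadInit()
    init_error = None
except TypeError as exc:
    init_error = exc

box = Box()
box["k"] = 1  # no error, even though __setitem__ returned a string

print(type(init_error).__name__, init_error)
print(box.data)
```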

Anyway, I might be blind, but I didn't see a section in PEP484 that gave the rules for when type checking happens or not (in fact, it seems like a "non-goal" according to the PEP).

Although I don't really care because our type inferencer would take

    def foo(self):
        return 42

and turn it into

    def foo(self) -> int:
        return 42

It would also turn the other example into

    def foo(self, arg: int) -> int:
        return arg+1
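To make the idea of inferring a return annotation concrete, here is a deliberately tiny sketch. It is not the inferencer described in this comment: the function `infer_literal_return` is hypothetical, it only understands literal return values, and it falls back to `Any` for everything else (including `arg+1`, which a real inferencer could resolve from the argument type).

```python
import ast


def infer_literal_return(source: str) -> str:
    """Guess a return annotation from literal `return` values only (toy sketch)."""
    func = ast.parse(source).body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("expected a single function definition")
    names = set()
    # Note: ast.walk also visits nested functions; acceptable for a toy.
    for node in ast.walk(func):
        if not isinstance(node, ast.Return):
            continue
        if node.value is None:
            names.add("None")  # bare `return`
        elif isinstance(node.value, ast.Constant):
            value = node.value.value
            names.add("None" if value is None else type(value).__name__)
        else:
            return "Any"  # non-literal expression: give up
    if not names:
        return "None"  # no `return` statement found (ignores raise/yield cases)
    return names.pop() if len(names) == 1 else "Any"


print(infer_literal_return("def foo(self):\n    return 42\n"))
print(infer_literal_return("def foo(self, arg):\n    return arg+1\n"))
```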

gvanrossum commented 9 years ago

Ah, it's new in 3.5 but also in 2.7 somehow. I can't keep up. :-)

The only explicit non-goals currently mentioned in the PEP are runtime checks or changing Python to require type annotations.

The PEP is still a draft, and specifying what code should be considered by the type checker is definitely on my agenda.

While you're right that simple toy examples like you show can easily be solved by type inferencing, this is not generally true for real-world code. I'm sure you know that many developers tend to just fix what their toolchain tells them to fix. I would like to start out with a simple rule that gives developers confidence that they can start adding annotations to their code gradually, and they won't be sucked into having to correctly annotate everything, only to find they've gone down the rabbit hole and can't dig themselves out. (I've been there several times with C++ const-correctness.)

pludemann commented 9 years ago

I'm working on a workflow design for a type inferencer. The general idea is that the programmer would start with no annotations and the inferencer would create annotations for everything (even the trivial ones you described) and add them to the source code. There's more to it than that, of course, because the programmer might need to adjust the annotations (e.g., change Union[int,float,complex] to int) and rerun the inferencer; but the intent is to not get sucked into a rabbit hole and to be able to turn off inferencing wherever not wanted (by @no_type_check or by explicitly putting Any into the annotation).
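The two opt-out mechanisms mentioned here both exist in the standard `typing` module. The sketch below uses illustrative function names (`double`, `legacy`, `passthrough`) and shows only their runtime-visible effect; what a particular checker or inferencer does with them is up to that tool.

```python
from typing import Any, Union, get_type_hints, no_type_check

# A broad annotation of the kind an inferencer might generate.
Number = Union[int, float, complex]


def double(x: Number) -> Number:
    return x * 2


@no_type_check
def legacy(x: int) -> int:
    # The decorator marks this function so its annotations are not treated as type hints.
    return x


def passthrough(x: Any) -> Any:
    # Explicit Any: the author states that no checking is wanted here.
    return x


print(get_type_hints(legacy))       # hints are suppressed for the decorated function
print(get_type_hints(passthrough))
```

A programmer who knows that `double` is only ever called with integers could then narrow `Number` to `int` by hand.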

gvanrossum commented 9 years ago

Sounds like we're in violent agreement. But what about -> Any vs. -> None?

pludemann commented 9 years ago

I would avoid the question of ->Any vs ->None by requiring a return type everywhere except for a few built-in exceptions such as __init__. Especially for beginners, the default behavior of Python returning None when there's no return statement causes confusion; and adding a default behavior to type annotations potentially makes the mental model more complicated.

If most type annotations are generated by automation, the slight extra verbosity won't require the programmer to do any extra work. (Assuming the type inferencer does a good job.)

gvanrossum commented 9 years ago

I want to require an explicit -> None even for __init__ and similar, because otherwise you get the following situation:

class C:
    def __init__(self, a: int):
        # Does get checked
        ...

Now remove the arg:

class C:
    def __init__(self):
        # Does not get checked, because there are no annotations at all
        ...

If we had the -> None in the first example, dropping the arg would not have stopped type checking.
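The effect can be seen at runtime by inspecting `__annotations__`. In this sketch the class names and the helper `has_annotations` are hypothetical, and "has at least one annotation" stands in for the rule a checker would apply when deciding whether to check a function.

```python
class WithArg:
    def __init__(self, a: int):
        self.a = a


class NoArg:
    def __init__(self):
        self.a = 0


class NoArgExplicit:
    def __init__(self) -> None:
        self.a = 0


def has_annotations(func) -> bool:
    """True if the function carries at least one annotation (hypothetical helper)."""
    return bool(func.__annotations__)


for cls in (WithArg, NoArg, NoArgExplicit):
    print(cls.__name__, has_annotations(cls.__init__))
```

Only `NoArg.__init__` ends up with no annotations at all, which is the case that would silently drop out of checking.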

(@pludemann: I understand you don't care because you'd check them anyway -- but that's non-conformant behavior that you can document separately.)

pludemann commented 9 years ago

Simple. Consistent. Good.

vlasovskikh commented 9 years ago

Jukka, Mark, and I believe that the default should be -> Any. It feels consistent with arguments without type annotations.

gvanrossum commented 9 years ago

OK. I probably wouldn't have brought this up if there hadn't been a bug in mypy (now fixed) regarding the return type of __init__. Leaving this open to remind us to add language to the PEP spelling this out clearly.

pludemann commented 9 years ago

@vlasovskikh How is this "consistent with arguments without type annotations"? If there's no type annotation at all, I can see the argument; but if there's some annotation, then I don't think it applies.

Python's default return is None, and that seems a more reasonable default for type annotation return type; but I prefer Guido's proposal of always requiring a type.

And if there's a partial annotation of args, the default arg should be object (which can do almost nothing) rather than Any (which can do anything). That is, these are equivalent:

    def foo(x) -> None
    def foo(x: object) -> None

but if there's no signature at all for foo, then it's the same as

    def foo(x: Any) -> Any

vlasovskikh commented 9 years ago

@pludemann I mean consistent in a way that an argument without a type annotation has type Any.

JukkaL commented 9 years ago

I'm fine with requiring an explicit return type if there is at least one argument with an annotation, as the implicit Any can mask errors and is just a little confusing (explicit is better than implicit).

I don't like object as the default type for an argument, since it's almost always the wrong type. I'd prefer either the current Any or always requiring a type for all arguments if any has an annotation (self and cls would be exceptions, of course). I've noticed that it's easy to accidentally leave out annotations for some arguments.

pludemann commented 9 years ago

@vlasovskikh The "compatibility" rules for args and return-value are different. This will no doubt confuse people, but I don't see an alternative beyond requiring invariance for args and return types, and we don't want that.

@JukkaL I want the type checker to find the most errors, so Any is a bad default. If the programmer wants Any, they can specify Any. If an argument is pass-through, object is fine because the function doesn't require any further attributes beyond the few that object provides (repr, hash, etc.). If the function calls something else that requires more attributes, then object is the wrong type and a different type (probably not Any) should be chosen by the programmer.

JukkaL commented 9 years ago

@pludemann The reason why I don't like object is that it would usually result in confusing error messages such as 'object' has no attribute foo (if an annotation is omitted by accident) unless a type checker has some elaborate special casing for this case. Also, object seems pretty arbitrary as the default. Elsewhere we tend to use Any as the fallback type. Again, I'm fine with requiring an annotation for every argument (i.e. either everything or nothing must be annotated in a single function signature).
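The practical difference between the two candidate defaults can be sketched as follows. The names (`Widget`, `passthrough`, `read_foo_object`, `read_foo_any`) are illustrative. The comments describe what a static checker is expected to report, not output captured from a specific tool; at runtime all three functions behave the same.

```python
from typing import Any


class Widget:
    foo = "bar"


def passthrough(x: object) -> object:
    # Only forwards the value, so `object` is precise enough.
    return x


def read_foo_object(x: object) -> str:
    # A static checker is expected to reject this attribute access,
    # because `object` declares no attribute `foo`.
    return x.foo


def read_foo_any(x: Any) -> str:
    # `Any` is compatible with every operation, so nothing is reported here.
    return x.foo


w = Widget()
print(passthrough(w) is w, read_foo_object(w), read_foo_any(w))
```

With `object` as the implicit default, an accidentally omitted annotation would surface as the attribute error in `read_foo_object`; with `Any` it would pass silently, as in `read_foo_any`.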

pludemann commented 9 years ago

@JukkaL But 'object' has no attribute foo is correct.

@matthiaskramm and I independently came up with object as the default; but that's from the point-of-view of a type inferencer.

If both object and Any are confusing as defaults, then we should probably disallow any defaults (including for the return type). Unless no type annotation is given at all for a function, in which case it's the same as def foo(x: Any, y: Any, ...) -> Any.

gvanrossum commented 9 years ago

Sounds like there are slightly different objectives here. PEP 484 and mypy are specifically focused on Gradual Typing (a.k.a. the "keep Raymond Hettinger happy" program :-). However Peter/Google is interested in doing the best possible job of inferring all types (or maybe finding all type bugs, or something like that).

I feel that this explains why the different groups prefer different defaults. However in practice I doubt it will matter much -- we should recommend that users write explicit hints for all arguments and for the return type anyway (or omit them entirely). When the user writes a partially annotated function, Peter wants to make their life as painful as possible (make the defaults so that the developer is compelled to add the type hints by the error messages due to the default assumption), while Jukka wants mypy to shut up about areas of the code where the user has given no specific guidance.

I don't want to leave this entirely up to the tool, but I do think it would be reasonable if Google's type checker emitted warnings about functions with incomplete annotations while mypy would assume a default of Any.

In the end I think it's up to @markshannon to decide, but my money is on gradual typing, with defaults to Any.

JimJJewett commented 9 years ago

Why must type hints be all or nothing, even at the function level?

For example, if part "number" is a string, rather than a number, I may wish to annotate that -- and the very fact that such annotations are rare will cause it to stand out for readers.

def __init__(self, partno:str, name, desc="")

vs

def __init__(self, partno:str, name:str, desc:str="") -> None

which will cause me to try gliding over and ignoring the signature when I'm skimming, because it just became too noisy. If a type checker really can't infer the return type from most of my functions, then I don't really have much reason to trust its judgment anyhow -- but I don't like being told that adding the occasional hint is actually wrong.

FWIW, I'm influenced by my memories of Common Lisp. Pretty much anything could be typed, and it might help the compiler. Pretty much nothing was actually typed in practice, unless there was a reason to do it -- so just knowing that someone had bothered to provide explicit type information was often more valuable than the type information itself.

pludemann commented 9 years ago

Are you saying that if you see

  class Part:
    def __init__(self, partno:str, name, desc="")

then it should be interpreted as

  class Part:
    def __init__(self, partno:str, name:Any, desc:str="") -> NoneType

or

  class Part:
    def __init__(self, partno:str, name:object, desc:str="") -> NoneType

?

The question is more interesting if the method is something other than __init__, because __init__ is required to return None. So, let's try:

    def reset(self, partno:str, name, desc="")
    def get_name(self, partno)

Would you want these interpreted as

    def reset(self, partno:str, name:Any, desc:str="") -> Any
    def get_name(self, partno:Any) -> Any

?

If not, how (other than using a type inferencer on the source code) would you get appropriate "defaults"?

JimJJewett commented 9 years ago

Actually, for SHOULD, my interpretation of:

class Part:
    def __init__(self, partno:str, name, desc="")

is only that (1) partno is a string. (2) The function definition is not invalid.

A type checker is free to complain that the definition is incomplete, but nothing stronger than a local style guide should suggest that the incompleteness is actually wrong.

A better type checker MAY make additional inferences (with varying degrees of confidence), such as that desc should be a string, or even that name should be a string based on how it is actually used in the function. But failing to do so is not a spec violation; it is simply an inferior quality of implementation.

gvanrossum commented 9 years ago

I think we're scraping the bottom of the barrel here. Assuming there are any annotations at all the default for un-annotated arguments or return value should be Any. Type checkers may try to point out bugs but they will have to be very conservative to avoid false positives. IMO false positives are a bigger problem than false negatives -- if the type checker is wrong too often users get annoyed and simply turn it off. If it misses some bugs, well, type checking can't replace testing (see Gary Bernhardt's keynote at PyCon).

pludemann commented 9 years ago

JimJJewett -- I think you missed my point.

The type checker or inferencer must interpret a missing annotation (it can also complain that an annotation is missing, of course). The possibilities seem to be:

  1. Any (most permissive)
  2. object (least permissive)
  3. None (for return type only) (even less permissive than object)

Or are you suggesting some fourth possibility?

There are good reasons for all of these, so I would prefer that the type annotations be explicit. If they're not explicit, I don't like the interpretation being "implementation defined" because interpreting an annotation would then require knowing what tool it was intended to be used with.

[Note: if an annotation is missing entirely, that's different. I'm only talking about partial annotations.]

If people don't want explicit types everywhere, then the consensus (with my dissent) seems to be that the default interpretation should be Any. Is that also what people want for the return type? (Perhaps the return type default should be None, because it's so easy to return None by mistake in a program and returning Any doesn't seem terribly useful; and also because a bare return in Python means return None.)
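The runtime behaviour behind this argument is easy to demonstrate. In this small sketch (function names are made up), a missing `return`, a bare `return`, and a forgotten `return` all produce None.

```python
def no_return():
    pass


def bare_return(flag):
    if flag:
        return  # bare return: same as `return None`
    return None


def forgot_return(x):
    x + 1  # result is computed but never returned


print(no_return(), bare_return(True), bare_return(False), forgot_return(1))
```

A declared (or defaulted) return type of None would leave `forgot_return` unflagged at the definition, whereas an explicit `-> int` would let a checker catch the missing `return`.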

I agree with Guido's concerns about false positives. That's why we're trying some experiments to see what seems best. Unfortunately, it'll be a while (probably months) before our tools will be good enough for running some experiments.

JimJJewett commented 9 years ago

The 4th possibility is "unspecified".

It differs from "Any" only in that a checker is allowed (and may be able) to infer something on its own.

To me, the key point is that if someone writes:

def reset(self, partno:str, name, desc="")

then neither:

def reset(self, partno:str, name:Any, desc:str="") -> Any

nor:

def reset(self, partno:str, name:Any, desc:Any="") -> Any

is any real improvement, and they are worse for a human reader. So I don't want a language standard saying that the first line is invalid and pushing people towards one of the noisier spellings. I would feel less strongly if that all-or-nothing requirement was at least explicitly limited to stub/interface files.

gvanrossum commented 9 years ago

Where does the PEP say that the first line is invalid?

JimJJewett commented 9 years ago

"Where does the PEP say that the first line is invalid?"

Currently lines 105-109: """ A checked function should have annotations for all its arguments and its return type, with the exception that the self argument of a method should not be annotated; it is assumed to have the type of the containing class. Notably, the return type of __init__ should be annotated with -> None. """

gvanrossum commented 9 years ago

Thanks, I've updated that to merely recommend this.

gvanrossum commented 9 years ago

Also, I wish to close this issue but I currently can't find where the PEP explicitly states that the default annotation is Any.

pludemann commented 9 years ago

So, if the return type isn't specified, it means Any? I slightly prefer the return type defaulting to None (just as return without an expression is equivalent to return None).

The paragraph still doesn't say that unannotated arguments are treated as Any. If we don't require annotation of all arguments and the return type, then we should say what the defaults are because the checker or inferencer needs to interpret them somehow (and I don't want to introduce an Unspecified type ... Any should be enough). A type checker or inferencer is, of course, free to try to figure out a more specific type than Any.

gvanrossum commented 9 years ago

Correct, an omitted return type means we're not type checking that, and we're not claiming anything about the return type. I'm updating the PEP (in this repo) to say

It is recommended but not required that checked function have
annotations for all its arguments and its return type.  For a checked
function, the default annotation for arguments and for the return type
is ``Any``.  An exception is that the first argument of instance and
class methods should not be annotated; it is assumed to have the type
of the containing class for instance method, and ``type`` for class
methods.  Note that the return type of ``__init__`` ought to be
annotated with ``-> None`` (there is no exception for ``__init__``).
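One way to picture the default rule in this wording is a helper that fills in `Any` wherever a signature has no annotation. The function `effective_hints` below is a hypothetical sketch, not something a checker exposes. It works on plain functions only and does not model the special treatment of the first argument of methods.

```python
import inspect
from typing import Any


def effective_hints(func) -> dict:
    """Return the annotations of `func`, with Any filled in for omitted ones (sketch)."""
    sig = inspect.signature(func)
    hints = {
        name: Any if p.annotation is inspect.Parameter.empty else p.annotation
        for name, p in sig.parameters.items()
    }
    if sig.return_annotation is inspect.Signature.empty:
        hints["return"] = Any
    else:
        hints["return"] = sig.return_annotation
    return hints


def reset(partno: str, name, desc=""):
    return None


def init_like(a: int) -> None:
    pass


print(effective_hints(reset))
print(effective_hints(init_like))
```

Here `reset` is treated as if `name`, `desc`, and the return type were all annotated with `Any`, while the explicit `-> None` on `init_like` is preserved.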