python / mypy

Optional static typing for Python
https://www.mypy-lang.org/

Using sphinx documentation annotations for type checking in py2 mode #1015

Closed raph-amiard closed 8 years ago

raph-amiard commented 9 years ago

I gather that mypy has a Python 2 mode, triggered via --py2, and that types are annotated in this mode via comments.

I've been using PyCharm on one side, and Jedi for completion in vim on the other, for quite a while. Both of those tools can use Sphinx doc annotations to infer types in Python 2 mode.

I was wondering whether that had been considered for mypy? It would be tremendously useful to us for the programs that we can't migrate to Python 3.

Note that if this is considered a good idea but has not been done because nobody had the time or desire to do it, I'm completely ready to get my hands dirty!

Regards,

gvanrossum commented 9 years ago

I'm reluctant to add this -- it was proposed and considered in the discussion leading to PEP 484 but it has too many drawbacks (including, it's very verbose).

If you really want this it would be nice to start a discussion in the PEP 484 tracker (https://github.com/ambv/typehinting) or on python-ideas about whether we should standardize one or multiple syntaxes for type hints in Python 2 code, and if so, which.

Another problem with using sphinx markup in docstrings is that they are a bit hard to translate back to the PEP 484 style annotations (once you're leaving behind Python 2), while the current comment-based syntax maps pretty directly to PEP 484 (and I intend to write a converter).
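To illustrate that mapping, here is a small sketch added for clarity. The function names and bodies are made up and do not come from mypy or PEP 484. The first function uses the PEP 484 comment syntax, which is valid in both Python 2 and Python 3. The second uses Python 3 inline annotations. Converting one into the other mostly means moving each type from the comment to the matching parameter.

```python
from typing import List


def repeat_comment_style(word, times=2):
    # type: (str, int) -> List[str]
    """Return `word` repeated `times` times (comment-style hints)."""
    return [word] * times


def repeat_annotated(word: str, times: int = 2) -> List[str]:
    """Same function, written with Python 3 inline annotations."""
    return [word] * times
```

At runtime the comment-style version carries no annotations at all; only a tool that reads the source, such as a type checker, sees the `# type:` line.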

If you want to get your hands dirty in mypy it might be better to focus on tools to make working with the currently supported syntax better -- e.g. we need a tool to add a first draft of type annotations to all functions, to bootstrap type-checking code that isn't annotated. (I have several millions of lines of such code that I want to analyze, and it's not using the Sphinx convention, and the hack I currently have isn't good enough. Maybe the mypy parser can be reused for this purpose? Otherwise the lib2to3 framework should be able to handle this easily.)

We also need better stubs for the stdlib. Many of the current stub files only define 1-2 functions out of the dozens actually defined by the stdlib module. And I would really like a stub generator that parses the modules instead of importing them. (Again, maybe mypy's parser or lib2to3 can be used here.)
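A rough sketch of the "parse, don't import" idea follows. It is an illustration only: it uses the standard library `ast` module rather than mypy's parser or lib2to3, so it only handles source that the running interpreter can parse. `draft_stub`, `_stub_for_function` and `SOURCE` are hypothetical names, and `*args`, `**kwargs` and keyword-only parameters are ignored to keep it short.

```python
import ast

# Example module text; it is parsed, never imported or executed.
SOURCE = '''
def greet(name, excited=False):
    return "hi " + name

class Box:
    def get(self):
        return 1
'''


def _stub_for_function(node, indent):
    """Render one function as a stub line, marking defaults with `...`."""
    args = node.args.args
    # Defaults always belong to the last len(defaults) positional parameters.
    first_default = len(args) - len(node.args.defaults)
    params = []
    for i, arg in enumerate(args):
        params.append(arg.arg + ("=..." if i >= first_default else ""))
    return "%sdef %s(%s): ..." % (indent, node.name, ", ".join(params))


def draft_stub(source):
    """Return draft stub text for top-level functions and classes."""
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            lines.append(_stub_for_function(node, indent=""))
        elif isinstance(node, ast.ClassDef):
            lines.append("class %s:" % node.name)
            methods = [n for n in node.body if isinstance(n, ast.FunctionDef)]
            for method in methods:
                lines.append(_stub_for_function(method, indent="    "))
            if not methods:
                lines.append("    ...")
    return "\n".join(lines)


print(draft_stub(SOURCE))
```

The output is only a first draft with no types filled in; the point is that the module's side effects never run, because nothing is imported.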

raph-amiard commented 9 years ago

> I'm reluctant to add this -- it was proposed and considered in the discussion leading to PEP 484 but it has too many drawbacks (including, it's very verbose).

I agree that it is very verbose, and honestly, starting from a clean codebase, I probably would not choose this way of adding type annotations.

The point in my opinion is more that this has been the standard (by status quo) way of annotating types for a long time in a lot of tools, and that supporting it would allow a lot of people that standardized this practice internally (which is the case on a lot of projects at my company) to transition more smoothly to gradual static typing.

Another thing to keep in mind in my opinion, is that while it is verbose, it is also documentation. For a lot of code (libraries come to mind), this will be done anyway, so reusing documentation, in the context of python 2, makes more sense than maintaining two sets of type hints information. In the context of Python 3 of course this is irrelevant because doc generators can extract type information from the AST directly.

> If you really want this it would be nice to start a discussion in the PEP 484 tracker (https://github.com/ambv/typehinting) or on python-ideas about whether we should standardize one or multiple syntaxes for type hints in Python 2 code, and if so, which.

Ok, will do !

> Another problem with using sphinx markup in docstrings is that they are a bit hard to translate back to the PEP 484 style annotations (once you're leaving behind Python 2), while the current comment-based syntax maps pretty directly to PEP 484 (and I intend to write a converter).

I looked at mypy's code, and I see that the current annotation syntax indeed lets Python 2 support be a minimal modification to the parser, and that supporting several different syntaxes would complicate the parser code a bit.

I don't know if it would make sense to:

  1. Give some standard hooks that will try to extract type hints from comments/docstrings, deferring the parsing work to some kind of plug-in
  2. Allow people to maintain out-of-tree plug-ins that return the type annotations in a standard format.

That way we can avoid standardizing several syntaxes (which seems a bit heavy) but still have the benefits of allowing reuse of existing annotations for people who need them. Tell me what you think !
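To make the proposal concrete, here is a minimal sketch of what such a hook mechanism could look like. Everything in it is hypothetical: `DOCSTRING_HOOKS`, `register_hook`, `extract_hints` and the returned dictionary layout are invented for illustration and are not part of mypy. The one registered hook reads epydoc-style `@type` and `@rtype` fields.

```python
import re

# Hypothetical registry. Each hook takes a docstring and returns either
# None (format not recognised) or a dict in one agreed format, e.g.
# {"args": {"name": "str"}, "return": "int"}.
DOCSTRING_HOOKS = []


def register_hook(func):
    """Add an out-of-tree docstring parser to the registry."""
    DOCSTRING_HOOKS.append(func)
    return func


@register_hook
def epydoc_hook(docstring):
    """Read epydoc fields such as `@type a: int` and `@rtype: bool`."""
    found = re.findall(r"@type[ \t]+(\w+):[ \t]*(.+)", docstring)
    args = {name: type_text.strip() for name, type_text in found}
    rtype = re.search(r"@rtype:[ \t]*(.+)", docstring)
    if not args and rtype is None:
        return None  # nothing this hook understands
    return {"args": args, "return": rtype.group(1).strip() if rtype else None}


def extract_hints(docstring):
    """Try each registered hook in turn; the first non-None result wins."""
    for hook in DOCSTRING_HOOKS:
        result = hook(docstring)
        if result is not None:
            return result
    return None
```

With this shape, the core tool only needs to understand the returned dictionary, and each docstring dialect lives in its own plug-in.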

> If you want to get your hands dirty in mypy ...

Well, I'm completely interested in getting my hands dirty, but I will readily admit that my motives are selfish: if I can make use of mypy in some of my projects (I'm trying the comment syntax on a new project already), I will have much more incentive to start contributing.

At my company we have standardized the use of sphinx annotations, and a lot of developers already use Jedi/Pycharm. Moving to a new syntax won't be done easily, and that's why I opened this issue.

If I can make use of mypy (which might happen with or without support for docstrings type annotations), I will definitely try and contribute !

Anyway thank you very much for your timely answer !

Regards,

raph-amiard commented 9 years ago

Hey again !

I just saw that mypy already has docstring parsing for type hints, using docstring.py, and a custom format that I don't recognize. It means that, in effect, much of the infrastructure for this already exists, and the format is not much less verbose than epydoc or sphinx. In that case, why not try to parse the formats used by the common doc generators?

gvanrossum commented 9 years ago

It's really all a matter of priorities. We don't need this at Dropbox, and we do need a lot of other things, so adding sphinx docstring support is a distraction. Jukka can decide what he wants to do and you can submit a PR, but there's no guarantee that we'll be able to review and merge it in a timely manner.

raph-amiard commented 9 years ago

Ok, fair enough! I'll write a patch and submit a PR, because it's fun, and I'll try to make it so awesome and readable that it's a pleasure to review and has a better chance of being merged.

Thanks for your time, much appreciated :)

tony commented 8 years ago

Even PyCharm works with it (https://www.jetbrains.com/pycharm/help/using-docstrings-to-specify-types.html)

PEP 257 (2001), which has been around for almost 15 years now, has advised against a similar solution for type hints in 2.7:

"""
The one-line docstring should NOT be a "signature" reiterating the 
function/method parameters (which can be obtained by introspection).
Don't do:
"""

def function(a, b):
    """function(a, b) -> list"""

For type hints in Python 2.7, PEP 484 (2014) says:

def embezzle(self, account, funds=1000000, *fake_receipts):
    # type: (str, int, *str) -> None
    """Embezzle funds from account using fake receipts."""
    <code goes here>

In retrospect, I can't help but think that PEP 257's advice led the community to the more verbose style we use in autodoc, numpy and google style.

I think the annotation puts a lot of open source projects that bought into autodoc-style docstrings in a funky situation. It's only for Python 2.7 projects, yet it's incompatible with advice many projects have accommodated themselves to.

There is another thing to consider. If we were to have mypy supporting autodoc and numpy style type annotations - would that inhibit python 3 migrations?

@raph-amiard Any update on your PR? I'm happy to team up on it.

gvanrossum commented 8 years ago

@tony -- I can't tell whether you're for or against adding support for the docstring style to mypy. FWIW personally I am against it (more so now than when I wrote the earlier response on this issue).

tony commented 8 years ago

@gvanrossum I'm interested in hearing if you have anything to add or any other complications that others may not be considering. Why more against than earlier?

> Another problem with using sphinx markup in docstrings is that they are a bit hard to translate back to the PEP 484 style annotations (once you're leaving behind Python 2)

I notice one area that could pose a potential problem with autodoc, take something like https://github.com/tony/tmuxp/blob/475ed94/tmuxp/window.py#L75 or https://github.com/tony/tmuxp/blob/475ed94/tmuxp/window.py#L166.

That could get screwy to parse. Is there anywhere else where autodoc would be tricky? Or is that what you mean?

I want to weigh the costs and benefits. I'm more determined to get mypy in any way (even if it superficially feels redundant at first) than I am to try to jury-rig autodoc to work with mypy.

gvanrossum commented 8 years ago

Two things.

First, I think the language should lead, and the tools will follow. This doesn't mean the language should intentionally try to fight the tools. It just means that the language should point the tools in the direction it wants to go and not the other way around.

In this case, the direction in which the language is pointing is the direction of the Python 3 syntax from PEP 484 (i.e. inline annotations using PEP 3107). Eventually (probably sooner rather than later) the tools will support that syntax and combine it with the human-readable descriptions from docstrings if present. The Python 2 variant from PEP 484 is a temporary solution designed to be as close to inline annotations as possible given that Python 2 doesn't support PEP 3107. It is not hard to adapt, especially for a tool that wants to support the Python 3 syntax.

Second, the docstring convention is often followed only approximately. This is no big deal for the original purpose, generating documentation: if there are typos in a docstring the human reader can easily recover the intention. I suspect that in many cases no tooling is used to generate documentation from docstrings -- programmers simply read the source code and the markup is light enough that they can follow along. (The "napoleon" convention is particularly attractive when used this way.)

But if mypy were to be pointed at a body of code using type annotations in docstrings (if it could read them) it would overwhelm the user with a barrage of complaints due to discrepancies between the information in the docstrings and the actual code. This is not a good first experience for a user interested in starting with mypy -- even if the blame lies purely with incorrect information in the docstring, it will be hard for the user to figure out what to do, and a barrage of warnings that must be ignored trains the user in the wrong attitude. Compare this to the current situation -- the user is encouraged to add annotations piecemeal, to one class or module at a time, and mypy will only type-check those parts of the program to which annotations are explicitly added. This process can be controlled by the user and the overall experience will be much more favorable.
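A small illustration of that piecemeal workflow follows. The two functions are invented for this example. By default mypy leaves the body of an unannotated function unchecked, so only the second function below is subject to type checking.

```python
def legacy_label(count):
    # No annotation: by default mypy does not type-check this body,
    # so existing code like this produces no new complaints.
    return "items: " + str(count)


def checked_label(count):
    # type: (int) -> str
    # Annotated: mypy checks this body. Replacing `str(count)` with a bare
    # `count` here would be reported, because str + int is not allowed.
    return "items: " + str(count)
```

Adding one `# type:` comment at a time therefore lets the user decide how much of a large codebase mypy reports on.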

(Note that the codebases that stand most to benefit from type hints are very large ones. For these it is particularly important not to overwhelm the user with spurious warnings, since trying to address them all at once will not be possible.)

Perhaps a more useful thing to contribute would be a separate tool that extracts information from docstrings and converts it into PEP 484 conforming type annotations. Such a tool could be used to get the annotations started. A possible starting point might be the existing annotation generator (https://github.com/python/mypy/blob/master/misc/fix_annotate.py, to be run as a 2to3 fixer).
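As a sketch of what the core of such a converter might do (this is not fix_annotate.py, and `type_comment_from_sphinx` is a hypothetical helper), the function below reads Sphinx `:type name:` and `:rtype:` fields from a docstring and emits a PEP 484 type comment. It copies type text verbatim, so informal types such as "list of str" would still need manual cleanup, and parameters with no documented type fall back to `Any`.

```python
import re


def type_comment_from_sphinx(arg_names, docstring):
    """Build a `# type:` comment from Sphinx `:type:` / `:rtype:` fields.

    `arg_names` gives the parameter order; undocumented types become `Any`.
    """
    found = re.findall(r":type[ \t]+(\w+):[ \t]*(.+)", docstring)
    declared = {name: type_text.strip() for name, type_text in found}
    rtype = re.search(r":rtype:[ \t]*(.+)", docstring)
    arg_types = [declared.get(name, "Any") for name in arg_names]
    return_type = rtype.group(1).strip() if rtype else "Any"
    return "# type: (%s) -> %s" % (", ".join(arg_types), return_type)


EXAMPLE_DOC = """Move funds between accounts.

:param account: account identifier
:type account: str
:param funds: amount to move
:type funds: int
:rtype: None
"""

print(type_comment_from_sphinx(["account", "funds"], EXAMPLE_DOC))
```

Because the result is a draft written into the source rather than something mypy reads on every run, the user can correct inaccurate docstring types once and move on. A file that receives an `Any` fallback would also need `from typing import Any`.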

tony commented 8 years ago

Understood.

I agree with both your points and see it your way.

I think that concludes this issue. What do you think @JukkaL @raph-amiard @gvanrossum?

gvanrossum commented 8 years ago

Let's close it.

JukkaL commented 8 years ago

I agree that we should give up on parsing docstrings in mypy. There are many issues with them, the most obvious being what Guido pointed out above: they are usually too inconsistent in existing code to be very useful for static type checking. We need a well-defined syntax for type annotations that doesn't conflict with existing code and that is also easy to automatically migrate to Python 3 style annotations.

bendtherules commented 7 years ago

I understand this thread is old, but I just wanted to bounce off an idea and ask whether it has been tried.

Is there any existing solution to not explicitly provide the type information in docstring if you already have it in the comment string or annotation format?

For example, if you have

def func(abc: int) -> int:
    '''
    Arguments:
        abc {[type]} -- Some description
    '''

could the type value in the docstring be inferred from the mypy annotation?

The overall theory is that annotation already has some technical information about the signature, which is also good to have in the documentation. So, documentation would somehow infer the type from the annotation without duplication.
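A minimal sketch of that idea, for Python 3 inline annotations only: `typing.get_type_hints` from the standard library reads the annotations at runtime, so a documentation tool could print the types without repeating them in the docstring. `describe_signature` is a hypothetical helper, and comment-style `# type:` hints are not visible this way because they are not stored on the function object.

```python
import typing


def func(abc: int) -> int:
    """Double a number.

    Arguments:
        abc -- Some description
    """
    return abc * 2


def describe_signature(function):
    """List each parameter and the return value with its annotated type."""
    hints = typing.get_type_hints(function)
    lines = []
    for name, hint in hints.items():
        label = "returns" if name == "return" else name
        # Plain classes have __name__; fall back to str() for other hints.
        lines.append("%s: %s" % (label, getattr(hint, "__name__", str(hint))))
    return lines


print(describe_signature(func))
```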

gvanrossum commented 7 years ago

Soon you may be able to write your own plugin to extract type annotations from docstrings. See #3517 and #3225. Mypy itself is not going to support this, however; it will have to be a user-supported plugin.

chadrik commented 7 years ago

@bendtherules I'm working on adding the docstring hook that @gvanrossum mentioned for extracting annotations from docstrings. Your request is for something that goes the opposite direction: making docstrings from annotations. I think what you're after is this PR in sphinx: https://github.com/sphinx-doc/sphinx/pull/1975

bendtherules commented 7 years ago

@gvanrossum Thanks for the information on the ongoing work. It seems fine to me that mypy shouldn't support this directly, considering the scope of the project and the huge number of documentation formats out there.

@chadrik Good to know that Sphinx is going to support this. What I understand from this explanation is that those annotations are currently parsed by Sphinx and show up in the docs like this, and only for Python 3 type annotations.

[image: Sphinx doc]

What I was expecting instead was that (from the above image) both functions would show up in the same way regardless of how I write the type information, preferably in the lower, older format. And Python 2 support as well 😃

So, with the hooks you are working on, will Python 2 style annotation support be there? These hooks will definitely ease the load and give the documentation projects a more reliable solution.

gvanrossum commented 7 years ago

I really think these are questions for the Sphinx project, not for mypy.