zheller / flake8-quotes

Flake8 extension for checking quotes in python
MIT License
176 stars 38 forks source link

Support variable docstrings #100

Open plinss opened 4 years ago

plinss commented 4 years ago

While not official, many documentation generators, e.g. Sphinx, pdoc, epydoc, PyCharm, etc., recognize docstrings after variables.

flake8-quotes however classifies these as multi-line strings, so when using single quotes for multi-line strings and double quotes for docstrings, I get a bunch of false positives.

Example:

class Foo:
    """Class docsting."""

    def __init__(self):
        """Constructor docstring."""
        self.var = '''
            multiline string
        '''
        """var docstring."""
twolfson commented 4 years ago

Can you link us to the documentation for these? I tried to look up Sphinx's to understand its use case more but couldn't find anything =/

Currently don't understand purpose of a variable docstring, that seems like just a code comment as documentation is meant to be an external reference which is parameters and return values, variables are internal ._.

Also potential edge case for us to worry about if we do support this:

foo = \
"""hi"""
plinss commented 4 years ago

Here's pdoc's documentation: https://pdoc3.github.io/pdoc/doc/pdoc/#docstrings-for-variables

epydoc's documentation: http://epydoc.sourceforge.net/manual-docstring.html#variable-docstrings

Sphinx docs don't talk about them much but they are in the examples: https://www.sphinx-doc.org/en/master/usage/extensions/example_numpy.html

The purpose is to add a description of the variable in the generated documentation (where it lists the public properties of a class, for example). It's mostly used for public module variables and object properties. I use them frequently and they are featured in pdoc's output, so they're more than just comments and are treated like real docstrings by a lot of tools. FWIW there was a PEP to add them officially but that was rejected.

At the moment, I'm using ''' for multiline strings in my code and """ for all docstrings to make them distinct. This issue is requiring me to sprinkle many # noqa Q001s all over my code.

twolfson commented 4 years ago

Cool, thanks for sharing those links and info =)

This makes a lot more sense in the context of modules and classes :+1:

It seems like another case to handle in our parser where we can anticipate a docstring after a variable is set

Going to cc @oxitnik who has handled docstring parsing to date

plinss commented 4 years ago

@twolfson I've been looking at the code to see about submitting a PR to fix this. I'm currently thinking this would be a lot easier to implement if the extension took logical lines from flake8 rather than parsing the file directly.

If I can make that work, and still pass the existing test suite, would you accept such a PR? It should also remove a bunch of code and not change the main string logic much. I expect you'll lose compatibility with pre 3.0 flake8, not sure if that's important...

(and for that matter, do you still care about it running under Python 2?)

twolfson commented 4 years ago

As of PR #94, we should only be parsing files as a backup:

https://github.com/zheller/flake8-quotes/pull/94

I'm revisiting the flake8 docs to better understand if this implies logical lines or not. It's not 100% clear to me. However, it does look like there's some abstractions which weren't standardized when we first built everything so feel free to use those

https://flake8.pycqa.org/en/3.8.2/internal/checker.html#flake8.processor.FileProcessor

I guess I'd prefer to either keep code length the same or reduce it, and ideally not rewrite the whole thing unless it's absolutely necessary

If we have to drop pre-flake8@3.0 support, then we can always make it a major release. It seems like flake8@3.0 was released in 2016 so that's plenty of time people have had to upgrade

https://flake8.pycqa.org/en/latest/release-notes/

Python 2 has been sunset so nope, we don't care about supporting it anymore

plinss commented 4 years ago

Ah, missed that, thanks. I'll check if those are physical or logical, I think they're physical.

In the approach I was thinking about, flake8 calls the extension once per logical line, I'm new to flake8 extensions so I'm not sure if you can get it to pass the whole file at once as a list of logical lines. But I'll dig around a bit. (In this approach, logical_line is the first argument.)

I don't think it'll be a complete rewrite, I think the main error detection logic will remain the same, the docstring detection logic should be able to be dropped entirely, as all docstrings will be a logical line starting with a string (I think), including the variable ones, so we get that for free. We can also drop the parsing and tokenization, as well as the Python2 token support. So essentially flake8 does all the heavy lifting and provides a list of tokens per logical line. We just look for the string tokens and ignore the rest. We should also be able to drop the noqa support as flake8 does that filtering for us as well (and it passes in a noqa flag if you need it).

I'll take a look at #82 while I'm there too, the logical lines will group continuation strings making that check easy, just consecutive string tokens. It'll probably be tomorrow tho, getting late now.

twolfson commented 4 years ago

Most work has been done in #101 but has been halted for now due to lack of time. If anyone decides to pick this up again, that's probably a good place to start

skirpichev commented 1 year ago

Perhaps, this plugin should use AST instead. Then get_docstring_tokens() helper will be trivial.