PyCQA / pydocstyle

docstring style checker
http://pydocstyle.org
MIT License
1.11k stars 188 forks source link

Google Docstrings are (Mostly) Broken #514

Open jroose-jv opened 3 years ago

jroose-jv commented 3 years ago

The existing check for determining whether to use numpy or google docstrings looks like this:

        found_numpy = yield from self._check_numpy_sections(lines, definition, docstring)
        if not found_numpy:
            yield from self._check_google_sections(lines, definition, docstring)
        else:
            print("Found numpy docstrings")
...
    def _check_numpy_sections(self, lines, definition, docstring):
        found_any_numpy_section = False
        for ctx in self._get_section_contexts(lines,
                                              self.NUMPY_SECTION_NAMES):
            found_any_numpy_section = True
            yield from self._check_numpy_section(docstring, definition, ctx)

        return found_any_numpy_section

Which appears to pass if any numpy-style section names are found. However, there's overlap between numpy section names and google section names: (e.g. Returns, Yields, Attributes, Methods, References, etc.). So if a Google docstring uses any section names that are shared with numpy, then pydocstyle will default to checking the function only for one or the other.

I suggest you enable a configuration option and/or command line option that selects between numpy and google style docstrings (possibly needing to be separate from convention to allow incompatible config items like ignore) instead of relying upon inaccurate heuristics. Such a configuration item would cause pydocstyle to override the default numpy docstring checking behavior in favor of Google docstrings.

sigmavirus24 commented 3 years ago

Have you read the docs?

    --convention=<name>
                        choose the basic list of checked errors by specifying
                        an existing convention. Possible conventions: pep257,
                        numpy, google.
jroose commented 3 years ago

Sure have. This snippet from the docs talks a little bit about the issue.

It makes no sense to check the same docstring for both numpy and google conventions. Therefore, if we successfully detect that a docstring is in the numpy style, we don’t check it for google.

The reason numpy style takes precedence over google is that the heuristics of detecting it are better, and we don’t want to enforce users to provide external hints to pydocstyle in order to let it know which style docstrings are written in.

jroose commented 3 years ago

I've also read and tested the code enough to recognize that the docs mean this quite literally:

choose the basic list of checked errors by specifying an existing convention.

Please correct me if I've missed something, but as far as I can tell, all that option does is disable some specific PEP-257 style checks.

jroose commented 3 years ago

You can use this code to see what I mean: https://github.com/jroose/pydocstyle/tree/pydocstyle-514 It adds in several print statements to check whether _check_google_sections() or _check_numpy_sections() is being run.

The file test_google_missing_arg.py is missing documentation for one of the parameters of its first function (function_with_types_in_docstring). That ought to trigger a D417, but it doesn't. However, if you change "Returns" to "Return" in the same docstring, it does trigger a D417 because it now detects the docstring as a "google-style" docstring and parses the arguments appropriately. This happens because the pydocstyle argument names don't include "Return" as a valid argument name for numpy-style docstrings.

(venv) jroose@c4a-1:~/src/pydocstyle/src$ python3 -m pydocstyle --convention google ../test_google_missing_arg.py
Checking docstring for definition: at module level
Running: _check_numpy_sections()
found numpy

Checking docstring for definition: in public function `function_with_types_in_docstring`
Running: _check_numpy_sections()
found numpy

Checking docstring for definition: in public function `function_with_pep484_type_annotations`
Running: _check_numpy_sections()
found numpy

Checking docstring for definition: in public function `module_level_function`
Running: _check_numpy_sections()
found numpy

Checking docstring for definition: in public function `example_generator`
Running: _check_numpy_sections()
found numpy

Checking docstring for definition: in public class `ExampleError`
Running: _check_numpy_sections()
found numpy

Checking docstring for definition: in public class `ExampleClass`
Running: _check_numpy_sections()
found numpy

Checking docstring for definition: in public method `__init__`
Running: _check_numpy_sections()
Running: _check_google_sections()

Checking docstring for definition: in public method `readwrite_property`
Running: _check_numpy_sections()
Running: _check_google_sections()

Checking docstring for definition: in public method `example_method`
Running: _check_numpy_sections()
found numpy

Checking docstring for definition: in public method `__special__`
Running: _check_numpy_sections()
Running: _check_google_sections()

Checking docstring for definition: in private method `_private`
Running: _check_numpy_sections()
Running: _check_google_sections()

../test_google_missing_arg.py:79 in public function `module_level_function`:
        D301: Use r""" if any backslashes in a docstring
../test_google_missing_arg.py:175 in public method `__init__`:
        D107: Missing docstring in __init__
../test_google_missing_arg.py:233 in public method `readwrite_property`:
        D205: 1 blank line required between summary line and description (found 0)
../test_google_missing_arg.py:233 in public method `readwrite_property`:
        D415: First line should end with a period, question mark, or exclamation point (not 'r')
../test_google_missing_arg.py:277 in public method `__special_without_docstring__`:
        D105: Missing docstring in magic method
samj1912 commented 3 years ago

I am tracking the larger work required for this to work with https://github.com/PyCQA/pydocstyle/issues/482#issuecomment-679403107 and https://github.com/PyCQA/pydocstyle/milestone/6

HacKanCuBa commented 2 years ago

As stated in #527, this can be circumvented by adding a Raises or Returns section, if at all possible. I'm now facing this issue again because I was using a Returns section for a function that returns None, and now I started using darglint that complains about that with DAR202 Excess "Returns" in Docstring xD

Why do we keep fighting against linters? :thinking: