Feature Request: Support for Numpy, Google, and rst Docstrings

adam-grant-hendry commented 2 years ago

Would it be possible to add support for docstring styles? e.g. Numpy, Google, and rst?

DanielNoord commented 2 years ago

Yes! I'd very much like to support this but since I don't use these myself I'm not sure what is actually breaking right now.

If you could provide me with some sample docstrings and what you like to see I can certainly work on this!

adam-grant-hendry commented 2 years ago

@DanielNoord Thank you for your support! Thank you so much for this wonderful package!

Don't get me wrong: nothing is breaking. Users must choose which format they wish to support.

The VSCode autoDocstring extension is written in TypeScript, which I don't know very well so I'm not able to parse its syntax. The closest alternative I've seen of porting this to python is python-autodocstring, but it is still very much in its infancy.

I believe these could be implemented using jinja2 templating, but there may be other ways. Not sure if you have to parse the AST or use a library like libCST. I've not experimented with those before, but I know the python-qt-tools repos have used libCST to automate adding type hinting for PyQt/PySide modules.

Currently, there are ~~three~~ two limitations of autoDocstring that I would LOVE to have in pydocstringformatter:

Cannot reanalyze docstrings on save. I submitted a feature request here over a year ago with still no response.
The extension cannot parse existing docstrings and determine what should be added or removed (it can only read the function/class signatures and add boilerplate strings). For example, pydocstyle will error if there is a signature mismatch between your function parameters and what's listed in your docstring. If pydocstringformatter could auto-detect the mismatch and then add this missing argument with a boilerplate string param (type): _enter description here_ and remove non-existing parameters, this would be HUGE!

~~To my knowledge, a VSCode extension cannot (easily?) be used with pre-commit, and I would LOVE to use pydocstringformatter with pre-commit~~

This last one pydocstringformatter already solves.

DanielNoord commented 2 years ago

The closest alternative I've seen of porting this to python is python-autodocstring, but it is still very much in its infancy.

I don't really get what you mean with "porting". This package is already written in Python?

I believe these could be implemented using jinja2 templating, but there may be other ways.

We currently use the tokens to analyse the docstrings.

Currently, there are three limitations of autoDocstring that I would LOVE to have in pydocstringformatter: [...]

To my knowledge, a VSCode extension cannot (easily?) be used with pre-commit, and I would LOVE to use pydocstringformatter with pre-commit

Based on this last comment I wonder how you are using this package. pydocstringformatter is specifically intended to be used with pre-commit and there is documentation about doing so in the readme: https://github.com/DanielNoord/pydocstringformatter#pre-commit

adam-grant-hendry commented 2 years ago

I don't really get what you mean with "porting". This package is already written in Python?

I meant porting the VSCode autoDocstring extension into python, which python-autodocstring has started. I just added this here in case it was helpful to you.

We currently use the tokens to analyse the docstrings.

That's great!

Based on this last comment I wonder how you are using this package. pydocstringformatter is specifically intended to be used with pre-commit and there is documentation about doing so in the readme: https://github.com/DanielNoord/pydocstringformatter#pre-commit

You're right, that comment was a mistake. I don't know why I put that in there. I use pydocstringformatter with pre-commit.

adam-grant-hendry commented 2 years ago

@DanielNoord In short, it would be great if pydocstringformatter could support the following formats:

and with those styles, parse signatures and determine what should be added or removed, but keep existing documentation intact (unless it is referencing a parameter that does not exist, most likely because it has been renamed or removed).

Pierre-Sassoulas commented 2 years ago

and with those styles, parse signatures and determine what should be added or removed,

For this you have the docparam checker in pylint

DanielNoord commented 2 years ago

Hmm, yeah I don't think we should be the tool to enforce docstrings on every signature. That comes with a lot of additional complexity and decision-making which I'd like to defer to tools such as pylint and pydocstyle. I see this tool mostly as a "black for docstrings": styling your docstrings according to the style recommended by the relevant PEPs with little discussion about semantics and small differences.

That said if there is anything that makes the current tool incompatible with these styles we can certainly look into that!

adam-grant-hendry commented 2 years ago

and with those styles, parse signatures and determine what should be added or removed,

For this you have the docparam checker in pylint

That will check the parameters, but not format them. My request is have an auto formatter.

adam-grant-hendry commented 2 years ago

Hmm, yeah I don't think we should be the tool to enforce docstrings on every signature. That comes with a lot of additional complexity and decision-making which I'd like to defer to tools such as pylint and pydocstyle. I see this tool mostly as a "black for docstrings": styling your docstrings according to the style recommended by the relevant PEPs with little discussion about semantics and small differences.

That said if there is anything that makes the current tool incompatible with these styles we can certainly look into that!

Those tools don't auto format thought. They can only check. My request is to have pydocstringformatter format to those styles.

DanielNoord commented 2 years ago

Those tools don't auto format thought. They can only check. My request is to have pydocstringformatter format to those styles.

We could certainly format based on a --style flag. However, I don't really want to have to start parsing the ast to get the parameter names etc. I'd rather the user provide a malformed Google style do string and we make sure that the format is correct.

adam-grant-hendry commented 2 years ago

Those tools don't auto format thought. They can only check. My request is to have pydocstringformatter format to those styles.

We could certainly format based on a --style flag. However, I don't really want to have to start parsing the ast to get the parameter names etc. I'd rather the user provide a malformed Google style do string and we make sure that the format is correct.

That would be great! I think that would be plenty.

DanielNoord commented 2 years ago

With #132 merged and 0.7.0 released we know have experimental support for numpydoc. Please let us know if you find any bugs or can think of any improvements. With the experience gained from that PR adding support for other styles shouldn't be too difficult 😄

adam-grant-hendry commented 2 years ago

With #132 merged and 0.7.0 released we know have experimental support for numpydoc. Please let us know if you find any bugs or can think of any improvements. With the experience gained from that PR adding support for other styles shouldn't be too difficult 😄

@DanielNoord Thank you so much for this!

Couple questions:

How do you specify to use both numpydoc and pep257? Is it:

--style numpydoc pep257

or

--style=numpydoc --style=pep257

or something else (comma-separated list)?

(NOTE: Both the above run without error, but I'm not sure if the later syntax overwrites style to only use pep257)

pydocstringformatter seems to modify files even when -w isn't specified. I would like to test first by printing to stdout before letting pydocstringformatter modify my files. Perhaps I have incorrect settings?

Here is my .pre-commit-config.yaml section:

  - repo: https://github.com/DanielNoord/pydocstringformatter
    rev: v0.7.0
    hooks:
      - id: pydocstringformatter
        name: (pydocstringformatter) Format docstring to given style (numpy)
        language: python
        files: .*\.py$
        args: [
          "--exclude=.venv/**,stubs/**,build/**,dist/**,ci/**",
          "--style numpydoc pep257",
        ]

I then run

pre-commit run pydocstringformatter

from the command line.

Can you add a --config option to specify a configuration file to use? I know pydocstringformatter only reads pyproject.toml files, but pre-commit runs hooks in a separate environment (except those specified with repo: local, which are run on the local machine/venv/CI). Hence, I have to specify my configurations twice: once in my pyproject.toml and again in my .pre-commit-config.yaml. It becomes easy to forget to align arguments between the two files: users (myself included) change their pyproject.toml (expecting it to affect pre-commit) and get surprised when pre-commit behaves differently from their development environment setup.
Have you tested whether you need to add require_serial: true to your pre-commit-hooks.yaml?

By default, pre-commit runs all hooks sequentially in the order they are listed in the .pre-commit-config.yaml. However, since version v1.13.0, pre-commit runs each individual hook operation on all the files passed to it in parallel. The is beneficial for the majority of cases since it means each hook will execute faster, but there are some scenarios where this may present a problem:

if a hook reads from other files not passed to it (e.g. a configuration file), or
if a hook writes to other files not passed to it (e.g. a log file)

In the above cases, this may lead to a fork bomb, specifically if pydocstringformatter uses multiprocessing. This often results in some form of an insufficient memory error, causing pre-commit to crash. See pre-commit feature request that inspired this.

DanielNoord commented 2 years ago

--style=numpydoc --style=pep257

This one. We should probably add that to the documentation somewhere.

pydocstringformatter seems to modify files even when -w isn't specified. I would like to test first by printing to stdout before letting pydocstringformatter modify my files. Perhaps I have incorrect settings?

pydocstringformatter is configured to add -w when invoked through pre-commit. See: https://github.com/DanielNoord/pydocstringformatter/blob/1366acb29f8907957ee8ccf3130a135660e94f3e/.pre-commit-hooks.yaml#L3

Can you add a --config option to specify a configuration file to use? I know pydocstringformatter only reads pyproject.toml files, but pre-commit runs hooks in a separate environment (except those specified with repo: local, which are run on the local machine/venv/CI). Hence, I have to specify my configurations twice: once in my pyproject.toml and again in my .pre-commit-config.yaml. It becomes easy to forget to align arguments between the two files: users (myself included) change their pyproject.toml (expecting it to affect pre-commit) and get surprised when pre-commit behaves differently from their development environment setup.

Hm, doesn't pydocstringformatter pick up the pyproject config during a pre-commit run? I think it should and other tools do so as well. Please confirm this and if this is indeed a bug could you open a separate issue?

Have you tested whether you need to add require_serial: true to your pre-commit-hooks.yaml?

In my own projects I haven't seen a need to do this. The tool is really quite fast, comparable or quicker than black. So for now there has been no need to start using multiprocessing.

DanielNoord / pydocstringformatter

Feature Request: Support for Numpy, Google, and rst Docstrings #125