astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.
https://docs.astral.sh/ruff
MIT License
32.06k stars 1.07k forks source link

Implement rules in `darglint` #458

Open edgarrmondragon opened 2 years ago

edgarrmondragon commented 2 years ago

This might be superseded by https://github.com/astral-sh/ruff/issues/12434


https://pypi.org/project/darglint/

Darglint's primary focus is to identify incorrect and missing documentationd of a function's signature. Checking style is a stretch goal, and is supported on a best-effort basis. Darglint does not check stylistic preferences expressed by tools in the Python Code Quality Authority (through tools such as pydocstyle). So when using Darglint, it may be a good idea to also use pydocstyle, if you want to enforce style. (For example, pydocstyle requires the short summary to be separated from other sections by a line break. Darglint makes no such check.)

Error Codes

charliermarsh commented 1 year ago

Cool, I've never used but looks useful. Would this be an impactful plugin for you? Would it unblock any Ruff adoption? :)

edgarrmondragon commented 1 year ago

Hey @charliermarsh, yeah. I maintain a few packages with public APIs that would benefit from contributors being informed about documentation for arguments or exceptions raised being missing.

A number of features would make Ruff a nice replacement for flake8: speed, compatibility with the rest of the Python ecosystem, pyproject.toml support.

lsorber commented 1 year ago

FWIW, darglint is the main blocker [1] for us to migrate Poetry Cookiecutter [2] from flake8 to ruff.

[1] https://github.com/radix-ai/poetry-cookiecutter/issues/125 [2] https://github.com/radix-ai/poetry-cookiecutter

charliermarsh commented 1 year ago

👍 Helpful to hear! darglint is probably the most popular unimplemented plugin. I'd like to get around to it soon. If there's a subset of rules that are most impactful, that'd be helpful to know too for prioritization.

edgarrmondragon commented 1 year ago

If there's a subset of rules that are most impactful, that'd be helpful to know too for prioritization.

For me personally, they're probably DAR101, DAR102, DAR201, DAR202 and to a lesser extent DAR301, DAR302, DAR401 and DAR402

strickvl commented 1 year ago

For me, I use darglint together with pydocstyle since they don't do quite the same thing. Having darglint absorbed into the ruff world would be really great.

erwann-met commented 1 year ago

We just discovered ruff in my team. It's amazing, and I'd like to thank you for all the great work!

We used to run flake8 along with darglint. Unfortunately since ruff does not yet implement darglint rules we still have to keep darglint alongside ruff which is a shame given how slow darglint is.

Reimplementing darglint within ruff is definitely the feature we are most looking forward to!

For us the most important rules are probably those related to the parameters (and returns) and their types: DAR101, DAR102, DAR103, DAR201, DAR202, DAR203 and DAR501. I hope this helps!

We understand that maintaining an open-source project can be a lot of work and we want to express our appreciation for all the effort that goes into it. We were wondering if there's any update on when this issue might be addressed? Any information would be greatly appreciated. Thank you!

charliermarsh commented 1 year ago

Very helpful, thank you! It's not something I've started working on actively (and would make for a good contribution if there are any eager contributors reading this :)), but that set of rules at least is probably not too much work to support, and this issue has a lot of thumbs-up votes so I'll try to get to it soon-ish. I'll post here when I'm able to take it on.

charliermarsh commented 1 year ago

Looking into starting on this. darglint has a lot of code because it defines entire grammars for the docstrings, and then lexes and parses according to those grammars. pydocstyle, on the other hand, does it all with regular expressions. Wondering if we could get away with the latter.

anthonyburdi commented 1 year ago

Super exciting to hear that this is in the works. We just switched to ruff instead of pydocstyle and are eager to replace darglint as well. Here are the error codes we are most interested (least overlap with pydocstyle):

rbebb commented 1 year ago

It would be great if it could add darglint’s feature that checks if docstrings match a particular style (whether it be Sphinx, Google, or something else)

baggiponte commented 1 year ago

Implementing darglint could handle some flake8 exceptions, as well as part of docformatter I guess?

rbebb commented 1 year ago

@charliermarsh Is there an update on implementing darglint features in ruff? Thank you for all of your work on this project!

finswimmer commented 1 year ago

Hey,

unfortunately darglint is unmaintained since December 2022. So I would be very happy to see something similar in ruff.

Thanks a lot for all your very good work!

fin swimmer

charliermarsh commented 1 year ago

I'm currently considering taking this on as a project over the next few weeks, but it's competing with some other priorities so not a firm commitment yet :) I definitely hear that this something users want, and it's something I want Ruff to support too -- just a question of when, will post here when I have a clear decision.

leandro-lucarella-frequenz commented 1 year ago

Great to hear work is close to being started. We also rely a lot on darglint and are concerned about is current abandoned state, and were also looking to start using ruff, so it would be very good news if it can be a replacement for darglint. The rules that most often catch error in the documentation for us are:

jsh9 commented 1 year ago

Hi @charliermarsh , after I raised this issue (https://github.com/charliermarsh/ruff/issues/4362), I looked into Darglint myself. It was extremely slow, and it can actually be unusable for large projects.

Therefore I spent a few days writing a new tool, pydoclint, from scratch: https://github.com/jsh9/pydoclint

I did some benchmarking using pydoclint vs Darglint, and here's the result on linting these 2 famous Python projects:

pydoclint darglint
numpy 2.0 sec 49 min 9 sec (1,475x slower)
scikit-learn 2.4 sec 3 hr 5 min 33 sec (4,639x slower)

So if you and/or your team is going to port the same functionality to ruff, I'd recommend that you either write things from the ground up in Rust, or check out how I implemented it in Python.

akaihola commented 1 year ago

Since darglint is archived, I've forked it as darglint2. The purpose of that project is to maintain a drop-in replacement for darglint with any minimal bugfixes needed to keep it usable for old users who can't spend the effort to migrate to a faster alternative.

Kludex commented 1 year ago

It would be cool to have DAR102: The docstring contains a parameter not in function. 👀

charliermarsh commented 1 year ago

We could probably support that without implementing all of Darglint. We already check that the docstring contains all function parameters -- that's basically the inverse?

Kludex commented 1 year ago

We could probably support that without implementing all of Darglint. We already check that the docstring contains all function parameters -- that's basically the inverse?

Yes. That would be great. 👍

erwann-met commented 1 year ago

Hi @charliermarsh , after I raised this issue (#4362), I looked into Darglint myself. It was extremely slow, and it can actually be unusable for large projects.

Therefore I spent a few days writing a new tool, pydoclint, from scratch: https://github.com/jsh9/pydoclint

I did some benchmarking using pydoclint vs Darglint, and here's the result on linting these 2 famous Python projects: pydoclint darglint numpy 2.0 sec 49 min 9 sec (1,475x slower) scikit-learn 2.4 sec 3 hr 5 min 33 sec (4,639x slower)

So if you and/or your team is going to port the same functionality to ruff, I'd recommend that you either write things from the ground up in Rust, or check out how I implemented it in Python.

Thank you so much @jsh9 for tackling this issue! We've tried your tool along with Ruff in my team and it's very promising so far. We are very impressed that you managed to put it together so fast. For us the replacement of darglint and flake8 is crucial as they currently represent about 60% of our CI costs.

Having said that, I would like to provide some feedback based on our experience using pydoclint. While it has demonstrated great potential, we have encountered a few issues that prevent us from fully adopting it. One particular area that requires attention is the clarity of the error messages. In certain cases, the messages provided are not as informative or explicit as we would like them to be, making it difficult for us to identify the root cause of the issues. Another issue is the lack of ability to ignore some rules either project-wise or directly in the code.

Sadly we prefer waiting a bit for either pydoclint to be mature enough or for Ruff to implement darglint rules.

jsh9 commented 1 year ago

Hi @erwann-met , please feel free to submit new issues over at pydoclint, and I'll take a look.

rbebb commented 1 year ago

Hi @charliermarsh! Just figured I check to see if there's an update on the darglint effort. Thank you again for all of your work!

joelberkeley commented 1 year ago

... We already check that the docstring contains all function parameters ...

This appears to be true only for Google style docstrings (see D417). I'd love for that to be available for sphinx rest style

rbebb commented 1 year ago

Hi @charliermarsh! I figured I'd circle back on this feature request to see if there are any updates on the effort!

zanieb commented 1 year ago

We will report back here if there is progress on this; we have not begun work on this — we have a lot of priorities as we grow!

I'm happy to review contributions of individual rules here.

ddelange commented 1 year ago

fwiw, our setup for consistent google style docstrings using darglint and flake8-docstrings:

setup.cfg

[flake8]
ignore = B902,D10,E203,E501,W503
max-line-length = 88
inline-quotes = double
docstring-convention = google
max-cognitive-complexity = 10

[darglint]
strictness = short

.pre-commit-config.yaml

repos:
-   repo: https://github.com/psf/black
    # when updating this version, also update blacken-docs hook below
    rev: 23.1.0
    hooks:
    -   id: black

-   repo: https://github.com/asottile/blacken-docs
    rev: 1.13.0
    hooks:
    -   id: blacken-docs
        additional_dependencies: ['black==23.1.0']

-   repo: https://github.com/timothycrosley/isort
    rev: 5.12.0
    hooks:
    -   id: isort

-   repo: https://github.com/PyCQA/flake8
    rev: 6.0.0
    hooks:
    -   id: flake8
        additional_dependencies: [
            'darglint~=1.8.1',
            'flake8-absolute-import~=1.0',
            'flake8-blind-except~=0.2.0',
            'flake8-builtins~=2.1',
            'flake8-cognitive-complexity==0.1.0',
            'flake8-comprehensions~=3.2',
            'flake8-docstrings~=1.5',
            'flake8-logging-format~=0.9',
            'flake8-mutable~=1.2.0',
            'flake8-print~=5.0',
            'flake8-printf-formatting~=1.1.0',
            'flake8-pytest-style~=1.7',
            'flake8-quotes~=3.2',
            'flake8-tuple~=0.4.1',
            'pep8-naming~=0.11'
        ]
ddelange commented 12 months ago

Switched from flake8 + black + darglint to ruff + pydoclint.

Note that for google style docstrings:

pyproject.toml

[tool.pydoclint]
style = "google"
arg-type-hints-in-docstring = false
check-return-types = false
check-yield-types = false

[tool.ruff]
select = ["ALL"]
ignore = ["D407", "E501", "ANN", "TRY003", "D203", "D213", "D100", "D104", "D106", "FIX002", "ERA001", "RUF012"] # ignores: D407 (we have google style docstrings), E501 (we have black), ANN (we have mypy), TRY003 (there is EM102), D203 (there is D211), D213 (there is D212), D100,D104 (we don't publish publick readthedocs), D106 (we have Django Meta class that doesn't need a docstring), FIX002 (we have TD002,TD003), ERA001 (we sometimes leave code for reference), RUF012 (django is full of class vars that are lists or dicts)
target-version = "py311" # not needed if your pyproject.toml has `project.requires-python`

[tool.ruff.extend-per-file-ignores]
"__init__.py" = ["E401", "E402"]
"**/tests/**/*.py" = ["S101", "PT009"] # ignores: S101 (assert is fine in tests), PT009 (we use django unittest framework)
"**/migrations/*.py" = ["D101"]

.pre-commit-config.yaml

repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
  rev: v0.1.0
  hooks:
  - id: ruff
    args: [--fix, --exit-non-zero-on-fix]
  - id: ruff-format

# should be replaced in the future ref https://github.com/astral-sh/ruff/issues/458
- repo: https://github.com/jsh9/pydoclint
  rev: 0.3.4
  hooks:
  - id: pydoclint

# should be replaced in the future ref https://github.com/astral-sh/ruff/issues/3792
- repo: https://github.com/asottile/blacken-docs
  rev: 1.13.0
  hooks:
  - id: blacken-docs
    additional_dependencies: ['black==23.10.0']
BigForLy commented 10 months ago

Hello @charliermarsh! I manage to convert almost all checks to ruff, except for Darling, do you have any idea whether this will be done?

strickvl commented 7 months ago

As mentioned in Discord, registering my interest in taking a stab at this.

Spenhouet commented 6 months ago

Using pydocstyle we are using the google style with typing. We strictly use typing in python, so repeating the types in the Doc string serves no purpose and is just redundant. Having rules like DAR103: The docstring parameter type doesn't match function. or DAR105: The docstring parameter type is malformed. certainly would help to keep Doc string and parameter declaration aligned, it's still redundant and we'd rather enforce that no types are specified in the Doc string. Seems darglint doesn't provide such a rule.

Edit: Just found this issue which raises the same wish: https://github.com/astral-sh/ruff/issues/9070

AloizioMacedo commented 6 months ago

Hoping to see this as well! Ruff already encompasses most of my CI linting checks, this would be a great addition.

Meanwhile, in case it is useful for other people, I decided to take a stab and implement a version that is sufficient for my needs: https://github.com/AloizioMacedo/pystaleds.

It intends primarily to check for mismatches of the parameters in the function signature and the arguments in the docstring. It is very experimental, but I can already use it in my projects while ruff doesn't support a similar feature : )

ddelange commented 6 months ago

we'd rather enforce that no types are specified in the Doc string

@Spenhouet fwiw pydoclint has an --arg-type-hints-in-signature flag ref https://jsh9.github.io/pydoclint/violation_codes.html

see also my config above

Spenhouet commented 6 months ago

@ddelange Thanks for sharing, that would introduce a new tool just for this small requirement. Would much more prefer if this would be covered by a linting rule within ruff.

aeturrell commented 5 months ago

Just wanted to add my voice to those saying that this would be a really useful feature. Thanks for all the amazing work on Ruff!

tmke8 commented 5 months ago

I wonder if at this point, it would be better to implement pydoclint in ruff instead of darglint? (This is prompted by the recent PR #11471.)

Pydoclint has

Goldziher commented 3 months ago

I wonder if at this point, it would be better to implement pydoclint in ruff instead of darglint? (This is prompted by the recent PR #11471.)

Pydoclint has

  • more error codes (e.g., DOC104 in pydoclint checks for the order of the parameters; darglint doesn't have that AFAICT),
  • more config options (like arg-type-hints-in-signature which lets you decide whether you want to include types in the docstring),
  • and the config options of pydoclint also seem more well-thought-out to me than darglint's. (E.g., pydoclint has skip-checking-short-docstrings which makes sense to me, but darglint's strictness=short vs strictness=long is a bit confusing)

Strong +1 for this. pydoclint is really great and pretty fast already, but ruff could improve on this and extend it further. Also some issues might be autofixable.

charliermarsh commented 3 months ago

Are there rules that are present in darglint but not pydoclint?

Apakottur commented 3 months ago

Are there rules that are present in darglint but not pydoclint?

I know of at least two:

  1. DAR401: The docstring is missing an exception raised.
  2. DAR402: The docstring describes an exception not explicitly raised.

These two rules are the only reason we use darglint, everything else that we want is covered by pydoclint.

charliermarsh commented 3 months ago

Interesting, ok. Those are the exact rules added in https://github.com/astral-sh/ruff/pull/11471.

jsh9 commented 3 months ago

Are there rules that are present in darglint but not pydoclint?

I know of at least two:

  1. DAR401: The docstring is missing an exception raised.
  2. DAR402: The docstring describes an exception not explicitly raised.

These two rules are the only reason we use darglint, everything else that we want is covered by pydoclint.

I'm the author of pydoclint.

I think these two pydoclint violation codes cover DAR401 and 402:

Code Explanation
DOC501 Function/method has “raise” statements, but the docstring does not have a “Raises” section
DOC502 Function/method has a “Raises” section in the docstring, but there are not “raise” statements in the body

If they don't cover what you want, please feel free to open an issue.

AndydeCleyre commented 3 months ago

@jsh9 I don't think those check the exception types. For example, if I have a function like this, which sometimes raises a ValueError, but incorrectly document that it raises a ZeroDivisionError, darglint will report the problem while pydoclint won't:

def _str_to_bool(informal_bool: str) -> bool:
    """
    Translate a commonly used boolean ``str`` into a real ``bool``.

    Args:
        informal_bool: A boolean represented as ``str``, like ``"true"``, ``"no"``, ``"off"``, etc.

    Returns:
        ``True`` or ``False`` to match the intent of ``informal_bool``.

    Raises:
        ZeroDivisionError: This doesn't look like enough like a ``bool`` to translate.
    """
    if informal_bool.lower() in ('true', 't', 'yes', 'y', 'on', '1'):
        return True
    if informal_bool.lower() in ('false', 'f', 'no', 'n', 'off', '0'):
        return False
    raise ValueError(f"{informal_bool} doesn't look like a boolean")  # pragma: no cover
$ darglint nt2
nt2/casters.py:_str_to_bool:29: DAR401: -r ValueError
nt2/casters.py:_str_to_bool:39: DAR402: +r ZeroDivisionError

$ pydoclint nt2
Loading config from user-specified .toml file: pyproject.toml
No config found in pyproject.toml.
Skipping files that match this pattern: \.git|\.tox
nt2/__init__.py
nt2/casters.py
nt2/converters.py
nt2/dumpers.py
nt2/ui.py
nt2/yamlpath_tools.py
🎉 No violations 🎉
jsh9 commented 3 months ago

Thanks @AndydeCleyre !

I've opened a new issue (https://github.com/jsh9/pydoclint/issues/158) on pydoclint to add this feature.