econchick / interrogate

Explain yourself! Interrogate a codebase for docstring coverage.
https://interrogate.readthedocs.io
MIT License
574 stars 44 forks source link

Support skip comment markup like # noqa #41

Open profgav opened 4 years ago

profgav commented 4 years ago

Awesome, extremely useful project, thank you!! (Just noticed I'm signed in using a GH account I created strictly for teaching purposes .. doh! My normal handle is @gavinhuttley).

Describe the feature you'd like

I have cases where we use decorators to extend docstrings from other methods / classes, etc.. Rather than try to catch such a special case, perhaps adding support for a comment annotation (like # noqa, as used by linters) that indicates a method / function to be ignored?

Is your feature request related to a problem?

Only in that methods documented in this way are not recognised as such.

Your Environment

Additional context

None.

econchick commented 4 years ago

hey @profgav (cc @gavinhuttley) - I definitely like this idea. In investigating this idea, it seems like this is will take a bit of work. interrogate just uses the ast which doesn't include comments, so I'll have to tokenize source files (and then have the joy of aligning tokenized files with ast parsed files 😆 ). I'll try to slate this for 1.4.0 release.

Thanks for the idea!

Pradhvan commented 3 years ago

If no one is working on it I can try my hands around it. @econchick 😃

econchick commented 3 years ago

@Pradhvan 💯! this issue is yours if you want it! Very much appreciated - thank you!

Pradhvan commented 3 years ago

Great! I Will try my best 🙌🏾

Pradhvan commented 3 years ago

I tried going through the code base but felt a bit overwhelmed. @econchick if it's not too much of a trouble, can you give out some pointers on where to start and what would be have been your approach. Even as small as a tl'dr type short text would help a lot.

econchick commented 3 years ago

Hey @Pradhvan, sure!

So interrogate uses the ast library in stdlib (in src/interrogate/visit.py). It uses ast to traverse a code base, (via visit_<Noun>Def methods), and with each object we care about (classes, methods/functions, modules, etc), we call get_docstring to see if the object we care about has a docstring.

The idea for this PR is to be able to comment on an object for interrogate to ignore, something like:

def my_undocumented_function():  # nodocqa
    return "foo"

However, the ast library doesn't include/preserve comments, so there's no way to detect that # nodocqa comment for an object (see stack overflow post). So in order to do this, we have to either (1) add another library to find those comments to use in conjunction with how we traverse a code base using ast, or (2) replace our use ast with another library (not sure what exists out there) that does all that we want. Both sound like a bit of an undertaking 😬

The aforementioned stack overflow post has answers suggesting the use of the tokenize library in the stdlib along with ast, so I might start to figure out when to use ast and when to use tokenize; for instance, when the ast visitor visits a class, would it be easiest to tokenize it and find someway to see if there's a nodocqa comment? Or maybe we have to tokenize an entire file/module (looks like it, as we do with ast parsing), so how do we zip the results together?

We could also take a look at how black handles comments, since it supports # fmt: off / # fmt: skip/# fmt: off. Or how coverage supports # pragma: no cover and flake8 with # noqa - do they use tokenize? something else?

In general, I'm not exactly sure how to approach this without digging further. It seems like changes would need to be in src/interrogate/visit.py somewhere, with an additional library (stdlib or otherwise) to help pull out comments.

Once something is implemented, I would then time it against the code base itself, maybe the test fixtures, and other code bases I have access to (i.e. whatever's local). I would be very cautious because we wouldn't want to add too much more overhead and take a performance hit since there's a possibility that there can be a lot of duplicate processing. Something like +15% longer may be fine, but not like 100% longer for interrogate to run against a code base.

Hope this helps!

Pradhvan commented 3 years ago

@econchick thank you for the detailed explanation. This definitely helped make more sense of what I need to do. 🙌🏾

dAnjou commented 2 years ago

FYI, in case this is still open for input, pdoc3 uses the special variable __pdoc__ instead of comments.