profgav opened this issue 4 years ago
hey @profgav (cc @gavinhuttley) - I definitely like this idea. In investigating it, it seems like this will take a bit of work. `interrogate` just uses the `ast` module, which doesn't include comments, so I'll have to tokenize source files (and then have the joy of aligning tokenized files with ast-parsed files 😆). I'll try to slate this for the 1.4.0 release.
Thanks for the idea!
If no one is working on it I can try my hands around it. @econchick 😃
@Pradhvan 💯! this issue is yours if you want it! Very much appreciated - thank you!
Great! I will try my best 🙌🏾
I tried going through the code base but felt a bit overwhelmed. @econchick if it's not too much trouble, can you give some pointers on where to start and what your approach would have been? Even a short tl;dr-type text would help a lot.
Hey @Pradhvan, sure!
So `interrogate` uses the `ast` library in the stdlib (in `src/interrogate/visit.py`). It uses `ast` to traverse a code base (via `visit_<Noun>Def` methods), and for each object we care about (classes, methods/functions, modules, etc.), we call `get_docstring` to see whether it has a docstring.
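To sketch what I mean, here's a minimal `ast.NodeVisitor` that records which definitions have docstrings (a simplified illustration of the pattern, not interrogate's actual visitor):

```python
import ast

class DocstringVisitor(ast.NodeVisitor):
    """Toy visitor: record (name, has_docstring) for each definition."""

    def __init__(self):
        self.results = []

    def _check(self, node):
        # ast.get_docstring returns None when there's no docstring
        self.results.append((node.name, ast.get_docstring(node) is not None))
        self.generic_visit(node)  # keep walking nested definitions

    visit_FunctionDef = _check
    visit_AsyncFunctionDef = _check
    visit_ClassDef = _check

source = '''
def documented():
    """I have a docstring."""

def undocumented():
    return "foo"
'''

visitor = DocstringVisitor()
visitor.visit(ast.parse(source))
print(visitor.results)  # → [('documented', True), ('undocumented', False)]
```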
The idea for this PR is to be able to comment on an object for interrogate to ignore, something like:

```python
def my_undocumented_function():  # nodocqa
    return "foo"
```
However, the `ast` library doesn't include/preserve comments, so there's no way to detect that `# nodocqa` comment for an object (see stack overflow post). So in order to do this, we'd have to either (1) add another library to find those comments, used in conjunction with how we traverse a code base with `ast`, or (2) replace our use of `ast` with another library (not sure what exists out there) that does all that we want. Both sound like a bit of an undertaking 😬
The aforementioned stack overflow post has answers suggesting the use of the `tokenize` library in the stdlib along with `ast`, so I might start by figuring out when to use `ast` and when to use `tokenize`; for instance, when the ast visitor visits a class, would it be easiest to tokenize it and find some way to see if there's a `nodocqa` comment? Or maybe we have to tokenize an entire file/module (looks like it, as we do with `ast` parsing), so how do we zip the results together?
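One way the "zipping" could work, sketched under the assumption that a definition is ignored when a `# nodocqa` comment sits on its own `def`/`class` line (real code would also need to handle decorators, multi-line signatures, etc.):

```python
import ast
import io
import tokenize

source = '''
def documented():
    """Docstring."""

def skipped():  # nodocqa
    return "foo"
'''

# Pass 1: tokenize to collect line numbers carrying a nodocqa comment,
# since ast discards comments entirely.
nodocqa_lines = {
    tok.start[0]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.COMMENT and "nodocqa" in tok.string
}

# Pass 2: walk the ast and "zip" by line number.
for node in ast.walk(ast.parse(source)):
    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
        ignored = node.lineno in nodocqa_lines
        print(node.name, "ignored" if ignored else "checked")
```

This does mean two full passes over each file, which ties into the performance concern below.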
We could also take a look at how black handles comments, since it supports `# fmt: off` / `# fmt: on` / `# fmt: skip`. Or how coverage supports `# pragma: no cover` and flake8 supports `# noqa` - do they use `tokenize`? Something else?
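As I understand it, coverage matches configurable regexes against the text of each source line rather than tokenizing; a comparable sketch for a hypothetical `nodocqa` marker (loosely in that spirit, not coverage's actual code):

```python
import re

# Regex-on-source-lines approach, in the spirit of how coverage.py
# matches "# pragma: no cover" against each physical line.
NODOCQA_RE = re.compile(r"#\s*nodocqa\b")

def lines_to_ignore(source: str) -> set:
    """Return 1-based line numbers whose text carries a nodocqa marker."""
    return {
        lineno
        for lineno, line in enumerate(source.splitlines(), start=1)
        if NODOCQA_RE.search(line)
    }

source = 'def f():  # nodocqa\n    return "foo"\n'
print(lines_to_ignore(source))  # → {1}
```

The trade-off versus `tokenize`: a plain regex will also match the marker inside string literals, which a token-type check avoids.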
In general, I'm not exactly sure how to approach this without digging further. It seems like changes would need to be in `src/interrogate/visit.py` somewhere, with an additional library (stdlib or otherwise) to help pull out comments.
Once something is implemented, I would then time it against the code base itself, maybe the test fixtures, and other code bases I have access to (i.e. whatever's local). I would be very cautious here because we wouldn't want to add too much overhead and take a performance hit, since there's a possibility of a lot of duplicate processing. Something like +15% longer may be fine, but not 100% longer for `interrogate` to run against a code base.
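A crude way to get a feel for that overhead would be a micro-benchmark comparing an ast-only pass against ast plus tokenize on the same source (a rough sketch with made-up input, not a substitute for timing real code bases):

```python
import ast
import io
import time
import tokenize

# Synthetic module: 200 tiny functions.
source = "\n".join(f"def f{i}():\n    return {i}" for i in range(200))

def ast_only():
    ast.parse(source)

def ast_plus_tokenize():
    ast.parse(source)
    # Consume the token stream, as a comment-scanning pass would.
    list(tokenize.generate_tokens(io.StringIO(source).readline))

for fn in (ast_only, ast_plus_tokenize):
    start = time.perf_counter()
    for _ in range(50):
        fn()
    print(fn.__name__, f"{time.perf_counter() - start:.4f}s")
```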
Hope this helps!
@econchick thank you for the detailed explanation. This definitely helped make more sense of what I need to do. 🙌🏾
FYI, in case this is still open for input, pdoc3 uses the special variable `__pdoc__` instead of comments.
Awesome, extremely useful project, thank you!! (Just noticed I'm signed in using a GH account I created strictly for teaching purposes .. doh! My normal handle is @gavinhuttley).
Describe the feature you'd like

I have cases where we use decorators to extend docstrings from other methods / classes, etc. Rather than try to catch such a special case, perhaps adding support for a comment annotation (like `# noqa`, as used by linters) that indicates a method / function to be ignored?

Is your feature request related to a problem?

Only in that methods documented in this way are not recognised as such.

Your Environment

`interrogate` version(s) (`interrogate --version`): 1.2.0

Additional context

None.