NiklasRosenstein / pydoc-markdown

Create Python API documentation in Markdown format.
http://niklasrosenstein.github.io/pydoc-markdown/

improvement: Implement support for NumPy-style docstrings #279

Open celsiusnarhwal opened 1 year ago

celsiusnarhwal commented 1 year ago

This PR implements support for NumPy-style docstrings via the new NumpyProcessor class. It does so with the help of the numpydoc package, on which this PR makes Pydoc-Markdown dependent.

In addition to the above, this PR:

This PR resolves #251.

Caveats and Limitations

Examples

Here are examples of how the various sections of a NumPy-Style docstring are rendered by NumpyProcessor.

Summary / Extended Summary

The Summary and Extended Summary are rendered together as a single summary.

### Input

```
Decode a string by shifting each character by a given offset.

Extended Summary
----------------
There's not much else to say about this function, but if there was, it would go here.
Fun fact: you don't need to include the Extended Summary heading; if your summary spans
multiple lines, everything after the first will be implicitly considered to be the Extended
Summary. You can't have both an implicit *and* explicit Extended Summary, though; that
causes an exception!
```

### Output

Decode a string by shifting each character by a given offset.

There's not much else to say about this function, but if there was, it would go here. Fun fact: you don't need to include the Extended Summary heading; if your summary spans multiple lines, everything after the first will be implicitly considered to be the Extended Summary. You can't have both an implicit *and* explicit Extended Summary, though; that causes an exception!

Parameters / Other Parameters / Attributes / Receives

The Parameters, Other Parameters, Attributes, and Receives sections are all rendered similarly.

### Input

```
Parameters
----------
string : str
    The string to decode.

Other Parameters
----------------
offset : int
    The offset by which to shift each character in the string. Defaults to 13.

Attributes
----------
attr : Any
    Functions don't have attributes, but if we were documenting a class, we'd put its
    attributes here. Unfortunately, we are not. Too bad!

Receives
--------
param : Any
    If this was a generator, we'd document the parameters passed to its `send()` method
    here. Unfortunately, it is not. Too bad!
```

### Output

**Arguments**

* **string** (`str`): The string to decode.
* **offset** (`int`): The offset by which to shift each character in the string. Defaults to 13.

**Attributes**

* **attr** (`Any`): Functions don't have attributes, but if we were documenting a class, we'd put its attributes here. Unfortunately, we are not. Too bad!

**Receives**

* **param** (`Any`): If this was a generator, we'd document the parameters passed to its `send()` method here. Unfortunately, it is not. Too bad!

Returns / Yields

The Returns and Yields sections are rendered similarly.

### Input

```
Returns
-------
str
    The decoded string.

Yields
------
char : str
    The decoded string, one character at a time. By the way, you can optionally annotate
    your return and yield values with names like I did here. The type annotation isn't
    optional, though.
```

### Output

**Returns**

* `str`: The decoded string.

**Yields**

* **char** (`str`): The decoded string, one character at a time. By the way, you can optionally annotate your return and yield values with names like I did here. The type annotation isn't optional, though.

Raises / Warns

The Raises and Warns sections are rendered similarly.

### Input

```
Raises
------
ValueError
    If the string contains non-alphabetic characters.

Warns
-----
UserWarning
    If I don't like you.
```

### Output

**Raises**

* `ValueError`: If the string contains non-alphabetic characters.

**Warns**

* `UserWarning`: If I don't like you.

See Also

### Input

```
See Also
--------
:func:`encode`
    Encode a string by shifting each character by a given offset.
```

### Output

**See Also**

* :func:\`encode\`: Encode a string by shifting each character by a given offset.

*(The processor leaves the task of cross-referencing functions, classes, and methods in this section to Pydoc-Markdown's existing faculties.)*

Notes

### Input

```
Notes
-----
This function implements an inverse substitution cipher[1]_.
```

### Output

**Notes**

This function implements an inverse substitution cipher1.

References

### Input

```
References
----------
.. [1] https://en.wikipedia.org/wiki/Substitution_cipher
```

### Output

**References**

1. https://en.wikipedia.org/wiki/Substitution_cipher

Examples

The Examples section supports [doctests](https://docs.python.org/3/library/doctest.html). The processor renders doctests in code blocks and other content as plain text. The processor considers the start of a doctest to be marked by a line beginning with `>>>` and the end of a doctest to be marked by a blank line. If multiple doctests are present, they are rendered in separate code blocks.

### Input

```
Examples
--------
>>> decode("Qba'g nfx fghcvq dhrfgvbaf!")
"Don't ask stupid questions!"

This is a super simple function so I don't really know why you'd need more than one
example but here's another one anyway.

>>> decode("Gunax lbh xvaqyl sbe lbhe nggragvba!")
"Thank you kindly for your attention!"
```

### Output

**Examples**

```python
>>> decode("Qba'g nfx fghcvq dhrfgvbaf!")
"Don't ask stupid questions!"
```

This is a super simple function so I don't really know why you'd need more than one example but here's another one anyway.

```python
>>> decode("Gunax lbh xvaqyl sbe lbhe nggragvba!")
"Thank you kindly for your attention!"
```

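The doctest rule described for the Examples section (a doctest starts at a line beginning with `>>>` and ends at the next blank line) could be sketched roughly as follows. This is only an illustration of the rule as stated, not the code from this PR; the function name `render_examples` and the `FENCE` constant are made up for the example.

```python
from typing import List

# Build the Markdown fence programmatically so this snippet can itself
# be embedded in a fenced block without confusing a Markdown parser.
FENCE = "`" * 3


def render_examples(lines: List[str]) -> str:
    """Wrap each doctest in its own python code block (hypothetical helper).

    A doctest starts at a line beginning with '>>>' and ends at the next
    blank line; everything else is passed through as plain text.
    """
    out: List[str] = []
    in_doctest = False
    for line in lines:
        stripped = line.strip()
        if not in_doctest and stripped.startswith(">>>"):
            out.append(FENCE + "python")  # open a new code block
            in_doctest = True
        elif in_doctest and not stripped:
            out.append(FENCE)  # a blank line closes the current block
            in_doctest = False
        out.append(line)
    if in_doctest:
        out.append(FENCE)  # close a doctest that runs to the end of the section
    return "\n".join(out)


section = [
    '>>> decode("Qba\'g nfx fghcvq dhrfgvbaf!")',
    '"Don\'t ask stupid questions!"',
    "",
    "Here's another one anyway.",
    "",
    '>>> decode("Gunax lbh xvaqyl sbe lbhe nggragvba!")',
    '"Thank you kindly for your attention!"',
]
rendered = render_examples(section)
print(rendered)
```

Because a blank line always closes the current block, two doctests separated by prose end up in separate code blocks, which matches the behaviour shown in the output above.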
NiklasRosenstein commented 1 year ago

Hey @celsiusnarhwal, thanks for this great PR! I'll be able to take a closer look at it next week.

NiklasRosenstein commented 1 year ago

Hey @celsiusnarhwal, sorry for the silence. I'm finally finding some time again to look at your PR.

I've made some minor adjustments, and I'd almost be happy to merge it as it is now! The only problem is that two unit tests are failing: the NumpyProcessor identifies the examples below as being in the NumPy docstring format when in reality they're not, and as a consequence they don't get processed.

E.g. for the test_pydocmd_processor test:

# Arguments
s (str): A string.
b (int): An int.

It spits the same back out. I've added some logging so we can tell which processor the SmartProcessor is delegating to:

INFO     pydoc_markdown.contrib.processors.smart:smart.py:92 Using `numpy` processor for Module `test` (detected)

NumpyProcessor.check_docstring_format() returns True if a docstring passes numpydoc's docstring validator without warnings or errors and False otherwise.

I'm also thinking that this, on the other hand, may be too restrictive. If I want to use the NumPy docstring format, I may still make mistakes, and I'd want the docstring to be identified as NumPy format even if it contains a minor formatting mistake. Getting a warning (although maybe not an exception) in this case would be desirable.

What do you think about checking for the presence of Numpy-doc-like sections (e.g. Raises\n-------) in the content of the docstring instead?
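To make the suggestion concrete, here is a rough sketch of what section-based detection might look like. It is not code from this PR: the function name `looks_like_numpy`, the `NUMPY_SECTIONS` tuple, and the choice to warn on a mismatched underline are all assumptions made for illustration, and the section list is taken from the numpydoc style guide rather than from the processor.

```python
import re
import warnings

# Section headings from the numpydoc style guide (assumed list, not the PR's).
NUMPY_SECTIONS = (
    "Parameters", "Other Parameters", "Attributes", "Methods", "Receives",
    "Returns", "Yields", "Raises", "Warns", "Warnings", "See Also",
    "Notes", "References", "Examples",
)

# A known heading on its own line, followed by a line of at least three dashes.
_SECTION_RE = re.compile(
    r"^[ \t]*(?P<name>" + "|".join(re.escape(s) for s in NUMPY_SECTIONS) + r")[ \t]*\n"
    r"[ \t]*(?P<rule>-{3,})[ \t]*$",
    re.MULTILINE,
)


def looks_like_numpy(docstring: str) -> bool:
    """Return True if the docstring contains at least one NumPy-style section.

    Minor mistakes (here: an underline whose length differs from the heading)
    produce a warning instead of causing the format to be rejected.
    """
    found = False
    for match in _SECTION_RE.finditer(docstring):
        found = True
        name, rule = match.group("name"), match.group("rule")
        if len(name) != len(rule):
            warnings.warn(f"underline of section {name!r} has {len(rule)} dashes, expected {len(name)}")
    return found


print(looks_like_numpy("Decode.\n\nRaises\n------\nValueError\n    If bad.\n"))
print(looks_like_numpy("# Arguments\ns (str): A string.\nb (int): An int."))
```

With a check like this, the `# Arguments` docstring from the failing `test_pydocmd_processor` test would no longer be claimed by the NumPy processor, while a NumPy docstring with a slightly wrong underline would still be detected and only trigger a warning.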