exercism / python

Exercism exercises in Python.
https://exercism.org/tracks/python
MIT License
1.87k stars, 1.26k forks

Docstring style/format is not consistent #2974

Closed IsaacG closed 2 years ago

IsaacG commented 2 years ago

The docstring param and return blocks vary from file to file. See:

» grep -hr -e ':return: [^ ]\+ ' -e ':param [^ ]\+:' | sed 's/^ *//' | sed 's/\(\([^ ]* \)\{5\}\).*/\1/' | sort -u
:param amount: Amount of seats
:param appetizers: list of appetizer
:param azara_record: tuple - a
:param budget: float - amount
:param budget: float - the
:param card: str - given
:param combined_record_group: tuple of tuples
:param coordinate: str - a
[...]
:return: int amount of prep
:return: integer count of student
:return: int - index at
:return: int - maximum value
:return: int - non-exchangeable value.
:return: int - number of
:return: int - number raised
:return: int remaining bake time
:return: int - the value
[...]

(Full output)

The predominant style appears to be:

:param <name>: <type> - <description>
:return: <type> - <description>

where the description begins with a lowercase letter and ends with a period.

Happy to send a PR if this is acceptable.

BethanyG commented 2 years ago

@IsaacG -- I think these are largely fine as they stand. As long as they generally follow this format:

def some_function(some_parameter):
    """Function purpose in the form of a command ending in a period.

    :param <name>: <type> - <description>
    :return: <type> - <description>

    Additional notes, as needed.  Especially clarifications for error messaging or edge cases.
    """
    ....

    return <some function return value>
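
For illustration only, here is a minimal sketch of the template above filled in. The function name, its parameters, and the error behaviour are hypothetical and are not taken from any exercise on the track; the descriptions start with a lowercase letter and end with a period, as observed in the first comment.

```python
def remaining_bake_time(elapsed_minutes, expected_minutes=40):
    """Calculate the bake time remaining.

    :param elapsed_minutes: int - minutes the dish has already been in the oven.
    :param expected_minutes: int - total minutes the recipe calls for.
    :return: int - remaining bake time in minutes.

    Raises a ValueError if more time has elapsed than the recipe expects.
    """
    # Hypothetical edge case, included to show where "additional notes" fit.
    if elapsed_minutes > expected_minutes:
        raise ValueError("elapsed time exceeds expected bake time")

    return expected_minutes - elapsed_minutes
```

Calling `help(remaining_bake_time)` prints the docstring exactly as written, which is one quick way to judge how readable the param and return lines are.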

TL;DR: I think it's too much to be rigid about the fine details, as long as we stay within what PEP257 and PEP8 describe and additionally have some consistency in how we mark out the :param and :return: fields.

As the intro to PEP257 says:

The aim of this PEP is to standardize the high-level structure of docstrings: what they should contain, and how to say it (without touching on any markup syntax within docstrings). The PEP contains conventions, not laws or syntax.

“A universal convention supplies all of maintainability, clarity, consistency, and a foundation for good programming habits too. What it doesn’t do is insist that you follow it against your will. That’s Python!” -- Tim Peters on comp.lang.python, 2001-06-16

I consider the - and the <description> optional (less optional in the docstrings of early concept exercises).

I do think that having an additional colon after the :param: or :return: <type> is hard to read, makes things confusing, and is not mentioned in any of the docstring formats I took a quick look at. I did see both single and double dashes used in both PEP8 & 257.

Not every concept exercise is going to have these docstring stubs. We'll probably not have them after the classes exercise - or they will be less verbose. Eventually, we'd like students to write their own. But until/unless we write specific exercises around a specific style or format of docstrings, I think we shouldn't be enforcing more than the general PEP257 format (with the demarcations for params and returns noted previously).

So I don't think a PR from you for concept exercises is needed at this time. @Metallifax has taken on reviewing docstrings for concept exercises as time allows (adding in summary sentences where needed), and can clean up the small set of cases where <type> has been omitted, or some other issue arises.

Metallifax commented 2 years ago

Hey @IsaacG, as @BethanyG said, I'll keep an eye out for missing types in the param/return lines when I have time to devote to the repository again, which should be pretty soon (recent midterms ate up a lot of my time). If there's a way to extend pydocstyle or another docstring linter to check for missing types, that'd be our best bet for uniformity from exercise to exercise (in my opinion). I just haven't found anything like that yet, and pydocstyle in its current configuration doesn't seem to care whether types are there or not. Thanks for pointing this out, though; I'll make a note of it for future PRs. Cheers.

BethanyG commented 2 years ago

@Metallifax - if/when either you or I have time, maybe we can take a look at some of the programs in the documentation generation space for Python. In particular:

Sphinx, pydoctor, pdoc, doxygen

Also: What the Python Tutorial Says

But also consumers of the generators: Docusaurus, Read the Docs, MkDocs

Sphinx-RTD style comes the closest (ish) right now to what we are doing - but that doesn't necessarily mean we want to follow it. For one, separating the type into its own line is quite hard to read, and the whole format suffers from an extreme excess of periods and colons. 😱

The intricacies of machine processing and auto-generation are fairly burdensome to someone who's learning to code in Python, and there is a high likelihood that the very next project or team they are on will require something different. Cases in point:

So it does feel like a sort of "losing battle" beyond enforcing a few points. That being said, I can update this issue with any tooling I find that may or may not be helpful.

IsaacG commented 2 years ago

Would recording an official preferred style be worth having?

BethanyG commented 2 years ago

Not at the moment. I think we're not quite there yet. Maybe after this last pass by @Metallifax, and some noodling on what we might want to cover in a docstring and doctest series of concept exercises.

But even then, it would be for track exemplar code and (selected) stub files. I am not about to go and require them on all submissions or mentor notes. And we will NOT be putting any in test files - that would really screw up the code that gets displayed on the website currently.

The overarching message I'd like to convey to students is that having docstrings that are useful to those reading the code later is a really good thing. And that generally, it is a really good idea to follow the conventions in PEP257. The earlier you have the habit, the easier it is. Like having unit tests, it shows you are a good developer.

Having them written so that documentation is automated is 🦄 ✨, and the hallmark of a stellar dev team -- but also (like working with Unicode) fraught with complication the further you dig.

IsaacG commented 2 years ago

Sorry if I wasn't clear. I definitely didn't mean to imply that we should be pushing a style for student submissions! I was thinking it'd be helpful for stubs, examples and exemplars to keep them consistent. Thanks!

BethanyG commented 2 years ago

Same answer. Not there yet.

Metallifax commented 2 years ago

@BethanyG Sorry if I left ya hanging for a minute, I just wanted to research the subjects here and devote some time to the topic.

if/when either you or I have time, maybe we can take a look at some of the programs in the documentation generation space for Python.

Will this be a part of a future exercise, or is this just so we can home in on our style eventually? Or, more exciting, a documentation site for the repository to make use of our newfound documentation skills 🤔?

The intricacies of machine processing and auto-generation are fairly burdensome to someone who's learning to code in Python, and there is a high likelihood that the very next project or team they are on will require something different.

Then if you'd like my nooby advice, I'd go with the fewest steps possible for the student, which pdoc does well. I was able to generate some nice-looking HTML files with a simple one-liner inside just a regular old exercise that a student would receive via the Exercism CLI:

pdoc --html . --output-dir=./docs

The problem is that pdoc only works with the Google and NumPy formats. While it supports reST directives, it seems to have some issues displaying the reST docstring format at the moment, and there has been an outstanding issue since 2020 where they've been implementing it.
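
For comparison with the `:param`/`:return:` format discussed earlier in the thread, here is a hypothetical function documented in Google style, one of the two formats mentioned above. The function and its parameters are invented for this sketch; the `Args:` and `Returns:` section names follow the Google Python Style Guide.

```python
def remaining_bake_time(elapsed_minutes, expected_minutes=40):
    """Calculate the bake time remaining.

    Args:
        elapsed_minutes (int): Minutes the dish has already been in the oven.
        expected_minutes (int): Total minutes the recipe calls for.

    Returns:
        int: Remaining bake time in minutes.
    """
    return expected_minutes - elapsed_minutes
```

The information is the same as in the reST-like format; only the layout changes, which is part of why a student's next project may well ask for something different.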

In the meantime, I'll just keep trucking along as you say and keep an eye on the issue while we figure this out and follow your format suggestion in your first reply. I'll also keep an eye out for tooling as well and update the thread with some good contenders.