Format docstrings to facilitate reading

joelostblom commented 4 years ago

Reformatting the docstrings could make them easier to read. This is how the lineplot docstring renders in jupyterlab:

It is very difficult to read the "Arguments" section, mainly due to that the argument is indented rather than its description and that newlines are inserted in the middle of words so that they are split over two lines.

The image below shows what is the standard for many packages. Here, it is easy to skim the "Parameters" section to find the parameter name of interested, and the chunk of text describing that parameter is clearly indented. Words are never split over multiple lines.

I realize that the docstrings in plotly are not hardcoded for each function, like in other packages, so I don't know if a formatting change is possible, but it would greatly improve the reading experience if it was possible to change the docstring indentation.

nicolaskruchten commented 4 years ago

Thanks for the feedback! I very much agree :)

Actually the docstrings for PX are auto-generated, so a fix might be pretty easy if you want to give it a shot... the relevant code is here: https://github.com/plotly/plotly.py/blob/master/packages/python/plotly/plotly/express/_doc.py#L376

joelostblom commented 4 years ago

Thanks for pointing me to the right direction! As you said the reformatting is pretty straightforward. Here are three suggestions I came up with, let me know what you prefer and I can create a PR.

Split the parameter name onto its own line. Although this is consistent with how the graph objects' docstrings are, there is an important difference: graph object don't have the parenthesis indicating type and default values, which does not look that neat having mixed into the text.
Add the parenthesized sentence on the same line as the parameter name. Since the parentheses are so long, this approaches loses the clear segmentation between parameters.
Only add the parameter type to the same line as the parameter name. The retains the clear parameter sections and formatting the parenthesis into a sentence makes it look tidier. This is consistent with the plotly.subplots.make_subplots docstring (and many other packages so readers would be familiar with it). This relies on docstrings being written the same way (which seems like a reasonable criteria) because I split on the : in the parenthesis (I still have to go through and see if there are any notable exceptions).

In both 2 and 3, I would remove the space after the parameter name, in front of the :. Spoiler, my favorite below:

My choice would be number...

3

nicolaskruchten commented 4 years ago

Thanks for the analysis! Option 3 is probably the best, I agree, although some rewriting will be necessary for the ones that accept multiple types (like x in your screenshots) to make it clear which part of the description applies to which type (i.e. if it's a string or int then it's a name or index into data_frame, otherwise it's used as-is). Also note that the current set of docstrings isn't all that consistently written (sorry!) so something more complicated might be needed than a split on colon. You may want to make the lists in the big dict be a bit more structured i.e. the first element is the one that's on the same line, and the rest aren't.

Thanks for being willing to contribute, it's much-appreciated :)

joelostblom commented 4 years ago

Sounds good, thanks for letting me contribute! I'll make a PR tonight with a suggestion including a minor rewrite of some docstrings and changed formats in the source.

Do you want to keep the parentheses in the source or can I remove them? I think I would ideally rewrite with the parameter types and default values in the first list item and explain them in the folling list items similar to what you indicated.

On Sat, Oct 19, 2019, 11:23 Nicolas Kruchten notifications@github.com wrote:

Thanks for the analysis! Option 3 is probably the best, I agree, although some rewriting will be necessary for the ones that accept multiple types (like x in your screenshots) to make it clear which part of the description applies to which type (i.e. if it's a string or int then it's a name or index into data_frame, otherwise it's used as-is). Also note that the current set of docstrings isn't all that consistently written (sorry!) so something more complicated might be needed than a split on colon. You may want to make the lists in the big dict be a bit more structured i.e. the first element is the one that's on the same line, and the rest aren't.

Thanks for being willing to contribute, it's much-appreciated :)

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/plotly/plotly_express/issues/154?email_source=notifications&email_token=ABCZJOOMSBRUCQHGUMLDD4LQPNGCVA5CNFSM4JCNZXVKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBXZODY#issuecomment-544184079, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABCZJOLI7QSOZZ3HMBTN4KLQPNGCVANCNFSM4JCNZXVA .

nicolaskruchten commented 4 years ago

I’m not that attached to the parentheses so it makes sense to get rid of them to me. It’s important to keep some clear indication of the default value, but that can go into the main body block.

Thanks again!

plotly / plotly_express

Format docstrings to facilitate reading #154