psf / black

The uncompromising Python code formatter
https://black.readthedocs.io/en/stable/
MIT License
39.09k stars 2.47k forks source link

Black should have an opinion about doc strings #144

Closed Lukas0907 closed 4 years ago

Lukas0907 commented 6 years ago

Operating system: Ubuntu 16.04 Python version: 3.6.1 Black version: master Does also happen on master: yes

Hi,

currently Black doesn't seem to have an opinion about doc strings or rather where the quotes should go.

Black claims for this file for example that it is already formatted:

def test():
    """This is one possibility.

    Very important stuff here.
    """

def test2():
    """This is another one possibility.

    Very important stuff here."""

def test3():
    """
    This is another one possibility.

    Very important stuff here.
    """

def test4():
    """
    What about this?
    """

def test5():
    """What about this?

    Some people also like to have an empty line at the end of the doc string.

    """

The tool pydocstyle (in the spirit of PEP-0257) at least complains that the closing quotes should go on a separate line and if the docstring fits one line it should only span one line.

It would be nice if that could be incorporated in Black as well.

Thanks for the great work!

Lukas.

rinslow commented 6 years ago

I will suggest my opinion:

"""One-liner.

Continuation for the one-liner. (Optional)

Title: |4 SPACES|one line. might be really really really really really really really really
|8 SPACES| really long |4 SPACES| next line (something): something something

Note: |4 SPACES| this is just my opinion but obviously it's the best one """

And also: """One-liner documentation is closed on the same-line."""

ambv commented 6 years ago

So I came here to complain that docstrings are out of scope but reading your comment actually makes me think what you're asking for is reasonable :smile:

Specifically:

I'll think about possible negative effects some more before deciding whether to do this for sure but it looks like some opinions here might be harmless.

@carljm, what do you think?

carljm commented 6 years ago

I think it's reasonable for Black to be opinionated about this, and I think the two rules you named are the right ones to enforce. I would extend the second rule to also have an opinion about opening quotes; I slightly prefer both opening and closing quotes on their own line for a multi-line docstring, but wouldn't object to either variant.

I'm not sure why, but I feel like consistent docstring formatting actually has a huge impact on whether code feels consistently formatted. I guess because they tend to be visibly syntax-highlighted large blocks of text.

natecox commented 6 years ago

I tend to favor single line where applicable:

def somefunc():
    """This is my docstring"""

Multi-line I'd personally prefer the first line be on a new line (i.e., test3 above) but I wouldn't complain about the test format above either.

skapil commented 6 years ago

@ambv Not sure, if anyone is looking into it. Can I give a try to this?

ambv commented 6 years ago

Of course, go for it!

willingc commented 6 years ago

Agree with the approach @ambv and @carljm put forth. Like Carl, I like the longer docstrings to have """ on their own lines. I often put a blank line before the closing """ as it adds a bit of whitespace but it's totally fine not to have the blank line.

coxley commented 6 years ago

+1 to prefer single line if possible, otherwise both opening and closing quotes on their own lines. :)

sersorrel commented 6 years ago

Just to throw it out there, I personally prefer PEP 257 style, and I think that's the right thing to do if Black is aiming to follow PEP 8:

"""if it all fits on one line"""

"""if not: some title

more text
"""

but I've seen a wide variety of other styles in the wild, including these two which I don't think have been mentioned:

""" spaces either side """

"""two line one-liner
"""

I do kind of like having a blank line before the closing quotes, as it makes wrapping the body of the docstring much easier in Vim, though it's probably better to write a Vim plugin that fixes this in Vim rather than influencing Black's style.

carljm commented 6 years ago

@anowlcalledjosh PEP 257 is explicitly agnostic about the placement of opening quotes in a multi-line docstring. To quote: "The summary line may be on the same line as the opening quotes or on the next line."

So Black can choose either of these and be equally PEP 257 compliant:

"""
Summary line.

More lines.
"""

or

"""Summary line.

More lines.
"""

My (weak) preference for the former is based on allowing three more characters in the summary line before hitting a line-length limit.

Interestingly, on the question of blank line before closing quotes, PEP257 requires it on class docstrings but not function docstrings ("Insert a blank line after all docstrings (one-line or multi-line) that document a class") because the docstring should be spaced from the first method. Black also adds a blank line (outside the docstring) before the initial method, though, which PEP257 probably doesn't anticipate. (EDIT on second read, it's not entirely clear if the PEP means for this blank line to be inside or outside the quotes; if outside, then that matches what Black already will do.)

ambv commented 6 years ago

it's not entirely clear if the PEP means for this blank line to be inside or outside the quotes; if outside, then that matches what Black already will do.

From my read I thought I'm doing what the PEP is suggesting.

natecox commented 6 years ago

I'm moderately certain PEP is telling us to add a blank line after the closing quotes

lithammer commented 6 years ago

Worth thinking about as well is that the docstring format will have impact on __doc__. So for example this docstring:

"""
Summary line.

More lines.
"""

Will produce this __doc__:

'\n    Summary line\n\n    More lines.\n    '

While if you start the summary line immediately you don't get the \n at the start of the __doc__ attribute.

ambv commented 6 years ago

@renstrom, that used to bother me in the past when help() used to write out the contents of docstrings without calling lstrip() on it first. Now it's less of an issue and tools like PyCharm already correctly strip left whitespace, too.

brianv0 commented 6 years ago

Support was recently merged into pycodestyle to support a doc string line length which differs from that of the code line length in PyCQA/pycodestyle#674. It'd be nice to have some tooling and an opinion around that.

The primary use case is to comply with a 72 character docstring limit in pep8, but it's also immensely useful if you allow line lengths of 120 but just want to make sure your docstrings are below 80 so it plays nicely with pydoc.

sginn commented 6 years ago

In response to https://github.com/ambv/black/issues/144#issuecomment-387883874, The blank line should be removed if it's a docstring for a function, but retained for a class.

After running black on our codebase, Flake8 is complaining about D202 No blank lines allowed after function docstring.

def req_required(method=None):
    """Configure reqparse for the request."""

    def decorator(f):
        pass

Black is inserting that blank line between the docstring first line, but shouldn't, because req_required is not a class.

from PEP-257

Insert a blank line after all docstrings (one-line or multi-line) that document a class -- generally speaking, the class's methods are separated from each other by a single blank line, and the docstring needs to be offset from the first method by a blank line.

     ....
     Notes:
        ...
        - There's no blank line either before or after the docstring.```
ambv commented 6 years ago

Black differentiates inner functions from the rest of the body. If you had other instructions there. It wouldn't put empty lines.

dirn commented 6 years ago

This could be tricky when writing a decorator.

def decorator(f):
    """A decorator."""
    @wraps(f)
    def inner():
        return f()
    return inner

Black will add a line after the docstring, causing pydocstyle to report a D202. It sounds like the way to prevent this is to put something else between the two defs, but there isn't always anything to put there as the outer function often just returns the inner one. (I've tried both with and without wraps and the result is the same.)

auscompgeek commented 6 years ago

For those waiting on this, docformatter might interest you: https://github.com/myint/docformatter

TiemenSch commented 6 years ago

Regardless of which layout you choose, I think too many debatable cases will arise if Black would try to enforce a line-length in your docstrings' bodies, for example. Some examples concerning the body text:

Gets messy real quick :)

Perhaps there are too many docstring styles to enforce/output something more useful than the input in the bodies. Who knows whether someone is using Epydoc, Sphinx, Google or any other style.

So I think it's best for the "opinion" of Black to stick to the bare layout, like it is above (with an option to turn doc formatting off). And trailing whitespace could be trimmed without causing trouble AFAIK.

My preference would be one-liners and test() for the multi-line one.

bancek commented 5 years ago

I've implemented docstring indent fixing in my fork.

https://github.com/bancek/black/commit/0889797612d14d9dc5954df6a7de73076c7cb353

maxalbert commented 5 years ago

Agreed it would be great if black could enforce consistency of spacing around opening/closing quotes (I also agree with @TiemenSch that it's likely not a good idea to touch the contents of the docstrings themselves).

Just to note that imho readability should also be a consideration (if not the primary one?).

Personally I find docstrings the most readable if opening and closing quotes are on separate lines (as in test3) - even for single-line docstrings - because your eyes can immediately start parsing the text, rather than having to search for the end of the opening quotes first. But others' opinions may of course differ (and not sure whether there are any other considerations around editors and automated tools).

ndevenish commented 5 years ago

I've implemented docstring indent fixing in my fork.

bancek@0889797

Thanks for posting this, we just converted a large project from two-space indentation directly to black, and this stopped all the docstrings looking weird.

dimaqq commented 5 years ago

Cheers, @bancek this is exactly what annoyed me and why I came to report a bug :)

TL;DR it fixes the space in:

def foo():
    """First line.
[space]
    Second line.
    """

bancek's branch doesn't take opinion on where the opening/closing tripe-double-quote should be, or how long lines in the docstring could be.

dougthor42 commented 5 years ago

One thing that should be noted: the automatic formatting of docstrings needs to not break document converters (such as Sphinx) and their expected formats.

TiemenSch commented 5 years ago

Since Black promises to be uncompromising, I think the best bet for this feature is to have a docstring formatter package that is 'Black compatible' so to say. Black could probably expose some plugin functionality, but that would just be a stub then anyways?

ambv commented 4 years ago

Reformatting strings will be handled as part of #182, as solved by #1132. Blank line handling will be handled as part of #450.

Peque commented 4 years ago

@ambv Could you point out where the docstring issue is being or has been addressed? It seems those issues are about long strings and blank lines, not about docstring formatting.

ambv commented 4 years ago

You're right that what #182 describes is not enough.

ambv commented 4 years ago

Interesting bit from #745: format doctests in docstrings, too. It is an open question whether we can approach this and if yes, how. The rules for console-based Python are different. For example, we wouldn't format string reprs which use single quotes; there's >>> and ... prompts, and so on. But it's an interesting thing to decide.

@mattharrison even has a script that already uses Black for this purpose: https://gist.github.com/mattharrison/2a1a263597d80e99cf85e898b800ec32

Mattwmaster58 commented 4 years ago

I skimmed this whole issue and didn't seem to see any definitve decision on formatting docstrings. Has one been made? If not, could some sort of poll be made to check the general black-formatter population?

max-sixty commented 4 years ago

FYI this library from @keewis will format ~docstrings~ doctests in docstrings using black https://github.com/keewis/black-doctest

Peque commented 4 years ago

@max-sixty That seems to format code in doctest blocks only.

keewis commented 4 years ago

for a older, a little bit more mature tool, see https://github.com/asottile/blacken-docs. Instead of grouping lines and formatting expressions one at a time it uses regular expressions to parse and format continuous blocks of code. That means it might be faster, but also that adding doctest support is tricky.

I only found it after I had written my own tool which currently specializes in formatting doctest blocks (I might add support for other formats if I need them or if someone puts in a PR).

Note that neither will reformat docstrings, only code blocks in docstrings or documentation files.

Edit: black-doctest it has been moved to blackdoc

laike9m commented 4 years ago

Since long string support has been added, I'm looking forward to this one. Docstring support is the most needed feature for me.

rpkilby commented 4 years ago

I would extend the second rule to also have an opinion about opening quotes; I slightly prefer both opening and closing quotes on their own line for a multi-line docstring, but wouldn't object to either variant.

Google's docstring style places the subject on the same line as the opening quotes. Enforcing that the subject be on a separate line would make black incompatible with that docstring style.

dougthor42 commented 4 years ago

Enforcing that the subject be on a separate line would make black incompatible with [Google's] docstring style.

Is black already compatible with Google's general style guide? If not, then we shouldn't worry about that.

It seems like the general consensus is that opening and closing quotes should be on their own line for multi-line docstrings.

lithammer commented 4 years ago

It seems like the general consensus is that opening and closing quotes should be on their own line for multi-line docstrings.

I beg to differ. Even CPython itself is using this form (which I also prefer):

"""Lorem ipsum dolor sit amet."""

"""Lorem ipsum dolor sit amet.

Dignissim diam quis enim lobortis scelerisque fermentum dui. Eu turpis egestas
pretium aenean pharetra. Dictum non consectetur a erat nam at lectus.
"""
Peque commented 4 years ago

@lithammer CPython's docstrings are so ugly though... :stuck_out_tongue_winking_eye:

PEP-257 says both are correct for multi-line docstrings:

The summary line may be on the same line as the opening quotes or on the next line

See also how PEP-287 uses a different line for multi-line docstrings.

I personally prefer Google's style, but, even better, NumPy's. So I think this should be open for discussion.

Also, Black has already previously ignored "official" recommendations (like sticking to 79 characters), so who knows what this could end up being! :joy:

bfelbo commented 4 years ago

Have you considered the usage of docstrings when writing SQL queries and other long strings inside the code?

We use docstrings extensively for this purpose and would really prefer to have opening and closing quotes on their own line for multi-line docstrings. E.g.

"""
SELECT column1, column2
FROM table_name
WHERE cond1;
"""

is much preferred over

"""SELECT column1, column2
FROM table_name
WHERE cond1;
"""
Mattwmaster58 commented 4 years ago

Maybe there should be some sort of poll for this as many people are split? Thumbs up on this comment for quotes + subject on same line, ie:

"""subject
body
"""

And thumbs down for quotes + subject separated, ie:

"""
subject
body
"""
Eric-Arellano commented 4 years ago

With multiline strings that aren't docstrings, would \ make any difference in formatting? I usually use the style:

"""\
SELECT column1, column2
FROM table_name
WHERE cond1;
"""

If this is true, then you would have a mechanism for people to still have separated lines with multiline strings, but to still use the """Lorem ipsum dolor sit amet.""" style for docstring.

rpkilby commented 4 years ago

Per https://github.com/psf/black/issues/144#issuecomment-382552074, the second point was suggesting to enforce that closing quotes exist on their own line. Carl had separately suggested that opening quotes also be on their own line, and my point was that this would lock off at least one docstring style.

I don't think the general consensus has been to choose one or the other - I was just offering an argument against his stated minor preference. Ideally (in my mind), both would be viable styles. At best, black should enforce consistency but leave the heavy lifting to a docstring checker.

sersorrel commented 4 years ago

It's also probably worth thinking about whether this is solely about docstrings (as in, a string literal at the start of a function, whether triple-quoted or not), or about triple-quoted strings more generally – @bfelbo it seems unlikely that you're writing SQL queries in docstrings, for example.


def foo():
    """This is a docstring."""
    return textwrap.dedent("""This,
                              however,
                              isn't!""")
ndevenish commented 4 years ago

At best, black should enforce consistency but leave the heavy lifting to a docstring checker.

Right, I guess it depends how opinionated you want black to be over this issue? I'd expect it to handle things like

but anything more, and especially inner-structure, and especially wrapping feels a bit like overreach, and without a single decision-maker massively liable to descend into bikeshedding...

Excepting single-line docstrings, I tend to feel that inline/next-line looks good/bad depending on length and form of contents... maybe that is an argument for black to take that decision away.

bfelbo commented 4 years ago

It's also probably worth thinking about whether this is solely about docstrings (as in, a string literal at the start of a function, whether triple-quoted or not), or about triple-quoted strings more generally – @bfelbo it seems unlikely that you're writing SQL queries in docstrings, for example.

def foo():
    """This is a docstring."""
    return textwrap.dedent("""This,
                              however,
                              isn't!""")

True. We're writing them as triple-quoted strings, not docstrings. Nevertheless, it would be nice to have all triple-quoted strings follow the same style.

rpkilby commented 4 years ago

Right, I guess it depends how opinionated you want black to be over this issue?

Probably not much. Sphinx autodoc integration is powerful, and if black enforces a style that breaks your adopted docstring style, then this would require manually reformatting all your docstrings into a format compatible with these restrictions.

Nevertheless, it would be nice to have all triple-quoted strings follow the same style.

I would disagree with this. Docstrings and block strings serve two different purposes, and the former tend to have an explicit structure. e.g., with Google's docstring style, the first line is the summary, and the rest of the string the extended description.

"""Summary text.

Much longer description with arguments  and examples etc..
"""

In comparison, block strings can contain arbitrary content.

dougthor42 commented 4 years ago

[black modifying] anything more, and especially inner-structure, and especially wrapping feels a bit like overreach

if black enforces a style that breaks your adopted docstring style, then this would require manually reformatting all your docstrings into a format compatible with these restrictions.

100% agree. There are too many docstring parsers† for black to try to format docstring contents. The scope of what black does should be limited to things that do not break any of those parsers.

From what I can tell, the Sphinx parser for both NumPyDoc and Google styles (napoleon) accepts either form: "(summary on same line as opening quotes" / "summary on next line".

Thus I think it's safe to for black have an opinion on the "summary on same line" / "summary on next newline" debate. What that opinion is... well that's why we're talking :-)

† I intentionally did not say docstring styles. Neither NumPyDoc nor Google Style explicitly states that the summary must be on the same line as the opening quote††. Instead, they simply imply it via the examples. The examples in Google Style all have the summary on the same line as the quotes. The examples in NumPyDoc have instances of both.

†† If I'm wrong, please show me where it's stated - I couldn't find anything.


I tend to feel that inline/next-line looks good/bad depending on length and form of contents... maybe that is an argument for black to take that decision away.

Yes, black should definitely take away the choice.


We're writing [SQL queries] as triple-quoted strings, not docstrings. Nevertheless, it would be nice to have all triple-quoted strings follow the same style.

While I understand the wish, I think it's too dangerous to actually implement - formatting something other than a docstring can lead to broken programs.

bfelbo commented 4 years ago

We're writing [SQL queries] as triple-quoted strings, not docstrings. Nevertheless, it would be nice to have all triple-quoted strings follow the same style.

While I understand the wish, I think it's too dangerous to actually implement - formatting something other than a docstring can lead to broken programs.

Agreed. It definitely makes sense for black to only format docstrings.

It would, however, still be great for black to format docstrings in the same way that many developers format their other triple-quoted strings for e.g. SQL queries. At least based on my personal experience, such triple-quoted strings usually have opening and closing quotes on their own line.

dimaqq commented 4 years ago

https://www.python.org/dev/peps/pep-0257/ explicitly talks about docstring processing:

Once trimmed, these docstrings are equivalent:

def foo():
    """A multi-line
    docstring.
    """

def bar():
    """
    A multi-line
    docstring.
    """

Further thoughts: black is opinionated, there should not be any flags or arguments.

On the same note, I don't think black needs to align itself to any other style, whether opinionated (which to choose?) or parametrised (what flags?).

@rpkilby re: Google's style: who uses it? Does your Google internal codebase still use 2-space indentation instead of 4? Is the public style only for open-source projects?

My 2c: let @ambv decide and let's go boldly on!