python / cpython

The Python programming language
https://www.python.org
Other
62.68k stars 30.06k forks source link

Update PEP 8 to encourage modern conventions #62672

Closed ncoghlan closed 11 years ago

ncoghlan commented 11 years ago
BPO 18472
Nosy @malemburg, @gvanrossum, @warsaw, @rhettinger, @terryjreedy, @ncoghlan, @ezio-melotti, @bitdancer, @voidspace, @florentx, @serhiy-storchaka
Files
  • issue18472_pep_8_update4.diff: General modernisation of PEP 8
  • issue18472_pep_8_update5.diff: Version to be committed to the peps repo
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = 'https://github.com/ncoghlan' closed_at = created_at = labels = ['type-feature'] title = 'Update PEP 8 to encourage modern conventions' updated_at = user = 'https://github.com/ncoghlan' ``` bugs.python.org fields: ```python activity = actor = 'flox' assignee = 'ncoghlan' closed = True closed_date = closer = 'ncoghlan' components = [] creation = creator = 'ncoghlan' dependencies = [] files = ['31050', '31109'] hgrepos = [] issue_num = 18472 keywords = ['patch'] message_count = 39.0 messages = ['193159', '193162', '193163', '193164', '193195', '193201', '193204', '193778', '193780', '193782', '193785', '193808', '193809', '193834', '193838', '193839', '193842', '193843', '193847', '193905', '193906', '193907', '193908', '193909', '194054', '194056', '194063', '194070', '194078', '194086', '194089', '194099', '194144', '194171', '194173', '194178', '194487', '194505', '194506'] nosy_count = 12.0 nosy_names = ['lemburg', 'gvanrossum', 'barry', 'rhettinger', 'terry.reedy', 'ncoghlan', 'ezio.melotti', 'r.david.murray', 'michael.foord', 'flox', 'tshepang', 'serhiy.storchaka'] pr_nums = [] priority = 'normal' resolution = 'fixed' stage = 'resolved' status = 'closed' superseder = None type = 'enhancement' url = 'https://bugs.python.org/issue18472' versions = [] ```

    ncoghlan commented 11 years ago

    A home for a couple of proposed PEP-8 updates (as per the thread starting at http://mail.python.org/pipermail/python-dev/2013-July/127284.html)

    ncoghlan commented 11 years ago

    Attached diff has 5 changes:

    Some other oddities I noticed:

    ncoghlan commented 11 years ago

    Thomas Wouters made a couple of good points regarding absolute vs explicit relative imports, so I've improved the rationale in the PEP accordingly.

    ncoghlan commented 11 years ago

    Tweaked the wording in the introduction to cover removal of obsolete conventions along with adding new ones.

    terryjreedy commented 11 years ago

    Is the scope of this issue just what is in the title, the whole PEP, or something in between;-?

    For instance, Guido once approved (on pydev) adding an admonition to Programming Recommendations something like the following.

    The entry could point out that a separate statement that binds the function to a name negates the two reasons that justify using lambda, but that is probably unnecessary.

    This is from a discussion at least 2 years ago, perhaps 3 or more, before I would have been comfortable or perhaps even eligible to push a patch, and no one else picked up on it either.

    ncoghlan commented 11 years ago

    Sure, I think it makes sense to put that in. I'll be running the whole patch by python-dev, enumerating the reasons for each change before I commit anything.

    terryjreedy commented 11 years ago

    The public versus internal part of this issue is related to bpo-10434 Document the rules for "public names", which contains among other things a suggested rewrite for PEP-8.

    ncoghlan commented 11 years ago

    Latest version is a more general cleanup patch for PEP-8 (hence the change in the issue title). Enumerating all the changes/additions:

    1. Added an intro paragraph that makes it clear this is a living document, not "set and forget".

    2. Added a couple more explicit reasons to the "foolish consistency" section, and tweaked the header for that list to help make it clearer that there may be other good reasons, these are just especially good ones.

    3. Updated the tabs and spaces section for Python 3

    4. Changed the rationale for the line length limit. These days, it is about side-by-side editor windows, diff tools, and online text editing widgets, not ancient terminals with limited column widths.

    5. Cleaned up the encoding cookie guidelines for Python 3

    6. Changed the absolute vs explicit relative import discussion to state that while absolute imports are recommended, there are valid reasons to use explicit relative imports. Added an explicit admonition that the standard library should always absolute imports, and that *nobody* should use implicit relative imports.

    7. Added a guideline to avoid wildcard imports

    8. The new section on public vs internal interfaces

    9. Clarified that it's not just Jython that omits the += hack, and that it is fairly easy to break it in CPython, too.

    10. Added a guideline about using def rather than assigning a lambda to a name.

    11. Rewrote the class based exception guideline to account for the requirement that all exceptions inherit from BaseException, as well as the exception hierarchy design lessons we have learned that led to the creation of PEP-3151 (which explicitly *rejects* the per-module generic exception approach previously recommended by PEP-8)

    12. Clarified what a "non-error exception" might be used for.

    13. Added a guideline about exception chaining

    14. For the exception raising guideline, moved Python 3 compatibility up as the main motivating factor

    15. Added a guideline about using the "as" clause to bind a caught exception to a name

    16. Added a guideline explicitly favouring PEP-3151 based exception handling over errno introspection

    17. Added a note about the Python 3 text model changes to the isinstance guideline

    18. Added a recommendation that third party experiments with function annotations combine them with a decorator that indicates how they're to be interpreted

    ncoghlan commented 11 years ago

    Guido, if you have time to review these proposed PEP-8 changes, that would be great. Most should be unobjectionable, but I seem to recall you disliking the recommendation for third party function annotation experiments to use an explicit decorator, so I'm happy to drop that one if you don't like it.

    I'm also willing to send it back to python-dev at large for further feedback, but I'm not sure that's necessary (the two big changes are the ones about public APIs and the import recommendations, and those have had discussion threads already)

    malemburg commented 11 years ago
    1. Added an intro paragraph that makes it clear this is a living document, not "set and forget".

    If we want to make this a living document, I think the PEP will have to receive its own internal version number and a history section at the end explaining the changes (basically what you just summarized for the patch).

    Doing so is important, since companies and tools often reference PEP-8 in internal coding guidelines and any changes need to be easily identifiable for readers of those guidelines.

    gvanrossum commented 11 years ago

    This is mostly fine.

    FWIW I disagree with MAL's assertion that we need to start adding internal versioning; people who "lawyer" about which version of the PEP should apply are focused on the wrong thing entirely. IIUC the occasional Python-3-specific rule is already flagged as such. Also for real nitpickers there's the Hg revision. :-)

    My nits on the diff:

    rhettinger commented 11 years ago

    +1 Perhaps also mention that Python 3 now raises TabError for inconsistent tabs and spaces.

    +1 I've seen too many atrocities committed by people trying to scrunch code into 80 columns (weird line splits, over-abbreviated variable names, etc).

    +1 for UTF-8 as the default preference for everything in the Python world.

    +1

    I'm afraid even a mild admonition would lead of over zealous rejection of legitimate use cases for exceptions.

    For uses cases that actually need non-local control flow, there are worse things you can do than use exceptions (i.e. returning an error token up a chain of function calls requiring if-error logic at every step along the way).

    In particular, exceptions for control-flow are a good solution to some challenges that arise in larger programs (i.e. problems encountered in low level logic can only be handled by high level user interface logic, and the intermediate level business logic serves only as a pass through).

    It seems to me that the issues with StopIteration were mostly caused by having multiple control conditions all sharing the same exception (i.e. a function inside a for-loop can't use StopIteration to control an outer loop; the latter would need its own distinction exception).

    P.S. For those who are interested, the book "The Little MLer" does a great job showing how exceptions for control-flow can be used simplify recursive functions.

    ncoghlan commented 11 years ago

    The exception one is near-and-dear to my heart at the moment, as we're in the process of refactoring a large app that currently does various checks in the UI layer so it can present nice errors, when it should really be leaving those checks to the business logic layer and then throwing app specific exceptions that the UI layer understands and can present to the users.

    As Raymond noted, I think we need to be very cautious when it comes to exception handling, especially with Go advocates propagating all sorts of foolishness about the evils of exceptions and the notion that return codes are somehow now a superior approach :P

    However, I'm not sure we have a guideline about using deterministic resource management to better cope with unexpected exceptions - I'll look into that.

    terryjreedy commented 11 years ago

    Based on recent python-list posts, there is also 'break is bad' propaganda going around. Perhaps we should say that loop-and-a-half with break is preferred (or at least equally acceptable) to repeating code.

    while True:
        value = getvalue()
        if not value:
            break
        process(value)

    versus

    value = getvalue()
    while value
        process(value)
        value = getvalue()

    The latter has the problem that the getvalue code can get out of synch, and indeed, given the alternative of writing it once, it is not obvious that it should be the same in the two places.

    serhiy-storchaka commented 11 years ago

    Based on recent python-list posts, there is also 'break is bad' propaganda going around. Perhaps we should say that loop-and-a-half with break is preferred (or at least equally acceptable) to repeating code.

    Sometimes yet one alternative can be used instead:

    for value in iter(getvalue, stopvalue):
        process(value)
    gvanrossum commented 11 years ago

    OK, forget I said anything about exceptions. The language in Nick's patch is fine.

    Regarding "break is bad", while I agree with Terry's recommendation, I think saying anything about this would be overreaching the scope of a style guide. We shouldn't give style lawyers more ammo to start holy wars.

    I'm not sure I follow Nick's reference to "deterministic resource management to better cope with unexpected exceptions" -- is that an argument from the Go crowd? (Yawn.)

    ncoghlan commented 11 years ago

    "with statements are good", basically.

    rhettinger commented 11 years ago

    Nick, +1 from me if you want to make your edits and wrap this one up.

    gvanrossum commented 11 years ago

    Raymond Hettinger added the comment:

    Nick, +1 from me if you want to make your edits and wrap this one up.

    +1 from me too.

    warsaw commented 11 years ago

    Comments on the diff.

    -Two good reasons to break a particular rule: +Some especially good reasons to ignore a particular guideline:

    It's probably enough to just s/Two/Some/ - not sure the 'especially' emphasis is really needed.

    +3. Because the code in question predates the introduction of the rule and

    • there is no other reason to be modifying that code (especially if
    • conforming to the updated style guide risks breaking backwards
    • compatibility)

    I'd rather s/rules/recommendation/.

    Also, I'm not sure the parenthetical comment is necessary, but if you disagree, perhaps remove the parentheses and rephrase more assertively: "...that code. Don't break backward compatibility just to conform to the recommendations in this PEP."

    -second-most popular way is with tabs only. Code indented with a -mixture of tabs and spaces should be converted to using spaces -exclusively. When invoking the Python command line interpreter with +second-most popular way is with tabs only. In Python 3, these are the +only two permitted options (mixing tabs and spaces in the same file will +raise an error).

    I don't think the parenthetical comment adds much. They'll find this out soon enough.

    + +Code indented with a mixture of tabs and spaces should be converted to +using spaces exclusively.

    Perhaps more assertively (but perhaps controversial): "Spaces-only indentation is recommended."

    -There are still many devices around that are limited to 80 character -lines; plus, limiting windows to 80 characters makes it possible to -have several windows side-by-side. The default wrapping on such -devices disrupts the visual structure of the code, making it more -difficult to understand. Therefore, please limit all lines to a -maximum of 79 characters. For flowing long blocks of text (docstrings -or comments), limiting the length to 72 characters is recommended. +Limiting the required editor window width to 80 characters makes it possible +to have several files open side-by-side, and works well when using code +review tools that present the two versions in adjacent columns. + +The default wrapping in most tools disrupts the visual structure of the +code, making it more difficult to understand. Some web based tools may not +offer line wrapping at all. Therefore, please limit all lines to a maximum +of 79 characters. For flowing long blocks of text (docstrings or comments), +limiting the length to 72 characters is recommended.

    Because I've fielded some questions about this over the years, perhaps add a footnote explaining why 79 characters is recommended instead of 80. Even though the window might be 80 characters wide, some editors (e.g. Emacs) place a glyph in column 80 instead of the 80th character, so the text actually wraps after the 79th character.

    -- Relative imports for intra-package imports are highly discouraged.

    • Always use the absolute package path for all imports. Even now that
    • PEP-328 is fully implemented in Python 2.5, its style of explicit
    • relative imports is actively discouraged; absolute imports are more
    • portable and usually more readable. +- Absolute imports are recommended, as they are usually more readable and
    • tend to be better behaved (or at least give better error messages) if the
    • mport system is incorrectly configured (such as when a directory inside a

    s/mport/import/

    +- Wildcard imports (from <module> import *) should be avoided, as they

    • make it unclear which names are present in the namespace, confusing both
    • readers and many automated tools. There is one defensible use case for
    • a wildcard import, which is to republish an internal interface as part
    • of a public API (for example, overwriting a pure Python implementation of
    • an interface with the definitions from an optional accelerator module).

    Although I'll note that even in this case, it can be preferable to explicitly name the symbols being republished. import-* is useful if that list is too difficult to manage, e.g. it changes often, is dynamically created, or is just huge. Perhaps this should also reiterate the recommendation that intentionally republished symbols should be named in the masked module's __all__.

    [section on public and internal interfaces]

    Nicely written!

    • or a = a + b. Those statements run more slowly in Jython. In
    • performance sensitive parts of the library, the ''.join() form
    • should be used instead. This will ensure that concatenation occurs
    • in linear time across various implementations.
    • or a = a + b. This optimisation is fragile even in CPython (it only

    s/optimisation/optimization/ but I guess we don't need to argue about that particular bikeshed colour :).

    • The first form means that the name of the resulting function object is
    • specifically 'f' instead of the generic '\<lambda>'. This is more useful
    • for tracebacks and string representations in general. The use of the
    • assignment statement eliminates the sole benefit a lambda expression can
    • offer over an explicit def statement (i.e. that it can be embedded inside
    • a larger expression)

    I'm concerned that the PEP is getting too verbose with all the Talmudic detours. The more verbose the PEP gets, the less useful it is as a reference document. OTOH, explaining why the recommendations exist helps people to understand that it's not just a collection of arbitrary rules. Two thoughts come to mind as a possible solution:

    • Design exception hierarchies based on the distinctions that code
    • *catching* the exceptions is likely to need, rather than the locations
    • where the exceptions are raised. Aim to answer the question
    • "What went wrong?" programmatically, rather than only stating that
    • "A problem occurred" (see PEP-3151 for an example of this lesson being
    • learned for the builtin exception hierarchy)

    While I agree, this recommendation feels a little squishy to me. This is a perfect example of discussion I'd like to see removed from the main body of the PEP, but could be appropriate for an annotation or footnote.

    • It is highly recommended that third party experimants with annotations
    • use an associated decorator to indicate how the annotation should be
    • intepreted.

    s/experimants/experiments s/intepreted/interpreted

    warsaw commented 11 years ago

    On Jul 27, 2013, at 03:13 PM, Guido van Rossum wrote:

    • I think we should recommend against tabs outright. They are getting more
    • and more misunderstood.

    +1

    • Regarding line length, I think it is reasonable to mention that many
    • organizations are settling on 100 as a compromise. On newer laptops you
    • can still fit two terminal windows (with a reasonable font size) side by
    • side. (Also many people checking code into the stdlib ignore the 80 char
    • limit. :-( )

    Really? I haven't seen much 100 character limits. Have you seen this at Google or Dropbox?

    As for the stdlib checkins > 80 chars, yeah, that really annoys me!

    • Which reminds me. Do we have a recommendation yet to write except Exception: instead of except: If not, we should.

    PEP-8 current says this:

    A bare ``except:`` clause will catch SystemExit and KeyboardInterrupt exceptions, making it harder to interrupt a program with Control-C, and can disguise other problems. If you want to catch all exceptions that signal program errors, use ``except Exception:`` (bare except is equivalent to ``except BaseException:``).

    • I wouldn't mind seeing at least a mild recommendation against using exceptions for non-local control flow. The lessons of StopIteration suggest that such designs are often fraught with subtle problems.

    As with everything (and which IMHO does not need to be stated again), there are sometimes good reasons to stray from the recommendations in the PEP. So in general +1 for such a mild con recommendation. Those who understand why it's generally a bad idea will also understand when it's proper to stray.

    (The only argument against such things is of course the PEP-8 tool, which codifies the recommendations in this PEP. Some packages actually run the PEP-8 checker as part of their test suite, so if we change a recommendation and that gets pulled into PEP-8, then tests can break.)

    warsaw commented 11 years ago

    On Jul 28, 2013, at 04:59 AM, Raymond Hettinger wrote:

    • Regarding line length, I think it is reasonable to mention that many organizations are settling on 100 as a compromise. On newer laptops you can still fit two terminal windows (with a reasonable font size) side by side

    +1 I've seen too many atrocities committed by people trying to scrunch code into 80 columns (weird line splits, over-abbreviated variable names, etc).

    That's unfortunately. "Weird" is almost never needed. Long lines are a code smell IMHO.

    warsaw commented 11 years ago

    On Jul 28, 2013, at 09:48 PM, Nick Coghlan wrote:

    Nick Coghlan added the comment:

    "with statements are good", basically.

    with ExitStack() as stack is even better :)

    gvanrossum commented 11 years ago

    [Guido]

    >- Regarding line length, I think it is reasonable to mention that many >- organizations are settling on 100 as a compromise. On newer laptops you >- can still fit two terminal windows (with a reasonable font size) side by >- side. (Also many people checking code into the stdlib ignore the 80 char >- limit. :-( )

    [Barry]

    Really? I haven't seen much 100 character limits. Have you seen this at Google or Dropbox?

    At Google people argue to raise the 80 char limit forever -- and they use 2-space indents. For Java they've accepted 100 chars. At Dropbox they also use 100 chars (although nobody enforces style rules here, so many lines are even longer :-( ).

    (The only argument against such things is of course the PEP-8 tool, which codifies the recommendations in this PEP. Some packages actually run the PEP-8 checker as part of their test suite, so if we change a recommendation and that gets pulled into PEP-8, then tests can break.)

    I hate that tool. It only covers a small part of the PEP, and is very nitpicky about the part it does support, causing people to contort their code to shut up the tool instead of learning and understanding the PEP and applying common sense.

    ncoghlan commented 11 years ago

    Changes in version 5:

    I *didn't make any changes in relation to Barry's comment about having the commentary intermixed with the guidelines. I quite like the notion of stripping PEP-8 down to just the essentials and having PEP-108 as "The annotated PEP-8", but that's a bigger project than I'm prepared to tackle (heck, even the *current patch turned out to be a far more substantial update than I expected!).

    I'll commit this version - feel free to tweak further in the PEP repo if you spot any mistakes :)

    I deliberately left the following point out since Guido said "out of scope" above (I wrote it before noticing that):

        def itervalues(x):
            yield x.getvalue()
    
        for value in itervalues(obj):
            process(value)

    Yes::

        while True:
            value = obj.getvalue()
            if not value:
                break
            process(value)

    No::

        value = obj.getvalue()
        while value:
            process(value)
            value = obj.getvalue()
    ncoghlan commented 11 years ago

    For the record, the thread where Thomas Wouters provided the feedback that led to the current wording regarding imports: http://mail.python.org/pipermail/python-dev/2013-July/127373.html

    ncoghlan commented 11 years ago

    Heh, I just realised a correct generator definition would have needed the loop-and-a-half internally anyway:

        def itervalues(x):
            while True:
              value = x.getvalue()
              if not value:
                  break
              yield value
    warsaw commented 11 years ago

    On Aug 01, 2013, at 12:19 PM, Nick Coghlan wrote:

    I *didn't make any changes in relation to Barry's comment about having the commentary intermixed with the guidelines. I quite like the notion of stripping PEP-8 down to just the essentials and having PEP-108 as "The annotated PEP-8", but that's a bigger project than I'm prepared to tackle (heck, even the *current patch turned out to be a far more substantial update than I expected!).

    You've done great work on a substantial undertaking! Thanks. :)

    gvanrossum commented 11 years ago

    Thanks a bundle, Nick!

    gvanrossum commented 11 years ago

    I think the current wording about line length makes it too easy to adopt 99 instead of 79. I propose to go back to 79 only, and add this before the paragraph starting with "The preferred way of wrapping long lines ..."

    """ Some teams strongly prefer a longer line length. For code maintained exclusively or primarily by a team that can reach agreement on this issue, it is okay to increase the line nominal line length from 80 to 100 characters (effectively increasing the maximum length to 99 characters), provided that comments and docstrings are still wrapped at 72 characters. """

    gvanrossum commented 11 years ago

    See proposed patch at https://codereview.appspot.com/12269044/ .

    ccda776a-4736-4dbd-bf35-88e58f6952ba commented 11 years ago

    @Guido am glad for the patch; it's a nice compromise

    gvanrossum commented 11 years ago

    Patch applied.

    ncoghlan commented 11 years ago

    I like the new wording, but was there a missing "hg push" here? (Not seeing the update on hg.python.org/peps)

    ncoghlan commented 11 years ago

    I applied Guido's update in http://hg.python.org/peps/rev/82e24ac40255

    I'm not sure what happened with Guido's "Patch applied" above, but I wanted to include Terry's encoding fixes from bpo-18628, so I figured it made sense to ensure this was included first.

    gvanrossum commented 11 years ago

    Sorry, I forgot the push. I had to do a little merge dance and also fixed one typo (a redundant 'line') and now it's hopefully pushed.

    ezio-melotti commented 11 years ago

    I'm a bit late but I still have a few comments:

    + The paren-using form also means that when the exception arguments are + long or include string formatting, you don't need to use line + continuation characters thanks to the containing parentheses.

    This paragraph doesn't add much and could be removed IMHO.

    +- When binding caught exceptions to a name, prefer the explicit name + binding syntax added in Python 2.6:: + + try: + process_data() + except Exception as exc: + raise DataProcessingFailedError(str(exc))

    It took me a bit to realize that this is talking about "as". I think it would be better to be more explicit, and simplify the example a bit so that it's not as distracting.

    Is there any specific reason to use "sequence of integers" instead of "sequence of bytes"?

    terryjreedy commented 11 years ago

    Ezio 2. Explicitly say "use 'as' instead of ','". Someone who does not know that 'as' is available since 2.6 may not understand.

    1. 'byte' may be interpreted as 'char', especially for someone using 2.x 'bytes'.
    bitdancer commented 11 years ago

    Technically a bytes object is sequence of integers, not a sequence of bytes. That is, if you iterate it, you get integers. Python doesn't have a 'byte' type.