cfpb / back-end

:warning: THIS REPO IS DEPRECATED :warning: – Please visit:
https://github.com/cfpb/development
Creative Commons Zero v1.0 Universal
3 stars 3 forks source link

Python Style Guide #14

Open willbarton opened 8 years ago

willbarton commented 8 years ago

In the Dev COP discussion last week, cross-team code review was brought up as a nice-to-have, but often times non-team members don't necessarily know where it's appropriate to jump in. I think we can improve that one the back-end (and break down some of the barriers between our teams) with a Python style guide. This is also something that's come up in other conversations I've had over the past few weeks, and in a couple of code reviews I've participated in.

I know we generally try to follow PEP8, I know, but there are some idioms and conventions we might want to adopt for our own use that aren't covered by PEP8. Plus it's always nice to document the rationale we might have, as opposed to simply "because PEP8" — it opens it up to fully-informed reconsideration later.

An example that's more verbose than I'd anticipate us creating is Google's Python Style Guide. What's nice about Google's is it breaks down the rules into their pros and cons and decision-making.

There are three main concerns I'd have a style guide address, consistency, continuity, and readability.

I think this would also permit some more cross-team code reviews, because if we have a consistent style as a discipline, it means we can focus on more of the interesting bits of code, and less on variable naming, use of list comprehensions, that may vary from team to team, and than one of us from a different team might not be aware of.

I'm interested in what everyone thinks, if you have any favorite resources for Python style, or if you have strong feelings one way or the other about conventions or idioms we use, should be using, should not be using, etc. I would like to spend most of our meeting today discussing this.

Here are some resources I find helpful:

Ultimately consistency and continuity are the core of what this whole effort to create back-end standards is all about. To get that effort moving again, I want to go back down and start at the bottom, the way we're writing our code, and then we can go from there.

I also realize not all of us write Python, and perhaps as part of this we can come up with some language-agnostic conventions that would help across languages. I'd welcome the input of those of you working on non-Python projects here.

chosak commented 8 years ago

This is a great relevant talk by Raymond Hettinger from PyCon 2015:

https://www.youtube.com/watch?v=wf-BqAjZb8M

He shows lots of good examples of how code can technically comply with PEP8 but not be as "Pythonic" as possible.

willbarton commented 8 years ago

I love that his framing boils down to "don't commit atrocities".

higs4281 commented 8 years ago

I missed the gorilla

On Wed, Mar 9, 2016 at 11:44 AM, Will Barton notifications@github.com wrote:

I love that his framing boils down to "don't commit atrocities".

— Reply to this email directly or view it on GitHub https://github.com/cfpb/back-end/issues/14#issuecomment-194389739.

willbarton commented 8 years ago

I think there are a few basic steps to getting this underway:

  1. We need to assemble any examples of Python style that we currently try to check for/enforce.
    • eRegs uses flake8 on a couple of repositories. What checks are disabled?
    • There's a Django linter that invokes pylint. What checks are disabled there? (Pointed out by @higs4281)
    • What are some examples of that we would consider to be Good Code™ here at CFPB (Python, Scala, or any language, really)? (@chosak)
    • Does Clousseu have any checks that would be relevant from a stylistic point of view?
  2. With that knowledge, what are some of the idioms, conventions, patterns, etc we want to encourage and/or enforce? Which ones are covered by some existing document (PEP8, etc) that we can link to?
  3. We need to provide some level of automation for checking and authoring (excellent idea from @chosak).
    • Can we provide flake8 extensions for our rules and recommendations?
    • Can we provide editor configuration for vim, Sublime, etc, that will help author code in our prefered style?

@hillaryj has some experience creating a Python style guide.

@cfpb/back-end-team-admin, does anyone have anything else I'm missing?

So, I think the next step is to figure out some of the answers to the questions in 1. That will inform how we proceed.

willbarton commented 8 years ago

Also for reference, the front-end team's JavaScript Guide.

willbarton commented 8 years ago

I've been thinking a lot about this the past few weeks, and I think we need to start small. Let's start with editor configs, so we are all on the same page and we can generate some working artifacts from this effort right away. Then we can scale up to particular idioms/conventions in code we want to adhere to.

I'm going to propose that we follow PEP8 on identation and spaces-over-tabs. For maximum line numbers, I think we need something more sane than 79 chars. Does anyone have any particular favorite line lengths?

This gets us started with something, and we can continue to survey our existing landscape for particular examples of good style, style we want to discourage, etc.

richaagarwal commented 8 years ago

By more sane do you mean greater than or less than 79 chars? I think anything between 60 and 75 is good. But I also think this shouldn't apply to characters from indentation. I feel like the main benefit is readability, not really the reasons listed here ("Limiting the required editor window width makes it possible to have several files open side-by-side, and works well when using code review tools that present the two versions in adjacent columns") since we all have displays anyway.

chosak commented 8 years ago

I also prefer narrow lines (79 max) for readability's sake. Even though longer lines work on external displays, they're a real pain when you do disconnect and work directly from the laptop. 79 makes it easy to have two files open side by side on a normal laptop screen.

hillaryj commented 8 years ago

Why I prefer longer allowable char lengths is, for example, cases where you have a long string. I hate breaking up strings artificially just to hew to a line length, especially because indentation can easily chew up a lot of your line space in python. Probably 98% of my code is 80 characters or less, but my longest lines are almost exclusively strings with 12-20 characters of indentation and are typically up to 100 characters wide.

I support making a general line length guideline but also allowing for exceptions in cases that prevent, for example, chopping and hacking 12 characters off the end of a string just so it fits the line length.

richaagarwal commented 8 years ago

@hillaryj Yeah, as I mentioned in my comment, I think this limit shouldn't apply to indentation. I'm not sure if there is a tool that does that already, but that would be ideal.

willbarton commented 8 years ago

I think it's something we have to deal with — you could probably construct some custom text-width rules for vim that wouldn't take into account indentation, but it would likely be fairly fragile.

I have one other concerns though. Any maximum line width is intended to increase readability of the file by avoiding any weird text-wrapping that may happen as a result of not setting a line width or (if wrapping is disabled) side-scrolling. The width of my editor window doesn't change with each line, so if I have two lines, one with greater indentation but each with 80 chars of actual content, one will still either require wrapping or scrolling.

I think the width of windows (and ultimately screens when we're comparing files side-by-side) is going to dictate a maximum number of characters, spaces included, that can realistically be displayed on a line.

willbarton commented 8 years ago

I'd add, I have spent years with vim configured at 72 characters, so that's my personal normal.

I would also strongly encourage anyone to break whatever line-width rules we might establish before doing something obscene to a string or anything else. Readability is the whole point.

ascott1 commented 8 years ago

The front-end team uses 80 characters. Since hopping between front-end and back-end code isn't that unusual, it may be worth keeping them consistent (even if that means changing what the front-end team uses).

willbarton commented 8 years ago

I think consistency across all our development would be an excellent reason to choose 80.

rosskarchner commented 8 years ago

I like the idea of cross-team consistency, but not really all that opinionated otherwise. This seems like one of those things where it's more valuable to make a decision and stick with it, than worry too much about making the right decision.

willbarton commented 8 years ago

Alright, so let's adopt the front-end's line widths for consistency.

Next, I'd like to use line-widths and spaces as a starting point for editor configurations. The files can live in this repo (perhaps under editors/). I can start a CFPB vim config, but it'd be good to have others volunteer to start configs for their editors of choice (I know we have some Sublime users, not sure what else).

That gives us a starting point for other common config values we want to share across the discipline.

willbarton commented 8 years ago

Coming out of discussion in GHE/CFGOV/platform/issues/183, we have the following which can be made into an alias or added as a pre-commit hook to run flake8 only on files that have been changed in git:

flake8 $(git status -s | grep -E '\.py$' | cut -c 4-)