Consistent whitespace between functions and classes

daniel-klein commented 10 months ago

Starsim code is inconsistent with whitespace following function. The styleguide does not currently offer advice. Is there a preference for one or two lines of whitespace to separate subsequent functions? Two lines of whitespace before a class definition or other major structure?

daniel-klein commented 10 months ago

Here's the PEP8 standard: https://peps.python.org/pep-0008/#blank-lines

daniel-klein commented 10 months ago

I recommend we just follow PEP8:

Surround top-level function and class definitions with two blank lines.
Method definitions inside a class are surrounded by a single blank line.

cliffckerr commented 10 months ago

The style guide does cover this point :grin: https://github.com/amath-idm/styleguide#35-blank-lines-gsg35

In my view, the style guide is consistent with the spirit of PEP8 (emphasis mine):

Extra blank lines may be used (sparingly) to separate groups of related functions. Blank lines may be omitted between a bunch of related one-liners (e.g. a set of dummy implementations).

Use blank lines in functions, sparingly, to indicate logical sections.

In a typical Python library (like requests), the vast majority of methods (around 80% from a quick scan) are only a few lines, so you can typically see multiple methods on the screen at once. Within methods, blank lines are indeed used "sparingly". Here, it would of course be overkill to use two blank lines between methods.

For Starsim, because we're manipulating higher-level things in more complex ways, we do not typically use blank lines within functions or methods "sparingly", and it's rare for multiple methods to fit within one screenful. One way I've seen this solved in other higher-level libraries like scikit-learn is to move long (100+ line) blocks of code to top-level functions (see e.g. here), and then just reference those functions from within short class methods.

I think that's a valid approach, though I personally prefer our current one. What I don't usually see in "gold standard" Python libraries is 100+-line methods with lots of blank lines within them, and then only separated by a single blank line. So I think we could keep our current long methods but with spacing as if they're top-level functions, or we could be more disciplined about keeping our methods very short (aiming for <20 lines, which plus a good docstring is about a screenful) and then only use a single line between them. What I wouldn't be in favor of keeping long methods but reducing the space between; it makes it much harder to visually parse where one method ends and the next one begins.

I agree that shorter methods are probably better in an ideal sense, but I feel all of us (@RomeshA and @pausz probably being exceptions :D) have been using this less-CSish, more-researchish style for a long time, and I think it'll be hard to switch. I'm not necessarily opposed to trying, but I think it would be a lot of work, will probably be applied inconsistently, and am not personally convinced the payoff is worth it! (I'd like to give a counterexample to this argument though, which is that when @RomeshA split up the very long cv.BaseVaccination.apply() method into multiple subcomponents, that significantly increased usability and readability. So it's definitely something we should at least be mindful of!)

cliffckerr commented 5 months ago

Closing for now.

starsimhub / styleguide

Consistent whitespace between functions and classes #9