openmc-dev / openmc

OpenMC Monte Carlo Code
https://docs.openmc.org
Other
769 stars 494 forks source link

coding style for Python #2002

Open shimwell opened 2 years ago

shimwell commented 2 years ago

Could we have a coding style for Python using in OpenMC such as Black?

We have a clang coding style for the C++ part of the code and a .clang_format as described in the docs here and here so this would be a similar situation for the Python side.

Currently a flake8 on the develop branch scan picks up 2635 pep8 typos

flake8 . --extend-exclude=vendor --show-source --statistics  --count

This could be brought down a bit if we run a code formatter on the code like this

black . --exclude=vendor

I notice sometimes we use " " and other times we use ' ' quotes so we could use the --skip-string-normalization to avoid changing the string quotes if preferred?

Then we could also add a github action to automatically keep the code formatted nicely in the future. This one will format the code and commit back to the feature branch and skip the CI (to avoid running it twice)

name: black

on:
  push:
    paths:
      - '**.py'

defaults:
  run:
    shell: bash

jobs:
  black:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          ref: ${{ github.head_ref }}
      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.x
      - name: Install black
        run: |
          python -m pip install --upgrade pip
          pip install black
      - name: Run black
        run: |
          black . --exclude vendor
      - uses: stefanzweifel/git-auto-commit-action@v4
        with:
          commit_message: "[skip ci] Apply formatting changes"

And finally we could add a pre-commit file name.pre-commit-config.yaml to allow developers to automatically format prior to committing

repos:
  - repo: https://github.com/psf/black
    rev: 22.1.0
    exclude: vendor
    hooks:
      - id: black

There is at least one part of the code that I would refactor as a comment before applying these sweeping changes as in the assembly example we use white space in the code to indicate location of fuel pins. I would make a comment with the pin layout and then let black format the code.

I would be happy to put in PR(s) for these changes or part of these changes if they are welcome?

gridley commented 2 years ago

Does strict adherence to PEP really impact readability that much? In my opinion, the names of variables, code comments, and code architecture influence code readability maybe an order of magnitude more. With C++ it's a bit difference since you have complete freedom to do wacko stuff with whitespace.

On the other hand, there are some basic aesthetic and practical consistency aspects such as using spaces rather than tabs, with four spaces per "tab", not putting spaces between the equal sign and kwargs, and not grossly exceeding 80 characters per line. These are things that stick out like a sore thumb if not adhered to in a PR. Other aspects of PEP such as the rules around how to indent multi-line arguments passed into a function are completely unnecessary and impact readability in no way IMO.

Just my $0.02!

shimwell commented 2 years ago

Yep, I agree that the code is nice and readable at the moment and that automating PEP on the code might not improve the readability significantly.

Black would do those basic aesthetic aspects that you mention such has conversion of tabs to spaces, remove spaces between equals signs and kwargs and can wrap lines above a specified length.

For me the important part is that all this would be done automatically without the reviewer having to waste time doing these corrections.

akoen commented 1 year ago

Agree with @shimwell here, black is the standard python style and drastically simplifies decisions about formatting.

shimwell commented 1 year ago

We implemented something similar on PyNe a while back which might be relevant because in that case we formatted a large amount of code.

First we made and merged a single PR that formatted the entire code base https://github.com/pyne/pyne/pull/1436

The we made a PR that helped keep the code formatted consistently for future PRs https://github.com/pyne/pyne/pull/1435 This 2nd PyNe PR has not actually been merged so I think we should take a different approach if we done something similar on openmc. We could run the format checker as a CI test much like we have a c++ format checker in the current CI suit.

One other change we could do this file by file and then protect each file that has been formatted with the CI tests instead of all at once. The first PyNe PR was 22k of line changes and this was perhaps a bit big.

I can make this PR just to show more clearly

shimwell commented 1 year ago

I have made an example PR with a python-format-checking CI that checks a file is well formatted.

The CI originally fails, then once the file is formatted with black then the CI passes.

https://github.com/openmc-dev/openmc/pull/2737

this allows us to bring files under the formatting rule one file at a time but perhaps is this overly cautious

what do people think

bam241 commented 1 year ago

digging up some old thread.

I would recommend black even if I don't really like the formatting results. The main interest of black in my opinion lies in the fact, that it don't check compliance and correct it, it formats the file. Consequently, there is an uniq output from black regardless of the initial indentation/formating of the file. (it appends with other python formater, that the resulting format depends on how a specific line had been initially spliced)

paulromano commented 1 year ago

Just wanted to make note that the popular ruff package now supports formatting.

shimwell commented 11 months ago

Just to mention I've been making use of darker to format just the parts of the file I've been changing prior to submitting PRs. It runs black on just the part of the code changed, partial formatting

MicahGale commented 7 months ago

@paulromano is suggesting a Rust utility?

So I have played around with this a little bit on my own branch. Here's my notes:

  1. Reformatting with black would reformat every file, and be a large diff.
  2. Ruff is very fast, but also does linting and currently has 150 errors in the codebase. These seem non-trivial to resolve.
  3. AFAIK black uses the same syntax trees that the interpreter does, so there's essentially no risk of things being broken by black.

My two cents:

  1. Rather than discussing what would be the perfect tool, let's have something that's good enough for now, and revisit later. So I have a branch that goes all-in on black but I have to debug some non-python issues prior to the PR.
  2. Long term I think Ruff is the direction to go, but that would be part of a larger linting and code cleanup project.
  3. I'm not a fan of darker because IMO there's no point in doing a piecemeal solution. I think formatting needs to be all or nothing.