Add code formatting to CI

hancush commented 2 years ago

Initial ideas: Black for Python, Prettier for JavaScript.

Adding to CI means we can all develop locally as we like and the code is formatted in a central, common space.

Next steps: Add to existing project for evaluation. If we like it, add it to the project cookiecutters.

antidipyramid commented 2 years ago

There are (at least) two ways this can be implemented, both using Github Actions:

automatic code formatting
automatic code linting

The former would require running a code formatter using Github Actions after a developer makes a commit to the repo. The issue is that the Action would have to then execute a second commit that formats the just published code. Not only does this have the potential to double the number of commits in a repo, it would then require developers to pull the formatted code into their local repo after every push to Github.

The latter avoids these issues but would require Datamade as a team to decide on some standard conventions (single vs double quotes, spaces after parens, semicolons, etc.). Currently in EFI, we have a Github Action in place that runs flake8 on every PR made against the main branch. We could implement something similar for Javascript using eslint.

Another option would be to run linters using a pre-commit hook. That would not allow changes to even be committed (to any branch) without passing linting tests.

I'll try to get eslint working on EFI as a test for PRs first. But I do think that using pre-commit hooks could be interesting-- we wouldn't have to wait for Github to build and run the Docker containers to get a notice that our code failed the linting tests.

antidipyramid commented 2 years ago

I've been looking at tools for code formatting/consistency across our codebase. Git pre-commit hooks are a way to run code locally before a commit is made. Usually, pre-commit hooks are used to run code formatters (e.g. Prettier, black) and linters (e.g. flake8, eslint).

Let's say we start using pre-commits and you commit a change to a file:

If we've configured black to run, for example, and black formats your new code, the commit would be aborted, you'd get an error message, and your file would now reflect black's changes. You'd then re-stage the file, commit, black would run again and since there is nothing left to do, the commit would proceed as usual.
If we've configured linter(s) to run, you would simply get error messages that would prevent a commit from going through.

I definitely like running linters pre-commit. It's a bit irritating to have flake8 yell at you 10 minutes after you push a PR. I do also like the fact that formatters can be run only on changed code. But pre-commit formatters might conflict with (or duplicate) local formatting configs. Some people also find pre-commit linting/formatting to be annoying or invasive, which is valid.

Feel free to pull down this branch of EFI and play around with it. Mess up some js/python files and see what happens.

Pre-commit hooks must be set up locally so I added instructions in the README. It's just two terminal commands.

derekeder commented 1 year ago

We have set up pre-commit for the workplace-di project and will use it throughout our development.

PR with the configuration changes: https://github.com/datamade/workplace-di/pull/15
pre-commit config: https://github.com/datamade/workplace-di/blob/main/.pre-commit-config.yaml
instructions on using it: https://github.com/datamade/workplace-di#set-up-pre-commit

pre-commit is set up to run the following whenever you run git commit on the repo:

black - code formatter for python
flake8 - linter for python
prettier - code formatter for javascript
eslint - linter for javascript

black and prettier will automatically format your code, but it will abort the commit if it makes any changes. You can then review them and proceed with adding and committing again. The flake8 and eslint linters will also run, but do not fix any errors they raise. It will be up to the developer to fix these linting errors before the commit is allowed to be made.

Why use both a code formatter and linter?

I had the same question. The short answer is, though there is some overlap in what they do, linters (flake8 and eslint) check for things that the formatters (black, prettier) do not handle for you, like when you declare variables you don't use, or reference things that aren't defined.

Additionally, we decided to not run any formatters in GitHub actions because we don't want any changes to be made by the codebase that aren't overseen by someone. Formatters set up with GitHub actions will make your changes and then commit them separately for you, which can lead to confusion, superfluous commits in your history, and also require you to wait and pull down any changes before making future commits. Additionally, we already have good patterns for running flake8 and eslint in our projects. This pre-commit setup is intended to help catch those errors earlier in the commit cycle and speed up the feedback loop for formatting errors.

Next, deciding on our javascript conventions

black is pretty opinionated, so there's not much to configure. eslint on the other hand, has a lot of flexibility. In workplace-di we opted to use eslint:recommended, react:recommended and prettier but there are other options: https://betterprogramming.pub/comparing-the-top-three-style-guides-and-setting-them-up-with-eslint-98ea0d2fc5b7

We should decide what conventions we want to follow so we can set up our eslint configurations consistently across projects

derekeder commented 1 year ago

@smcalilly and @fgregg will set up Pre Commit for IL NWSS.

@hancush has set it up for the councilmatics

Report back at next R&D and see if we're ready to add this setup to cookie cutter.

smcalilly commented 1 year ago

i tried setting up based on the workplace-di repo and got this error:

git commit -m "add precommit hooks"             
[INFO] Initializing environment for https://github.com/pre-commit/pre-commit-hooks.
An unexpected error has occurred: CalledProcessError: command: ('/Library/Developer/CommandLineTools/usr/libe
xec/git-core/git', 'fetch', 'origin', '--tags')
return code: 128

expected return code: 0
stdout: (none)
stderr:
    fatal: unable to connect to github.com:
    github.com[0: 140.82.112.4]: errno=Connection refused

Check the log at /Users/sammcalilly/.cache/pre-commit/pre-commit.log

for some reason i can't use https addresses for the repo key in .pre-commit-config.yaml (from this in the docs). i changed to git@ and it worked (git@github.com:pre-commit/pre-commit-hooks.git instead of https://github.com/pre-commit/pre-commit-hooks). idk if that's something to do with my local network or system or what.

otherwise it's working good

derekeder commented 1 year ago

we're adopting this. it's included in our templates

datamade / how-to

Add code formatting to CI #247

Why use both a code formatter and linter?

Next, deciding on our javascript conventions