Standardise and configure formatting, linting, and type checking

amritpurshotam commented 4 years ago

Currently in Baobab there is no standard formatting applied to the project which has resulted in each developer applying their personal preferences, their IDE defaults or what they configured themselves. This results in a code base that's harder to read and makes PRs diffs unnecessarily larger than they need to be which distracts from the logical code changes that were made.

Additionally given that we have many volunteers that have never worked a large scale software development project before, it's also important that we add some guardrails to the project (with linters, type checkers, etc) to point out common stylistic and syntactical errors and even potentially insecure code usages. Even for the more experienced, this is still very important to have since it just makes development easier and saves time (extra important since people are volunteering).

The proposed stack to solve the above would be as follows

Formatters

black for Python prettier for JavaScript

Linters

flake8 for Python along with the below plugins

bugbear for additional sylistic warnings to flake8
- builtins to make sure builtins aren't used as arguments
- comprehensions to suggest better comprehensions
- fixme to flag temp dev notes like TODOs

eslint for JavaScript

Type Checkers

mypy for Python. This one would require us to start using type hints though. Not sure if this would add more or less complexity for a junior developer. flow for JavaScript

Security

bandit for Python to check for common security issues in the code

flake8-bandit to integrate with flake8 to centralise the warning messages

Now to automatically setup and apply the above to the project, I propose to apply these libraries at three levels, at the IDE, Git commit time, and finally on the CI build. I know it's a little bit of overkill :)

Level 1: IDE

This would arguably be the most important time to flag the warnings/suggestions as the IDE can give immediate feedback inline. So we would need to override whatever the person has setup and apply workspace/project level settings to the project (by committing them to the repo so no manual configuration necessary for the developer). The problem is everyone uses a different IDE so perhaps we can prioritise setup according to the most common IDEs or perhaps just maybe ask our current set of volunteers what they use. @SMHall94 do you think we can ask them or would it just annoy people? Based on what I've seen people use and what I've read I think setting up VS Code and PyCharm would cover most people.

Level 2: Git Hook

At commit time, the checks should be run again in case they changed their IDE settings or used something else entirely and if it fails the commit will be rejected with a message indicating the problems that need to be fixed (with many of the hooks even doing the fix for you). With pre-commit we should be able to run all of the above (I still need to double check the JS ones but I don't see why not) with some nice additonal hooks like checking for merge conflicts, large files, private keys, etc.

Level 3: Add checks to the CI build

Then finally again we can run the checks during our CI build (I've checked they should all be possible). If the developer didn't try to bypass the IDE and git hooks, then these should always pass which should hopefully reduce the number for review cycles required. On this note though, what does everyone think about switching to GitHub Actions? I've played around with it and I'm really liking it. And I think it would be nice to centralise everything within the GitHub ecosystem. @avishkar58 I saw you did start to setup something up in #618 and #655

Let me know what everyones thoughts are on this solution. I think @jaderabbit and @avishkar58 can you let me know your thoughts on the Python libraries and @avishkar58 and @KaleabTessera on the JS ones

Also I think waiting until we move to Python 3 (especially the mypy checks) would be best so dependent on #454

avishkar58 commented 4 years ago

Thanks for describing this so comprehensively! I did start adding linting and formatting but ran into issues with python2.7 support, so decided to wait until we upgrade

On Wed, 15 Apr 2020, 12:01 amritpurshotam, notifications@github.com wrote:

Currently in Baobab there is no standard formatting applied to the project which has resulted in each developer applying their personal preferences, their IDE defaults or what they configured themselves. This results in a code base that's harder to read and makes PRs diffs unnecessarily larger than they need to be which distracts from the logical code changes that were made.

Additionally given that we have many volunteers that have never worked a large scale software development project before, it's also important that we add some guardrails to the project (with linters, type checkers, etc) to point out common stylistic and syntactical errors and even potentially insecure code usages. Even for the more experienced, this is still very important to have since it just makes development easier and saves time (extra important since people are volunteering).

The proposed stack to solve the above would be as follows Formatters

black https://black.readthedocs.io/en/stable/ for Python prettier https://github.com/prettier/prettier for JavaScript Linters

flake8 https://flake8.pycqa.org/en/latest/ for Python along with the below plugins

bugbear https://github.com/PyCQA/flake8-bugbear for additional sylistic warnings to flake8

builtins https://github.com/gforcada/flake8-builtins to make sure builtins aren't used as arguments

comprehensions https://github.com/adamchainz/flake8-comprehensions to suggest better comprehensions

fxime https://github.com/tommilligan/flake8-fixme to flag temp dev notes like TODOs

eslint https://github.com/eslint/eslint for JavaScript Type Checkers

mypy https://github.com/python/mypy for Python. This one would require us to start using type hints though. Not sure if this would add more or less complexity for a junior developer. flow https://github.com/facebook/flow for JavaScript Security

bandit https://github.com/PyCQA/bandit for Python to check for common security issues in the code

flake8-bandit https://github.com/tylerwince/flake8-bandit to integrate with flake8 to centralise the warning messages

Now to automatically setup and apply the above to the project, I propose to apply these libraries at three levels, at the IDE, Git commit time, and finally on the CI build. I know it's a little bit of overkill :) Level 1: IDE

This would arguably be the most important time to flag the warnings/suggestions as the IDE can give immediate feedback inline. So we would need to override whatever the person has setup and apply workspace/project level settings to the project. The problem is everyone uses a different IDE so perhaps we can prioritise setup according to the most common IDEs https://insights.stackoverflow.com/survey/2019#technology-_-most-popular-development-environments or perhaps just maybe ask our current set of volunteers what they use. @SMHall94 https://github.com/SMHall94 do you think we can ask them or would it just annoy people? Level 2: Git Hook

At commit time, the checks should be run again in case they changed their IDE settings or used something else entirely. With pre-commit https://pre-commit.com/ we should be able to run all of the above (I still need to double check the JS ones but I don't see why not) with some nice additonal hooks https://github.com/pre-commit/pre-commit-hooks like checking for merge conflicts, large files, private keys, etc. Level 3: Add checks to the CI build

Then finally again we can run all of the above as checks during our CI build (I've checked they should all be possible). If they didn't try to bypass the IDE and git hooks, then these should always pass. On this note though, what does everyone think about switching to GitHub Actions? I've played around with it and I'm really liking it. And I think it would be nice to centralise everything within the GitHub ecosystem. @avishkar58 https://github.com/avishkar58 I saw you did start to setup something up in #618 https://github.com/deep-learning-indaba/Baobab/pull/618 and #655 https://github.com/deep-learning-indaba/Baobab/pull/655

Let me know what everyones thoughts are on this solution. I think @jaderabbit https://github.com/jaderabbit and @avishkar58 https://github.com/avishkar58 can you let me know your thoughts on the Python libraries and @avishkar58 https://github.com/avishkar58 and @KaleabTessera https://github.com/KaleabTessera on the JS ones

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/deep-learning-indaba/Baobab/issues/664, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFV53AFZLTGR5SGWGOMJ5LRMWHYPANCNFSM4MIPHRGA .

smhall97 commented 4 years ago

This is excellent! To answer your question, @amritpurshotam I don't think people will be annoyed. It might be worth sending out a "contributor survey" - we can have people indicate their preferred IDE and then use it as a tally. We can mix the question in with other feedback questions we might find useful (such as feedback on our documentation etc. - similar to the Hackathon feedback form)

I will start drafting some questions and share them for review and further brainstorming.

avishkar58 commented 4 years ago

To reduce the burden of supporting multiple configurations we could add a recommended IDE to our docs (Visual Studio Code?) and start with configuration files for that, and tell people if they want to use another IDE they should try to configure it similarly

amritpurshotam commented 4 years ago

Okay sure that sounds great @SMHall94 !

Yeah I think Visual Studio Code too unless anyone else feels differently. I think I've pretty much figured those settings out for VS Code so that should be good to go. I just need to figure out some details around Docker but don't think that should be too complicated. And if we're settling on a recommended IDE, I think it would be useful to also include a selection of extensions to use with it (~these will obviously have to be in a separate section for manual setup for the person~ seems we can commit them to the repo). For example I use the following

Python (obviously :P)
GitLens
Docker
autoDocstring (we'll need to settle on a format here then)
Bracket Pair Colorizer (optional)

@avishkar58 Do you have any suggestions for extensions that would be useful for the JavaScript/ReactJS?

smhall97 commented 4 years ago

I started a document in case we decide to go through with the contributor feedback form :)

amritpurshotam commented 3 years ago

Should add standardisation of line endings as well to the project. See commits in #886 where line endings had to be manually updated to make sure the diffs were correct.

KaleabTessera commented 3 years ago

@smhall97 @amritpurshotam You two okay if I give this a go?

amritpurshotam commented 3 years ago

@KaleabTessera sure it's fine with me

smhall97 commented 3 years ago

@KaleabTessera That sounds good to me!

deep-learning-indaba / Baobab