Add pre-commit hooks for static code analysis and code formattaing

MArpogaus commented 8 months ago

Hello @francois-rozet,

I hope you're doing well. I wanted to share my recent changes. To make our codebase more consistent and maintainable, I've added some automated code formatters and static code analysis tools using the pre-commit framework.

(Edit: my first draft used a couple of linters (isort,flake8) and formatters (black, autoflake) in parallel. this adds unnecessary complexity, hence i decided to switch to a more modern approach using ruff)

Here's a quick overview of each hook and their repository links:

ruff An extremely fast Python linter and code formatter, written in Rust.
pre-commit-hooks: These hooks provide a range of useful features, including checking for trailing whitespace, validating YAML and TOML files, detecting accidentally added large files, and identifying merge conflicts.
language-formatters-pre-commit-hooks: The hooks in this repository enable pretty formatting of YAML and TOML files, making them more readable and following best practices.

By incorporating these automated code formatters and static code analysis tools into the project, we can save time during code reviews by minimizing manual fixes and ensuring that everyone follows consistent coding standards.

I'm really excited about these enhancements and how they'll benefit our collaboration. Let's discuss these changes further! Let me know If you have any concerns or questions.

Best regards, Marcel

MArpogaus commented 8 months ago

I noticed that the project uses the older setup.cfg= configuration file, and I wanted to suggest migrating to pyproject.toml.

This offers several advantages. I would like to highlight the two most important benefits from my point of view:

Consolidated project metadata: pyproject.tom serves as a single source of truth for various project configuration options, including dependencies, build systems, and package metadata. This consolidation simplifies maintenance and reduces the change of inconsistent or outdated information across different files.
Future-proof choice: pyproject.toml is the official standard for build configuration in Python, as specified by PEP 517 and PEP 518. By adopting it, we align our project with the direction of the Python community, ensuring long-term compatibility and support.

Specifically the pyproject.toml file can include the configuration of the formatters and linters i use in this PR.

I did the migration from setup.cfg to pyproject.toml in a recent commit. Please let me know if you accept this change, otherwise we can easily rollback.

francois-rozet commented 8 months ago

Hello @MArpogaus, thanks for the PR. I agree that automatic linting/formatting would be great. We already recommend black in the contributing guidelines, but don't enforce it yet. I wanted to switch to ruff at some point as it combines linting and formatting and is FAST.

So when I read your initial comment (with isort, flake8, black, ...), I decided to test Ruff and configure it to comply with the current conventions (see dev). But I now realize that you changed your PR for ruff as well :sweat_smile:

Now concerning the other changes,

We used setup.cfg because pyproject.toml did not support the dynamic __version__ and requirements.txt. It seems like it does now, so I am happy to switch! We should do it before adding ruff actually.
Leaving the name of the package in setup.py is necessary. Otherwise, GitHub cannot find its dependents (packages that use Zuko).
I have mixed feelings about pre-commits. Automatic tests are great, but I prefer explicitly asking the user to fix the code rather than silently modifying it.
For YAML and TOML files, I don't know if there is a standard, but we follow GitHub examples and Python examples.

More generally, as mentioned in the contributing guidelines, if you want to modify something (fix bug, add feature), first create a feature request issue. This allows to separate the discussions related to the bug/feature, from the discussions related to the fix/implementation.

francois-rozet commented 8 months ago

I just pushed a few commits that a) switch to a pyproject.toml configuration and b) apply ruff to the code base with settings that best match the current conventions (to be discussed) and c) add a workflow to check linting and formatting issues.

MArpogaus commented 8 months ago

Hello @francois-rozet,

I am happy to hear, that my PR gave you the final confidence to make the switch to pyproject .toml.

I am very for ignoring your Contribution guidelines, i will respect them in following contributions!

I agree with your concerns about auto-correcting linting errors, but I personally like the idea of rejecting commits with erroneous or incorrectly formatted code. If this is still of interest, I could rebase on your last commit and change the pre-commit configuration, to just run in linting mode.

francois-rozet commented 8 months ago

If this is still of interest, I could rebase on your last commit and change the pre-commit configuration, to just run in linting mode.

Definitely better if the pre-commit only rejects without silent modifications. Still, would it be possible to make that an opt-in behavior? Also, do the hooks run locally before every commit or remotely on push?

MArpogaus commented 8 months ago

Hello @francois-rozet,

yes, pre-commit hooks are always opt-in and only run locally if you don't add a specific CI workflow. However, I decided to add such an workflow, in order to run the test automatically after each push. The corresponding workflow definition can be found here.

If developers decide to integrate them into thiere git workflow, they can simply install the pre-commit package and install the hook via pre-commit install. This way each commit triggers the execution of the hooks and rejects the commit if any of them fails.

I hope this explanation helps.

Just let me know if you wish to integrate it or not and i will rebase and adjust my changes accordingly :-)

francois-rozet commented 8 months ago

Ok let's try that!

probabilists / zuko

Add pre-commit hooks for static code analysis and code formattaing #35