pyro-ppl / numpyro

Probabilistic programming with NumPy powered by JAX for autograd and JIT compilation to GPU/TPU/CPU.
https://num.pyro.ai
Apache License 2.0
2.15k stars 234 forks source link

Updat .pre-commit hook version #1795

Closed juanitorduz closed 4 months ago

juanitorduz commented 4 months ago

Update pre-commit hook versions:


Edit: This was the original scope of this PR. After https://github.com/pyro-ppl/numpyro/pull/1795#issuecomment-2106398777 we decided to unco these changes as we want to keep pre-commit usage optional :)

In this PR, I want to suggest a way to improve the development experience by adding https://pre-commit.ci/ as a unique source of truth for the project's linter.

Motivation

Currently, the CI (GitHub actions) uses the command make lint, which runs ruff's check. The problem is that it depends on the local version of ruff and is therefore ambiguous. See:

https://github.com/pyro-ppl/numpyro/blob/2b85765e1b963b299a233e53b703df3dbfdc247e/Makefile#L3-L6

Suggested Solution

In other projects I have contributed to, we have seen the benefit of having this manager by https://pre-commit.ci/ to:

  1. Make sure we have the package's specific versions (revisions). See https://github.com/s3alfisc/pyfixest/pull/375
  2. Have periodic automatic PR updates. See, for example, https://github.com/s3alfisc/pyfixest/pull/417

The only thing you (NumPyro devs) need to do is to add the app to the repo (one click as described in https://pre-commit.ci/)

If we merge this one, we could even remove the lint from GitHub actions (see https://github.com/s3alfisc/pyfixest/pull/397), and the linter would be managed independently. That is, remove

https://github.com/pyro-ppl/numpyro/blob/2b85765e1b963b299a233e53b703df3dbfdc247e/.github/workflows/ci.yml#L35-L37

fehiepsi commented 4 months ago

Hi @juanitorduz, just want to double check my understanding:

juanitorduz commented 4 months ago

Hey @fehiepsi, thanks for your feedback. Here are some comments:

  • this suggests that we should add pre-commit to the dev dependencies

Yes! I just added it ✅

  • when a user makes a commit, ruff will report and fix the lint issues automatically

Yes! The user has to simply do once pre-commit install and then every commit will be checked. One can skip this check by adding the -n option. For example git commit -m"my commit without check' -n

  • occasionally, pre-commit-ci will make PRs to update the ruff version

Yes. They look like https://github.com/s3alfisc/pyfixest/pull/417

  • what happens if the local ruff version is mismatched?

In principle, once you do pre-commit install there is a separate environment with the required packages and versions (revisions) to run the pre-commit hooks. If you want to have it in your environment as well (say, for the IDE like VSCode), you can pip install ruff with the version from the pre-commit config (this is the unique source of truth ! This is why it is recommended to avoid ambiguities). If two different versions are installed, the one specified in the pre-commit config will be the one to be applied.

  • what happens if the local version is up-to-date but the pre-commit is out-dated?

As mentioned above, the pre-commit environment will always be in sync with the version from the config. This is the version in which all the checks (locally and in CI) will be used. If you install it in your environment, one needs to install the version from the .pre-commit config (only source of truth)

  • make lint now does the fix on numpyro and tests folders, so its behavior is changed

The checks will be done on whatever is specified in

https://github.com/pyro-ppl/numpyro/blob/1d0cedbd7129d6ed27ca5beeb0333957a4b194c1/.pre-commit-config.yaml#L8

It's up to us to decide which folders to consider. At the moment is doing it in both (numpyro and test) and checks are 🟢 :)

The ruff configurations from the pre-commit hooks are consistently specified in https://github.com/pyro-ppl/numpyro/blob/master/pyproject.toml. This is why they are incorporated.

To test this PR (and see if this is something you want to have), I suggest you create a fresh Python environment and make a dummy change without installing ruff and see how everything is taken care of by the pre-commit hook. Of course, I am here to answer all questions (I might not be able to answer edge cases, but we can learn together :) )

fehiepsi commented 4 months ago

Thanks, @juanitorduz! Just a couple of thoughts:

juanitorduz commented 4 months ago

Thanks for the clarification. Here are some answers to your valid concerns:

  • since the introduction of ruff, we have made some release with failing lints because CI was green. ruff did not format and fix the codes properly because of wrong configurations (see above PRs). I'm not sure if this precommit change will introduce other wrong configurations.

I am sorry to hear! Feel free to open issues if you see this happening again and I will take a look at it. In my experience, having everything synced with pree-commit does help (this is just a personal opinion based on previous experiences)

  • the need to fix lint before committing is inconvenient in some cases. I would prefer committing temporary code before fixing lint issues (when the changes are large). Could we make this optional?

This makes sense when developing. As described before you can add the -n and the checks won't be done.

  • I guess we also need to check lint in other folders like examples, notebooks as well

Indeeed! Some projects do it and some projects do not. If you want to do it we can have a separate issue / PR to tackle this and atdd the examples folder to the ruff configurations.

  • typically, make lint is used to check for issues. make format will try to fix them, if possible.

In view of this observation, I moved the pre-commit to make format in https://github.com/pyro-ppl/numpyro/pull/1795/commits/78b77e9873b3e6f25a0c3ae72b9f27c7ffc0db34

@fehiepsi, think about the proposed changes, and if you are not convinced, it is totally ok (this was just a suggestion). We can close this one and come back once we are ready :)

fehiepsi commented 4 months ago

Thanks for your clarifications, @juanitorduz! I'm also learning. I think we will want to make this optional for contributors who prefer using precommit, rather than requiring it.

Regarding the configuration, the upstream doc seems to suggest that an entry for ruff format needs to be specified in the yaml file. In addition, currently our make lint and make format cover all folders so precommit is lacking some checks.

How about just enhancing precommit in this PR?

juanitorduz commented 4 months ago

Sure! Let's do it! As we want to keep pre-commit optional, I just left the update in the pr-commit-hooks version. I think is then better to leave the ruff local version and remove the automated PRs as this is not relevant for all users :)

juanitorduz commented 4 months ago

LGTM. It would be nice if you can update the precommit to include the id ruff-format and support for other folders like notebooks and examples. This will make precommit aligns with what make format does. But it is fine if you just want to use it to fix lint (not format) issues on numpyro and tests folders.

Added in https://github.com/pyro-ppl/numpyro/pull/1795/commits/015602844ce5ebf9edf41d269467f7d6c82b75ad :)

I also ran pre-commit run --all-files and everything is green except for the trailing spaces in https://github.com/pyro-ppl/numpyro/pull/1795/commits/2021016d524fbc98d2d81103007f7c6fb1d4da14 :)

juanitorduz commented 4 months ago

Thank you for your feedback and patience @fehiepsi 🙌