pytorch / botorch

Bayesian optimization in PyTorch
https://botorch.org/
MIT License
3.11k stars 406 forks source link

Add pre-commit hook to stop linting errors being pushed. #2632

Closed CompRhys closed 6 days ago

CompRhys commented 1 week ago

Motivation

I am so used to relying on pre-commit that I have forgotten to run manual linting when making PRs to botorch. This adds a simple pre-commit config to automate linting.

Have you read the Contributing Guidelines on pull requests?

Yes

Test Plan

This PR should not affect anything. There are tools to run pre-commit in ci that maintainers could add to the package: https://pre-commit.ci/

Related Issues:

https://github.com/omnilib/ufmt/issues/251 The pre-commit hook only seems to work for v2.3.0 of ufmt. This issue tracks the error following from ruff_api.

Balandat commented 1 week ago

In general I am supportive of this.

The main question I have is how we can ensure that this stays in sync with the versions as defined in https://github.com/pytorch/botorch/blob/main/requirements-fmt.txt. Note that the versions in that file are synced with our internal linting tools, and so these may be updated without us actively doing those updates. If we hard-code versions of the tools in .pre-commit-config.yaml then we're going to end up with version inconsistencies that may be annoying to deal with (i.e. the pre-commit hook will format things in a way that is inconsistent with our internal linters). Is there a way to programmatically source the versions for the pre-commit hooks from therequirements-fmt.txt file?

CompRhys commented 1 week ago

So your concern is immediately apparent given this v2.3.0 vs v2.8.0 issue there were 3 lines of code which were changed by ufmt here that differ to running ufmt format . on v2.8.0 manually with the most recent version.

Is the SSOT for the internal tools requirements-fmt.txt or is that file produced based on some other process?

Balandat commented 1 week ago

Is the SSOT for the internal tools requirements-fmt.txt or is that file produced based on some other process?

So the SSOT for this lives in our internal repos; the way it works is that if changes are made there this automatically gets exported to the different github repos that we have for OSS projects. So for all intents and purposes, from the PoV of OSS versioning and pre-commit hooks, the versions defined in requirements-fmt.txt are the SOT.

CompRhys commented 1 week ago

Having had a think about this I think the best I can do is define a script called in the pre-commit that parses the file and the action file and then fails if the two do not match. This isn't automatic but it would mean that anyone using the pre-commit would be forced to fix the divergence in order to satisfy the hook with would give eventual consistency to the SSOT -- it would only significantly diverge if no-one used the pre-commit.

That said given the linked issue that would put this PR on ice until https://github.com/omnilib/ufmt/issues/251 is resolved.

Has the internal team considered ruff as a replacement to ufmt and flake8, single tool, highly configurable and I've had great experience with its CI/pre-commit ease of use.

Balandat commented 1 week ago

Having had a think about this I think the best I can do is define a script called in the pre-commit that parses the file and the action file and then fails if the two do not match. This isn't automatic but it would mean that anyone using the pre-commit would be forced to fix the divergence in order to satisfy the hook with would give eventual consistency to the SSOT -- it would only significantly diverge if no-one used the pre-commit.

I think this makes sense to me, seems like a reasonable approach and I don't have any better one off the top of my head...

Has the internal team considered ruff as a replacement to ufmt and flake8, single tool, highly configurable and I've had great experience with its CI/pre-commit ease of use.

There are ongoing discussions, yes, but making these kind of changes in a monorepo of our size is no small effort. We'll keep plugged in and see where this goes.

CompRhys commented 1 week ago

This now implements the proposed solution and it's easy to do the downstream fix and set ruff-api in the action I defined and so not actually blocked by https://github.com/omnilib/ufmt/issues/251.

amyreese commented 1 week ago

Is there a way to programmatically source the versions for the pre-commit hooks from the requirements-fmt.txt file?

No, and this is a known/intentional limitation of pre-commit — they don't offer any option beyond explicit package lists in additional_dependencies, and have no desire to support requirements files. See https://github.com/pre-commit/pre-commit/issues/730

CompRhys commented 1 week ago

Thanks this overall lgtm, just some nits.

The formatting changes here, are those things that snuck through previously on our end? It doesn't seem like there are any changes in terms of the versions?

Will get the nits and then double check it still works.

They were flake8 errors I got running flake globally with pre-commit run --all-files, the versions of flake aren't pinned so potentially a divergence issue but to me at least they all seemed to make sense, f"{thing=}" rather than f"{thing = }" was most common.

Balandat commented 1 week ago

Makes sense, let's just import it and see if the internal linter complains... Main thing would be that flake8 honors the settings in settings.cfg: https://github.com/pytorch/botorch/blob/main/setup.cfg#L1

CompRhys commented 1 week ago

Checked the edge cases manually and tested that it takes the flake8 settings from setup.cfg by setting line length to 60 and getting huge numbers of faults. Added a clarifying note that everything should be pinned with == for the logic to work.

I think this is good to go.

facebook-github-bot commented 1 week ago

@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 1 week ago

@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Balandat commented 1 week ago

Hmm @CompRhys this seems like this is need of another rebase :(

CompRhys commented 1 week ago

Hmm @CompRhys this seems like this is need of another rebase :(

I think this is actually a real error based on the behaviour of f-strings f"{xyz=}" (which flake8 wants) being different to f"{xyz = }" as it keeps the spaces.

(base) ➜  botorch git:(pre-commit) python
Python 3.12.7 | packaged by Anaconda, Inc. | (main, Oct  4 2024, 08:22:19) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a = "abc"
>>> f"{a=}"
"a='abc'"
>>> f"{a = }"
"a = 'abc'"
>>> 
Balandat commented 1 week ago

I wasn't referring to the error as much as that I wasn't able to re-import the PR.

Also, did you mean to close this?

CompRhys commented 1 week ago

force-pushing on reset/rebase closes PRs

facebook-github-bot commented 1 week ago

@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Balandat commented 1 week ago

So one thought I had to ensure that the hard-coded yaml versions stay in sync with the SOT in the requirements-fmt.txt: Can we add a simple unit test that validates that? That way we'd avoid that divergence upfront rather than breaking things for contributors. We have that validation logic already; ideally we could just factor that out and reuse the same code for both the validation and the unit tests - (though we'd probably have to do some hacking to import it in the script).

codecov[bot] commented 1 week ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 99.98%. Comparing base (5d37606) to head (228a02c). Report is 1 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #2632 +/- ## ======================================= Coverage 99.98% 99.98% ======================================= Files 196 196 Lines 17362 17365 +3 ======================================= + Hits 17360 17363 +3 Misses 2 2 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.


🚨 Try these New Features:

CompRhys commented 1 week ago

So one thought I had to ensure that the hard-coded yaml versions stay in sync with the SOT in the requirements-fmt.txt: Can we add a simple unit test that validates that? That way we'd avoid that divergence upfront rather than breaking things for contributors. We have that validation logic already; ideally we could just factor that out and reuse the same code for both the validation and the unit tests - (though we'd probably have to do some hacking to import it in the script).

could I change the lint CI to be this:

      - name: Run pre-commit
        run: pre-commit run --all-files --show-diff-on-failure

as then that would check in CI and also creates a SSOT connection between the pre-commit and the lint action, or is the lint action created by some internal tool? (This is how many projects I have worked on that give pre-commit hooks have their linting actions setup).

I think doing it with the unit tests is more messy but could do if needed

Balandat commented 1 week ago

could I change the lint CI to be this:

Yeah I guess that could work? Though that would require setting up pre-commit and installing the deps into a virtualenv on every lint run, right? Would we have to edit the lint workflow to do that or does that happen automatically on the first run in a virgin setup?

facebook-github-bot commented 1 week ago

@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 6 days ago

@Balandat merged this pull request in pytorch/botorch@3f2e2c7572ff41c9cfeeb6dea7a55776b238dc03.

Balandat commented 6 days ago

I realized we shold probably have updated the CONTRIBUTING readme - will do that in a follow up

Balandat commented 6 days ago

Ah nvm I forgot that you had already included that. Thanks a lot for the contribution, @CompRhys!

CompRhys commented 6 days ago

thanks for your patience, now to run it back for Ax