Closed CompRhys closed 6 days ago
In general I am supportive of this.
The main question I have is how we can ensure that this stays in sync with the versions as defined in https://github.com/pytorch/botorch/blob/main/requirements-fmt.txt. Note that the versions in that file are synced with our internal linting tools, and so these may be updated without us actively doing those updates. If we hard-code versions of the tools in .pre-commit-config.yaml
then we're going to end up with version inconsistencies that may be annoying to deal with (i.e. the pre-commit hook will format things in a way that is inconsistent with our internal linters). Is there a way to programmatically source the versions for the pre-commit hooks from therequirements-fmt.txt
file?
So your concern is immediately apparent given this v2.3.0 vs v2.8.0 issue there were 3 lines of code which were changed by ufmt
here that differ to running ufmt format .
on v2.8.0 manually with the most recent version.
Is the SSOT for the internal tools requirements-fmt.txt
or is that file produced based on some other process?
Is the SSOT for the internal tools requirements-fmt.txt or is that file produced based on some other process?
So the SSOT for this lives in our internal repos; the way it works is that if changes are made there this automatically gets exported to the different github repos that we have for OSS projects. So for all intents and purposes, from the PoV of OSS versioning and pre-commit hooks, the versions defined in requirements-fmt.txt
are the SOT.
Having had a think about this I think the best I can do is define a script called in the pre-commit that parses the file and the action file and then fails if the two do not match. This isn't automatic but it would mean that anyone using the pre-commit would be forced to fix the divergence in order to satisfy the hook with would give eventual consistency to the SSOT -- it would only significantly diverge if no-one used the pre-commit.
That said given the linked issue that would put this PR on ice until https://github.com/omnilib/ufmt/issues/251 is resolved.
Has the internal team considered ruff
as a replacement to ufmt
and flake8
, single tool, highly configurable and I've had great experience with its CI/pre-commit ease of use.
Having had a think about this I think the best I can do is define a script called in the pre-commit that parses the file and the action file and then fails if the two do not match. This isn't automatic but it would mean that anyone using the pre-commit would be forced to fix the divergence in order to satisfy the hook with would give eventual consistency to the SSOT -- it would only significantly diverge if no-one used the pre-commit.
I think this makes sense to me, seems like a reasonable approach and I don't have any better one off the top of my head...
Has the internal team considered ruff as a replacement to ufmt and flake8, single tool, highly configurable and I've had great experience with its CI/pre-commit ease of use.
There are ongoing discussions, yes, but making these kind of changes in a monorepo of our size is no small effort. We'll keep plugged in and see where this goes.
This now implements the proposed solution and it's easy to do the downstream fix and set ruff-api in the action I defined and so not actually blocked by https://github.com/omnilib/ufmt/issues/251.
Is there a way to programmatically source the versions for the pre-commit hooks from the requirements-fmt.txt file?
No, and this is a known/intentional limitation of pre-commit — they don't offer any option beyond explicit package lists in additional_dependencies
, and have no desire to support requirements files. See https://github.com/pre-commit/pre-commit/issues/730
Thanks this overall lgtm, just some nits.
The formatting changes here, are those things that snuck through previously on our end? It doesn't seem like there are any changes in terms of the versions?
Will get the nits and then double check it still works.
They were flake8 errors I got running flake globally with pre-commit run --all-files
, the versions of flake aren't pinned so potentially a divergence issue but to me at least they all seemed to make sense, f"{thing=}" rather than f"{thing = }" was most common.
Makes sense, let's just import it and see if the internal linter complains... Main thing would be that flake8 honors the settings in settings.cfg
: https://github.com/pytorch/botorch/blob/main/setup.cfg#L1
Checked the edge cases manually and tested that it takes the flake8 settings from setup.cfg
by setting line length to 60 and getting huge numbers of faults. Added a clarifying note that everything should be pinned with ==
for the logic to work.
I think this is good to go.
@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Hmm @CompRhys this seems like this is need of another rebase :(
Hmm @CompRhys this seems like this is need of another rebase :(
I think this is actually a real error based on the behaviour of f-strings f"{xyz=}" (which flake8 wants) being different to f"{xyz = }" as it keeps the spaces.
(base) ➜ botorch git:(pre-commit) python
Python 3.12.7 | packaged by Anaconda, Inc. | (main, Oct 4 2024, 08:22:19) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a = "abc"
>>> f"{a=}"
"a='abc'"
>>> f"{a = }"
"a = 'abc'"
>>>
I wasn't referring to the error as much as that I wasn't able to re-import the PR.
Also, did you mean to close this?
force-pushing on reset/rebase closes PRs
@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
So one thought I had to ensure that the hard-coded yaml versions stay in sync with the SOT in the requirements-fmt.txt
: Can we add a simple unit test that validates that? That way we'd avoid that divergence upfront rather than breaking things for contributors. We have that validation logic already; ideally we could just factor that out and reuse the same code for both the validation and the unit tests - (though we'd probably have to do some hacking to import it in the script).
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 99.98%. Comparing base (
5d37606
) to head (228a02c
). Report is 1 commits behind head on main.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
🚨 Try these New Features:
So one thought I had to ensure that the hard-coded yaml versions stay in sync with the SOT in the
requirements-fmt.txt
: Can we add a simple unit test that validates that? That way we'd avoid that divergence upfront rather than breaking things for contributors. We have that validation logic already; ideally we could just factor that out and reuse the same code for both the validation and the unit tests - (though we'd probably have to do some hacking to import it in the script).
could I change the lint CI to be this:
- name: Run pre-commit
run: pre-commit run --all-files --show-diff-on-failure
as then that would check in CI and also creates a SSOT connection between the pre-commit and the lint action, or is the lint action created by some internal tool? (This is how many projects I have worked on that give pre-commit hooks have their linting actions setup).
I think doing it with the unit tests is more messy but could do if needed
could I change the lint CI to be this:
Yeah I guess that could work? Though that would require setting up pre-commit
and installing the deps into a virtualenv on every lint run, right? Would we have to edit the lint workflow to do that or does that happen automatically on the first run in a virgin setup?
@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@Balandat merged this pull request in pytorch/botorch@3f2e2c7572ff41c9cfeeb6dea7a55776b238dc03.
I realized we shold probably have updated the CONTRIBUTING
readme - will do that in a follow up
Ah nvm I forgot that you had already included that. Thanks a lot for the contribution, @CompRhys!
thanks for your patience, now to run it back for Ax
Motivation
I am so used to relying on
pre-commit
that I have forgotten to run manual linting when making PRs tobotorch
. This adds a simplepre-commit
config to automate linting.Have you read the Contributing Guidelines on pull requests?
Yes
Test Plan
This PR should not affect anything. There are tools to run pre-commit in ci that maintainers could add to the package: https://pre-commit.ci/
Related Issues:
https://github.com/omnilib/ufmt/issues/251 The pre-commit hook only seems to work for v2.3.0 of ufmt. This issue tracks the error following from ruff_api.