pantsbuild / pants

The Pants Build System
https://www.pantsbuild.org
Apache License 2.0
3.32k stars 636 forks source link

Add exit-non-zero-on-fix support for fix and fmt goals #20905

Open njgrisafi opened 5 months ago

njgrisafi commented 5 months ago

Is your feature request related to a problem? Please describe. We want to use Pants for various linting/formatting checks, including in a pre-commit hook. One challenge with Pants in that workflow is when we run pants fix fmt lint and it applies autofixes it doesn't exit with 1. We need pre-commit to fail so the developer has a chance to include those fixes in the commit.

By contrast, ruff has a --exit-non-zero-on-fix flag which allows this to be differentiated.

Describe the solution you'd like Maybe some global setting like pants --exit-non-zero-on-fix fix fmt

Then when autoformating or autofixing occurs the exit code is non-zero failing notifying the caller there were some issues.

Describe alternatives you've considered Inspecting changed files and parsing output from pants. This is not ideal.

I guess we could do pants lint fix fmt as long as it runs lint first then the next goals. However, the output can be confusing for developers if some lint failures were autofixed.

huonw commented 5 months ago

Summary: there's a way we could implement this for specific goals that'd handle the problem "locally", but maybe there's a bigger picture way to handle this.


To implement this for both fix and fmt, I think the relevant part of the code is https://github.com/pantsbuild/pants/blob/3ef507eab8ac8cf6dd2a8a1d21a7bf6972687a08/src/python/pants/core/goals/fix.py#L370

We could, hypothetically:

  1. add a --exit-non-zero-on-changes option to each of those goals, e.g. exit_non_zero_on_changes = BoolOption(default=False, help=...) in https://github.com/pantsbuild/pants/blob/3ef507eab8ac8cf6dd2a8a1d21a7bf6972687a08/src/python/pants/core/goals/fix.py#L216
  2. pass that value into calls to fix.py's _do_fix in fmt.py and fix.py
  3. within _do_fix , check any(batched_result.did_change for batched_result in batched_results) and that option to decide whether to set exit_code=0 or exit_code=1 (for that particular code, one approach would be returning True/False from _write_files based on whether anything was written).

However, I'm not sure that's the perfect approach, for two reasons:

  1. there's some other goals that could have this apply: GenerateSnapshots, GoGenerateGoal, Tailor, UpdateBuildFilesGoal. I think all of these (optionally) write file back to the repository, and thus it's interesting to pick up when there's changes. The last two in particular are common ones.
  2. returning non-zero from any particular goal will stop later goals from running. This is fine for fix and fmt (pants fix will run the formatters automatically too), but would mean something like pants --exit-non-zero-on-changes tailor update-build-files fix :: might stop after tailor. https://github.com/pantsbuild/pants/blob/3ef507eab8ac8cf6dd2a8a1d21a7bf6972687a08/src/python/pants/init/engine_initializer.py#L155-L156

So, I think that suggests two options to me:

  1. Do manual/ad-hoc implementations of checking for changes within each goal's code (solving 1), and then adjust the run_goal_rules loop to not early-exit if this option is set (solving 2).
  2. Try to do this at a higher level: if a --exit-non-zero-on-changes option is specified, do compare Pants' tracked files at the start & end of the entire run: if there's changes, "upgrade" a zero exit code to a non-zero one. This is then entirely goal independent and works automatically for all goals ever (even custom ones). One way to implement this would be to capture a Snapshot of the whole buildroot at start up, and one at the end, and compare them.

@stuhood does that last option sound reasonable? If so, any tips for implementing?

benjyw commented 5 months ago

Don't all formatters and fixers already have a Pants-level "check but don't edit" mode, for CI? I thought @thejcannon implemented this a while ago.

thejcannon commented 5 months ago

Don't all formatters and fixers already have a Pants-level "check but don't edit" mode, for CI? I thought @thejcannon implemented this a while ago.

Yeah that's lint 😅

benjyw commented 5 months ago

So sounds like just running lint in CI will have the desired effect? It runs all fixers and formatters?

On Mon, May 13, 2024, 8:27 AM Josh Cannon @.***> wrote:

Don't all formatters and fixers already have a Pants-level "check but don't edit" mode, for CI? I thought @thejcannon https://github.com/thejcannon implemented this a while ago.

Yeah that's lint 😅

— Reply to this email directly, view it on GitHub https://github.com/pantsbuild/pants/issues/20905#issuecomment-2107998678, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAD5F7HSMYCYVE3ANCJ3P2TZCDLWRAVCNFSM6AAAAABHQ4W5OSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMBXHE4TQNRXHA . You are receiving this because you commented.Message ID: @.***>

njgrisafi commented 5 months ago

It doesn't offer the desired effect. On git hooks we want to apply all fixes and formatting but have a non-zero exit code.

This is so the developer can include the fixes prior to pushing to CI.

benjyw commented 5 months ago

lint should exit non-zero, if I understand correctly. Why would you need it to actually apply the fixes and formatting on files that immediately get thrown away when the CI container exits?

huonw commented 5 months ago

For my use of something similar, the desired workflow is:

  1. write some badly formatted code & forget to format
  2. git push...
  3. .git/hooks/pre-push hook runs and fails (i.e. block the push), but still writes the changed files to disk so I can commit them immediately (or fix whatever other problem they surfaced)

I've switched to running read-only pants tailor --check update-build-files --check lint check :: in my pre-push hook. If there's a problem, I then have to run the write version of whatever failed, but it'd be nifty I didn't have to.

njgrisafi commented 5 months ago

For my use of something similar, the desired workflow is:

  1. write some badly formatted code & forget to format
  2. git push...
  3. .git/hooks/pre-push hook runs and fails (i.e. block the push), but still writes the changed files to disk so I can commit them immediately (or fix whatever other problem they surfaced)

I've switched to running read-only pants tailor --check update-build-files --check lint check :: in my pre-push hook. If there's a problem, I then have to run the write version of whatever failed, but it'd be nifty I didn't have to.

Yeah this exactly what are facing too. We would like to apply the fixes automatically so the developers do not have to run an additional commit to perform the fmt / fixes. At this point the fixes / fmting should be staged so they can review / commit.

njgrisafi commented 5 months ago

lint should exit non-zero, if I understand correctly. Why would you need it to actually apply the fixes and formatting on files that immediately get thrown away when the CI container exits?

@benjyw This is for running git hooks, in CI we already do this workflow.

benjyw commented 5 months ago

Ah, gotcha! So this is for running on desktops.

OK, so that option makes sense. Could be on the fix goal's subsystem, so it would be --fix-as-errors or some such name.

krishnan-chandra commented 5 months ago

Happy to take this task on. For the two options on how to implement from the parent comment:

So, I think that suggests two options to me:

  1. Do manual/ad-hoc implementations of checking for changes within each goal's code (solving 1), and then adjust the run_goal_rules loop to not early-exit if this option is set (solving 2).
  2. Try to do this at a higher level: if a --exit-non-zero-on-changes option is specified, do compare Pants' tracked files at the start & end of the entire run: if there's changes, "upgrade" a zero exit code to a non-zero one. This is then entirely goal independent and works automatically for all goals ever (even custom ones). One way to implement this would be to capture a Snapshot of the whole buildroot at start up, and one at the end, and compare them.

I think I can definitely implement the first option, but not sure how the second would work. It certainly seems like the snapshot could be really expensive if the repo size is large, and the individual goals may have smaller target sets compared to an entire buildroot.