diffplug / spotless

Keep your code spotless
Apache License 2.0
4.57k stars 458 forks source link

spotlessApply --staged #623

Open nedtwigg opened 4 years ago

nedtwigg commented 4 years ago

Spotless applies itself to the files on disk. However, when git makes a commit, it does not use the files on disk, it uses the staging area. If you do this:

change foo.txt
spotlessApply
git add foo.txt
git commit -m "blah"

Then your commit will definitely pass a spotlessCheck. But if you do this, it might not:

change foo.txt
git add foo.txt
spotlessApply         (changes foo on disk, but not the staged/index foo)
git commit -m "blah"  (you committed the dirty foo, not the clean foo)

aside: this three-files-per-file model, with an independent index, is confusing to beginners, and imo of limited utility even to experts, which is why DiffPlug doesn't have a staging area.

@lowwor made an excellent git hook script which uses the stash to run spotlessCheck on staged files.

With our new git integration, it would now be relatively easy for Spotless to have a mode which operates on the staging area directly. This is especially useful for pre-commit hooks, which could either fail on badly formatted content spotlessCheck --staged, or silently fix the commit right before it is created spotlessApply --staged.

Due to merge conflicts that this is likely to generate, implementing this is blocked on #603 and #600. PR's welcome!

nedtwigg commented 4 years ago

The relevant parts are a combination of GitRatchet.java and the IDE hook PR.

Because the total number of files staged is typically so few, there is potentially a meaningful speedup to be had from not configuring every project, and limiting yourself to only those projects which are parents of the files in question. So the right command is probably not spotlessApply --staged but :spotlessApply --staged, and then the root task would dynamically figure out which other spotlessApply tasks it might depend on...

tjni commented 4 years ago

This is quite interesting, and something I've been thinking about how to leverage properly.

Before I knew about the ratchet feature, I was building a Git hooks plugin to copy the functionality of husky and lint-staged for Gradle projects at work. It follows the same strategy as those projects, except it tries to expose the list of staged files to tasks and plugins by setting a Gradle property. You can see the idea expressed on the project page, and you can see some of the logic that handles dealing with the staging area in this file.

There are some caveats I've observed (some related only to how I've implemented it, and some that other implementations would need to deal with).

  1. I use git directly instead of JGit, and that means that Git is a dependency, and there is some minimum version required.
  2. There are almost certainly bugs, as proper handling of all conditions (e.g. what happens if Gradle tasks fail, what happens if merges fail, what happens if the daemon OOMs, what happens if there is low disk space, etc.) seems to be hard.
  3. This plugin is maintained right now only by me for work projects, which shouldn't inspire community confidence 😄 .

With that said, I'm wondering whether you think some of this logic can be extracted and reused in Spotless, or if Spotless should delegate this behavior to other solutions while maintaining an integration API (which might exist already).

nedtwigg commented 4 years ago

I left feedback on your captain-hook project in your issue tracker, looks like a great project!

I'm wondering whether you think some of this logic can be extracted and reused in Spotless, or if Spotless should delegate this behavior to other solutions while maintaining an integration AP

Yes and yes. We maintain a stable API, both at the lib level and also at the plugin-gradle level, so that people can reliably build on top of us. But we also try to pull all the common usecases into our feature set and stick unit tests on them. If people are cobbling two things together over and over, might as well mush them together and stick a unit test on it to make sure it keeps working.

I use git directly instead of JGit, and that means that Git is a dependency, and there is some minimum version required.

We do use JGit, but we also shell out to Git in some places, so that's not a showstopper.

The showstopper, in terms of getting merged, is about whether the docs are simple. spotlessApply and spotlessCheck are both really simple, very easy to explain:

user@machine repo % ./gradlew build
:spotlessJavaCheck FAILED
  The following files had format violations:
  src\main\java\com\diffplug\gradle\spotless\FormatExtension.java
    -\t\t····if·(targets.length·==·0)·{
    +\t\tif·(targets.length·==·0)·{
  Run './gradlew spotlessApply' to fix these violations.
user@machine repo % ./gradlew spotlessApply
:spotlessApply
BUILD SUCCESSFUL
user@machine repo % ./gradlew build
BUILD SUCCESSFUL

The problem with the staging area is that it makes things more complicated. I like spotlessApply --staged and spotlessCheck --staged because we can say "it's just like regular spotlessApply/Check, except it only applies to files in the staging area". But even then, there is a hard case which gets swept under the rug:

So normally, if you run spotlessApply, you would have if (foo) { bar } in your WC, and if foo then bar in your staging area. If you run spotlessApply --staged, you have the inverse. And it's a little strange to do git add, git commit, and have one of the files you just added be dirty in your WC. So by default, you probably want spotlessApply --staged to change both the staging area and the WC, so long as they are equal. But, if the WC and staging area are different, then you definitely want spotlessApply --staged to only apply to the staged copy, and leave the WC alone to make sure you don't lose any user data.

That paragraph above is pretty complicated. I think you could leave it out of the feature docs, because you're never losing user data, and it mostly just "does the right thing". But if you add even a little bit more complexity to the intended behavior, it gets really hard for users to understand what they're getting, and what the intended behavior even is.

I don't have plans to implement this feature anytime soon, so PR's are absolutely encouraged. But I would start with the docs first, and feel free to open a PR with just the documentation change. If you get buy-in on the docs, then you are guaranteed to get merged eventually. That goes both for implementing spotlessApply --staged, or for implementing whatever hooks you need (you should be able to do a lot with our public API already).

shyvum commented 2 years ago

@nedtwigg Is this supposed to be available soon? I'm looking for a solution to spotlessApply before every commit.

nedtwigg commented 2 years ago

@ShivamGoyal1899 PR's welcome, no work is scheduled. There's a workaround linked above, search "git hook script".

shyvum commented 2 years ago

@nedtwigg I actually tried creating a simple pre-commit hook in combination with ratchetFrom and it seems to work for me in cases where I really want to raise a PR on the ratchet branch. In other cases, it just trashes everything. Do you see any more drawbacks with this?

#!/bin/sh
set -e
./gradlew -PdisableSpotlessCheck spotlessApply
git update-index --again :/:
nedtwigg commented 2 years ago

I don't think you need -PdisableSpotlessCheck, but otherwise it looks good to me. It's true that if you're working on a branch that isn't related to the ratchetFrom then you'll get weird behavior, but I think that's the "correct" behavior.

nedtwigg commented 1 year ago

A good idea from git-code-format-maven-plugin is for the plugin to setup the git commit hooks itself. Worth looking into.

shyvum commented 1 year ago

We created the hook manually, as I suggested above, but it also seemed to run on every back merge we do from upstream branches, making the change so big that it's very hard to review.

Also, as this was on a production codebase with almost 10-15 PRs a day with 100+ files each, the number of conflicts we were getting was intolerable.

We ended up removing the hook, making the process back to normal till our entire codebase is formatted.

nedtwigg commented 1 year ago

I think ratchetFrom would improve the situation for you, but doing one "stop the world and format everything" commit isn't a terrible idea to keep the noise from the change in one place.

shyvum commented 1 year ago

Agreed with the format everything once idea. Right now, we are trying to pick chunks and reformat, once it reaches around half of the entire project, the plan is to find a sweet spot where we have least PRs and format remaining in one go.

bddckr commented 1 year ago

Just in case it helps anyone who wants to work on this issue, or has the same need of "all files in my staging area must be this way", here's my git pre-commit hook to do that. I think the only way I've seen this fail in six or so months of using it now is when you stage just a few lines in a file that Spotless then reformats in a way so the remaining unstaged changes can't be reapplied on top of the latest formatted state. I stage and commit single lines/range of lines/hunks all the time, and have only run into that once or twice, though.

#!/bin/bash

set -euo pipefail

pop_stash() {
  if [ "$previous_stash" != "$new_stash" ]; then
    echo "Popping stash and restoring unstaged changes..."
    git stash pop --quiet
    git restore --quiet --staged .
  fi
}

echo "Temporarily comitting staged changes..."
git commit --quiet --no-verify --message 'Staged changes'

echo "Temporarily stashing unstaged changes including new files..."
git add --all
previous_stash=$(git rev-parse -q --verify refs/stash)
git stash push --quiet --staged --message 'Unstaged changes'
new_stash=$(git rev-parse -q --verify refs/stash)

echo "Reverting temporary commit to bring back staged changes..."
git reset --soft HEAD^

echo "Running Gradle ':format' task on staged changes..."
if ! ./gradlew :format; then
  echo "Gradle execution failed"

  echo "Temporarily comitting staged changes..."
  git commit --quiet --no-verify --message 'Staged changes'

  pop_stash

  echo "Reverting temporary commit to bring back staged changes..."
  git reset --soft HEAD^

  exit 1
fi

echo "Staging formatting changes (if any) and temporarily comitting staged and formatted changes..."
git add --all
git commit --quiet --no-verify --message 'Staged and formatted changes'

pop_stash

echo "Reverting temporary commit to bring back staged and formatted changes..."
git reset --soft HEAD^
sparecycles commented 9 months ago

I'm trying out <rachetFrom>HEAD</> and a simple spotless:check pre-commit hook to ensure that only changed files get reformatted. Initial testing seems like it's a big improvement from our checkstyle setup.

Curious if anyone has any thoughts/experience with these rules?

AchrafBelkahla commented 1 month ago

Hello Trying to use spotless for a gradle project we have. We want to integrate it in the build The issue is that we have several target branches, and going with the ratchetFrom alternative will force us to change each time the build file :/ Tried using the target param by passing the 'git diff --name-only --diff-filter=ACM'.execute().text.readLines()..findAll { it.endsWith('.java') } But unfortunatly this cannot be done at execution time of the gradle task Spotless always takes the target value defined when configuring the plugin

Do you see a way of using spotless within the build and apply it to only changed files ? (Without using a git hook )