sds / overcommit

A fully configurable and extendable Git hook manager
MIT License
3.91k stars 280 forks source link

Changes lost which are made attempting a commit #135

Closed pratik60 closed 9 years ago

pratik60 commented 9 years ago

Let me explain this with an example

I try to commit two files. git commit -am "test"

Now the overcommit precommit hooks come into place, and it keeps starts going through the various checks I've enabled. It reports an error in the initial check itself, and I promptly go to the file to edit the change while I keep the pre-commit hook running so that i may see the complete list of changes at the end.

But after the pre-commit hook is over, all the changes made during this time are completely lost as it does some kind of reset.

Any way I can fix this? Why is overcommit trying to be smart and restoring file, in case the checks fail....?

Any quick fix?

pratik60 commented 9 years ago

Can almost certainly say the problem is somewhere here (lib/overcommit/hook_context/pre_commit.rb) , can you please tell me how to disable this? I'm assuming it isn't intentional but I really don't understand why overcommit is trying to do a reset hard on my code. :-(

Please reply soon.

sds commented 9 years ago

Hey @pratik60,

This is intended behavior, but I understand why it seems confusing, so let me explain.

Overcommit's pre-commit hook needs to stash your unstaged changes so they don't affect the outcome of the hook run (since the unstaged changes aren't going into the commit, they shouldn't be checked by the hooks).

Once the checks have finished running, git requires us to git reset --hard in order to restore your repo to the state it was before the hook ran. This is unavoidable because you can't apply a stash to a dirty working tree.

You may ask why we don't do this in a separate directory with temporary files, and the reason is that it would mess up all the paths. Tools like rubocop use configuration files with paths that would break if we started doing all of this in temporary directories and files, so we stopped doing that in order to make things work.

Hope that explains things. In a nutshell, this is not a bug but Overcommit working as intended. I recommend you not make changes to your files while commit hooks are running.

pratik60 commented 9 years ago

@sds - We are still trying to wrap our heads around this.

"Once the checks have finished running, git requires us to git reset --hard in order to restore your repo to the state it was before the hook ran. This is unavoidable because you can't apply a stash to a dirty working tree."

If the precommit hook has failed, why are you required to restore my repo to the state before hook? Why can't you just let it be?

Anyways our rpsec, and jasmine suite runs as a part of precommit hook, but rubocop runs before it, and throws up error, and keep rspec and jasmine running in background. I naturally want to go and fix it, but then all my changes are lost as you do a git reset --hard after 5 minutes.

I cannot believe this is the long term solution :-(

I want a great developer experience, i love how easy it is to integrate stuff, the features it provides etc. But I can't go and tell all devs, to wait for 5 minutes because any changes made now will be lost.

jawshooah commented 9 years ago

If the precommit hook has failed, why are you required to restore my repo to the state before hook? Why can't you just let it be?

All unstaged changes are stashed before the hooks run. To restore those changes if a hook fails, we must first run git reset --hard because, as @sds explained, we can't apply the stash to a dirty working tree, which is exactly what you are creating when you change files while the hooks are running.

If we didn't git reset --hard after the hooks run, then we couldn't apply the stash containing your previous unstaged changes, and there would be a ton of bug reports like #129 claiming that file changes were lost.

jawshooah commented 9 years ago

Besides, and this is just my opinion, you shouldn't be running your entire test suite on each commit. Each push, maybe, but each commit is definitely overkill.

Git encourages us to keep our commits as small and atomic as possible, which means that commits will probably be made fairly often, while pushes should happen only when you are satisfied that your local copy works as intended.

For example, say you make a change to an existing feature that intentionally breaks previously expected behavior. You would probably have separate commits for the change itself and for modifying the tests accordingly. You would only expect the tests to pass after both commits.

To me, this is a strong argument for adding pre-push hook support. I'll get on it when I have a chance.

sds commented 9 years ago

Hey @pratik60,

As @jawshooah has alluded to, Overcommit isn't intended to be used for running entire test suites before committing. You can use overcommit --run as part of a CI run for that.

Git commit hooks (especially pre-commit hooks) should be fast. They are intended to give you quick feedback before sending your commit through a heavier duty series of tests and code review (which take much longer).

@jawshooah brings up an interesting suggesting of using pre-push to run actual integration tests. I don't feel strongly about this suggestion but I don't think hooks are the right place to run your CI suite, as you technically can't guarantee your developers have the hooks installed.

Circling back to your original complaint about Overcommit stashing/resetting/unstashing changes: this is absolutely necessary so that when Overcommit runs hooks against your files it is only reporting issues for changes that are about to be committed.

If you were to make changes to files during a hook run and Overcommit didn't attempt to isolate, you would potentially see errors/warnings introduced because you made changes during the hook run, even though those changes aren't actually about to be committed. It's essentially a concurrency problem where multiple threads (Overcommit and yourself) are trying to work with a shared resource (the files). To guarantee its usefulness, Overcommit needs full control of the resource during the hook run.

Hope that explains things, and I hope it's clear why you should not be running your test suite as part of a pre-commit hook. Leave that to your CI runs.

jawshooah commented 9 years ago

@sds Definitely agree that running test suites should primarily be the responsibility of your CI environment. But pushing broken code and waiting for the CI server to complain about it introduces (what I consider to be) unnecessary overhead.

A pre-push hook would help to ensure that you never push broken code in the first place, keeping your CI build history clean and your fellow devs happy :smile:

sds commented 9 years ago

Being able to push broken code is a process problem. You could always get around it by disabling Overcommit, so how useful is it really?

Better to instrument a process where all commits are automatically deployed once they pass tests, so you can remove the human element entirely. However, that's putting a lot of faith in your tests.

If you want to add a pre-push hook, there's no stopping you. :)

pratik60 commented 9 years ago

Very interesting views guys, and I tend to agree with both of you.

The point taken that rspec and jasmine are too long for pre-commit. I also don't want it to be the developer's onus that he runs rspec and jasmine manually himself, or we wait for it before the CI server complains. A pre-push really makes sense that way.

So after git 1.8.2 we do have a git pre push hook. https://github.com/git/git/blob/87c86dd14abe8db7d00b0df5661ef8cf147a72a3/templates/hooks--pre-push.sample

How may we go implementing that, as I don't think overcommit supports it already? Right?

Btw thank you to both of you guys for weighing in on the issue. Your views are helpful and make me understand the problem better.

jawshooah commented 9 years ago

@sds I envision the use of pre-push hooks (and really pre-commit hooks as well) as being for peace of mind rather than enforcement of policy since, as you say, it's easy to just skip the hooks entirely. Some lazy dev could even uninstall Overcommit and bypass the entire process. There's no way to ensure that everyone is using the hooks properly, so really this entire endeavor operates on the honor system.

If you want to add a pre-push hook, there's no stopping you. :)

Count on it!

@pratik60 Until (unless?) Overcommit implements pre-push support, you can just roll your own. Overcommit won't overwrite any hooks that don't conflict with the ones it supports.

pirj commented 7 years ago

you can't apply a stash to a dirty working tree.

Is this still the case? Just tried this:

  cd stash
  git init
  touch one
  git add one
  git commit -m 'one'
  echo two > two
  git add two
  git commit -m 'two'
  echo two2 > two
  git stash
  echo one1 > one
  echo This stash passed
  git stash pop || exit 1
  cat two
  git stash
  echo two3 > two
  git add two
  echo This stash with fail
  git stash pop || exit 1
  echo Unreachable

You cannot pop a stash when working tree contains dirty files that are also in a stash, but this isn't the case, right?

deanrad commented 5 years ago

I know this is an old issue, but it bears re-opening. There's no need to change staged files in the working tree to run tools like linters on only them - just get the list of files that are staged via:

git diff --cached --name-only and pass those down to tools. Or conversely get the list of unstaged files and set them up as an environment variable that scripts can use in exclusion lists. Rewriting the working tree through stash/apply/pop is what a developer should do themselves - it's not something a tool should do for them.

Seeing files flicker in ones editor (at least VSCode is sensitive enough to notice the fs changes, and flicker) is unsettling to say the least. And suffice it to say I came here because i had a fear that after canceling one of my commit hook's runs, I thought overcommit lost my hours of work.

sds commented 5 years ago

A file can have changes that are not in the index, so git diff --cached --name-only will not work in cases where you have applied a partial hunk (see the --patch flag of git add).

The purpose of Overcommit pre-commit hooks is to run tools against the changes that will be committed, not the current contents of the file. Hope that explains the behavior!

deanrad commented 5 years ago

@sds - Thanks - that's a good explanation, but I and others lose work over this feature occasionally (its' spawned a whole set of superstitions around not aborting hooks EVER), and it's the riskiest aspect of this tool IMHO. It's infuriating and stress-inducing when editors flicker due to files being erased and recreated, when you're simply trying to get a yes or no question IS IT GOOD?. Sometimes editor histories and stash pops aren't enough to restore what gets lost, sadly.

Perhaps a simple patch would help: if all changes are staged, overcommit should not do any edit operations on your files.. I'd happily pay the price of explicitly stashing or reverting if it meant this tool would do a read-only operation the 99% of times I am committing every change in my working tree.

sds commented 5 years ago

Hey @deanius, that's a great suggestion!

Supportive of a PR (with tests!) that doesn't touch files/stash if all possible changes/hunks have been staged (i.e. git diff --quiet exits with a zero status code).