kynan / nbstripout

strip output from Jupyter and IPython notebooks
Other
1.19k stars 95 forks source link

Pre-commit hook fails when a notebook is only partially staged #121

Closed nix1 closed 1 year ago

nix1 commented 4 years ago

Hi everyone,

The pre-commit nbstripout hook fails for me when a notebook if only partially staged. I often don't add all the changes in notebooks to avoid commiting mess in the metadata section. I can't pinpoint the exact issue, because there is no actual error message (just a Failed), so perhaps someone could at least point me to what should I look into?

How to reproduce:

  1. Configure nbstripout with pre-commit (pre-commit.com)
  2. Make changes in a notebook. The changes would include modified cell content and modified metadata (eg. switch a kernel)
  3. Use git add -p to only add cell content, and not changed metadata.
  4. Try to commit, git commit

The result of the above usually - but not always - is like:


nbstripout...............................................................Failed
- hook id: nbstripout
- files were modified by this hook
[INFO] Restored changes from /home/paul/.cache/pre-commit/patch1585071858.
nix1 commented 4 years ago

Essentially, it seems than nbstripout modifies files while pre-commit doesn't expect it to do it, see:

https://github.com/pre-commit/pre-commit/blob/34e9d117861ba758b3fd2d0b33eb70de9f05238d/pre_commit/commands/run.py#L197

And I ran into the same issue without partial staging of files, so these are not necessarily related.

kynan commented 4 years ago

Yes, that seems to be the intended behaviour of pre-commit: the idea is that the hook will fail if it would be changing the content since that means that the notebook hasn't been stripped. pre-commit can only be used for hooks that perform checks, not for hooks that automatically apply fixes.

I'm not using pre-commit myself, so I can't tell if there's a way to configure it to have the behaviour you want.

kynan commented 4 years ago

@nix1 any update on this? Feel free to close if you consider this resolved.

VikashKothary commented 4 years ago

Hi, I've got this same issue. With other similar pre-commit hooks, they fail once because it modified the files once. But the second time, it should pass because no changes should be needed. This does not seem to be the case.

I tried using the --dry-run flag instead to check for changes without changing the files, however this too flags all files as needing changes every time regardless of whether they have outputs.

kynan commented 4 years ago

I think you'd need to stage all the changes the hook made before running it again. Otherwise it'll fail again in the same way.

kynan commented 3 years ago

Any update on this @nix1 @VikashKothary ? I've tried reproducing this behavior with a simple test:

  1. commit stripped notebook with 1 code cell
  2. add another code cell, run both cells
  3. use git add -p to only stage the new cell 2 (with output); output of cell 1 remains unstaged
  4. try to commit, which fails (expected):
    [WARNING] Unstaged files detected.
    [INFO] Stashing unstaged files to /home/frg/.cache/pre-commit/patch1618181311.
    nbstripout...............................................................Failed
    - hook id: nbstripout
    - files were modified by this hook
    [INFO] Restored changes from /home/frg/.cache/pre-commit/patch1618181311.
  5. use git add -p to stage stripping the output for cell 2; output of cell 1 remains unstaged
  6. try to commit, which works:
    [WARNING] Unstaged files detected.
    [INFO] Stashing unstaged files to /home/frg/.cache/pre-commit/patch1618181413.
    nbstripout...............................................................Passed
    [INFO] Restored changes from /home/frg/.cache/pre-commit/patch1618181413.
    [main 1c40ed0] foo
    1 file changed, 3 insertions(+), 1 deletion(-)
  7. git diff still shows the output for cell 1, as expected

To me this is exactly the behavior I'd expect. Do you agree?

Note that I've tested using pre-commit 2.12.0 and I'm not sure if the auto stashing of unstaged changes is a new feature.

kynan commented 1 year ago

@nix1 @VikashKothary is this still an issue for you?

VikashKothary commented 1 year ago

Hi @kynan, I don't remember which project I was using when I experienced this issue. So I'm happy for this to be closed. And if I run into this issue again, I'll re-open it with more specific steps regarding how to reproduce this issue.

Does that sound good?

kynan commented 1 year ago

Sounds great, thanks!