kynan / nbstripout

strip output from Jupyter and IPython notebooks
Other
1.2k stars 96 forks source link

nbstripout and git pull fails #108

Open KrisThielemans opened 4 years ago

KrisThielemans commented 4 years ago

Despite nbstripout we still have failures with updating. Scenario: I run a notebook (without changing the actual content), and then want to pull in changes from github, but I cannot.

sirfuser@sirf-gpu-0:~/devel/SIRF-Exercises$ git status
On branch master
Your branch is behind 'origin/master' by 4 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   notebooks/PET/MAPEM.ipynb

no changes added to commit (use "git add" and/or "git commit -a")
sirfuser@sirf-gpu-0:~/devel/SIRF-Exercises$ git diff
sirfuser@sirf-gpu-0:~/devel/SIRF-Exercises$ git pull
Updating 7e101a9..8fedddf
error: Your local changes to the following files would be overwritten by merge:
        notebooks/PET/MAPEM.ipynb
Please commit your changes or stash them before you merge.
Aborting

In this scenario git stash before the git pull sometimes works, although a git stash apply can either say there's nothing to change, or generate conflicts. I have the impression it actually gets rid of any output in the notebook, which is somewhat undesirable but seems unavoidable.

This is somewhat related to https://github.com/kynan/nbstripout/issues/65

casperdcl commented 4 years ago

For very complicated cases I often tend to

nbstripout --uninstall
git stash
git pull
git stash pop
<fix conflicts>
nbstripout --install
KrisThielemans commented 4 years ago

presumably run nbstripout on the file first?

casperdcl commented 4 years ago

no that would mean losing your local output. git stash will save that.

kynan commented 4 years ago

nbstripout can't help with the merge and potential conflicts unfortunately.

One thing we could consider is adding a dedicated nbstripout command (--pull / --update ?) to automate the workflow that @casperdcl describes. Would that be helpful? One of you interested to send a PR for this?

I've had a look if we could automate this using Git hooks, however there's nothing like a "pre pull" hook. There are "post checkout" and "post merge hooks" though. We could potentially use those to ensure nbstripout is re-enabled after this operation.

/cc @stas00

kynan commented 4 years ago

I also found some 3rd party documentation on merging notebooks when using nbstripout. I can't vouch for their correctness or usefulness, but let me know if they help...

kynan commented 4 years ago

@KrisThielemans @casperdcl any more thoughts on this?

casperdcl commented 4 years ago

I'm happy following my proposed workflow as it seems transparent & simple to me.

All of the steps between nbstripout --uninstall and --install are either git or manual conflict-fixing operations, which implies there's nothing more that nbstripout can do, really.

I can't think of a clean way for nbstripout itself to do more or automate things in such a way as to be more friendly to casual users.

KrisThielemans commented 4 years ago

Sadly I have no time to investigate this further. Sorry.

From what I recall, there actually are no conflicts (after applying the filter) in the use case given above. Maybe the behaviour is caused by conflicts before the filter, I don't know.

Apologies that I can't be more helpful.

kynan commented 4 years ago

Thanks both. I'll leave this on the backlog, in case someone comes along and wants to pick this up.

fabian-rudolf commented 2 years ago

For future reference: If the remote branch diverged from your local branch because someone pushed output cells without having nbstripout installed (e.g. a collaborator in a team doesn't have nbstripout installed and pushes), stashing or resetting local changes on a client with nbstripout won't be sufficient, because it is immediately overridden. You have to go all the way currently and run nbstripout --uninstall, git pull (include output cells locally) and nbstripout --install and can then push the stripped version again as mentioned by @casperdcl.

kynan commented 2 years ago

Thanks. As I've argued before I feel supporting this is a bit out of the league of nbstripout. It's a git workflow question and I can't see a why to implement this both cleanly and robustly. I want to avoid adding more brittle and impossible to test code to nbstripout - sorry.

casperdcl commented 2 years ago

probably worth documenting in the readme & closing this issue though

kynan commented 2 years ago

I've mentioned this as a known issue. I'll leave this issue open to (hopefully) prevent too many duplicates being opened.

a18 commented 1 week ago

For very complicated cases I often tend to

nbstripout --uninstall
git stash
git pull
git stash pop
<fix conflicts>
nbstripout --install

Just a small note: if, for some reason, you have nbstripout filters in your global or system git configs, they should be cleared for this workaround to work. One of the possible checks: after "nbstripout --uninstall" run "git config -l | grep strip", the output should be empty.