Open jykim opened 7 years ago
What do you get as git diff
when you hit the case of allegedly modified notebook files?
I generally see nothing when I run get diff on those files, but I haven't tested them all. (can't share them since it's my work notebooks anyways) As I mentioned, It's especially interesting that these files are gone when I uninstall nbstripout and install it again, which I weird.
Today I found that these changes show up after 'git pull', and I looked at a few other diffs and found that they are diffs between before) notebooks with output after) notebook with output stripped. So maybe the git maybe telling me to strip the output out from some notebooks which contains outputs from previous commits.
Does this make sense? But I still don't understand why these notebooks show up after 'git pull' and why they're gone after uninstall/reinstall.
So what exactly does git status
report in this situation? Are those notebooks modified? Or is it a mode change?
Also could you run a nbstripout --status
before and after the reinstall and report if there's a difference?
@jykim Any update?
This might be related to an issue I posted about here. I still haven't figured it out, so any help is welcome.
I have the same problem: when nbstripout is enabled and I checkout several files, the omitting of changes in these files is not reflected by git status (git status shows their presence, the changes are actually omitted). If I do 'nbstripout --uninstall' for the repo and then do 'nbstripout --install' again then the changes are reflected by git status.
@melaanya, try to change the configuration of the content filters as explained here and it'll help you debugging the issue. You will be able to tell when nbstripout is activated.
@stas00 I've looked at your issue and am about as puzzled as you are. Thanks for documenting it so thoroughly. Guess we'll have to find a better git plumber than I am to explain this behavior.
Currently, pretty much any time one of our developers forgets to enable the filter and commits w/o stripping the notebooks out, it breaks git pull
for anybody who already had a checked out version. And other than doing a new clone the only solution seems to be is to disable the filter, git pull
and reenable the filter.
What we really need is a way to enforce content filters repo-wide, except git doesn't support that due to security reasons (overriding user's config is unsafe). I'm not sure what to do about it. It's a constant problem for us. Perhaps there is a need to re-design content filters so that they aren't configured via a private .gitconfig and can be automatically installed for everybody (note that our repo carries its own version of the nbstripout filter, so the culprit is just not being able to enforce the configuration). Unless perhaps I'm missing something and you have some insights of another way of configuring it, so that the user doesn't need to do anything.
And of course, there are server-side hooks which could do the policing, but it doesn't sound like the best solution. In particular since last I looked github doesn't support those.
Yeah, to me this sounds like a Git issue rather than an nbstripout issue and I'm not sure what (if anything) we could fix in nbstripout to help with this. Have you considered raising the issue on the Git mailing list?
I already did so to no avail.
If it weren't for github and we were running our own git backend then server-side hooks could do the policing, but since github doesn't support that, there is nothing one can do to enforce that.
The best "enforcing" solution I found is to run a CI job that flags this problem as it appears, and hopefully the developers watch for [x]'s in their commit status on github.
echo "Check we are starting with clean git checkout"
if [ -n "$(git status -uno -s)" ]; then echo "git status is not clean"; false; fi
echo "Trying to strip out notebooks"
tools/fastai-nbstripout -d dev_nb/*ipynb dev_nb/*/*ipynb docs_src/*ipynb docs_src/*/*ipynb
echo "Check that strip out was unnecessary"
git status -s # display the status to see which nbs need cleaning up
if [ -n "$(git status -uno -s)" ]; then echo -e "!!! Detected unstripped out notebooks\n!!!Remember to run tools/run-after-git-clone"; false; fi
and the best resolution I found is:
disable the filters
git pull
fix the problems
enable the filters
git commit -a
git push
which in the fastai project is done with a script (-d
is for disable
):
tools/trust-origin-git-config -d
git pull
tools/trust-origin-git-config
FYI, we're discussing adding a dedicated nbstripout
command to help with this flow in #108. Could be worth taking inspiration from the fastai scripts.
I see the same problem, stopped using the tool after 2 days.
@DanielRudnicki Reproduction steps for the failures you're seeing would help.
I think I'm seeing the same issue and may have found a fix. I've been getting (erroneous) complaints for ages about modified ipynb files in (just cloned or pulled) git repositories, and finally decided to track it down. I found that the error occurs when file ~/.config/git/attributes contains the lines
*.ipynb filter=nbstripout
*.zpln filter=nbstripout
*.ipynb diff=ipynb
If I comment out all these lines, the error goes away.
Note that ~/.gitconfig contains the lines
[filter "nbstripout"]
clean = \"/opt/anaconda3/bin/python3\" -m nbstripout
smudge = cat
[diff "ipynb"]
textconv = \"/opt/anaconda3/bin/python3\" -m nbstripout -t
So nbstripout is still installed and runs fine.
Also note that I've encountered the same issue (and the same fix works) on both MacOS and Ubuntu machines.
@rhstanton Are you sure nbstripout
is still active when you comment out the attributes? I'd find that very surprising. How would git know to apply the nbstripout filter to *.ipynb
files?
No, I'm no longer sure! I just tried a small test where I reran some code in a notebook file, and without the attributes, git diff reports that the file has changed, while with the attributes, git diff reports no changes. I guess my "fix" was just to turn off nbstripout - not as interesting as I'd hoped...
OK, thanks for confirming! If you have any further insights on this issue please send them along - this one is still a mystery!
I seem to see changes for notebooks I haven't touched since I installed nbstripout. Interestingly, when I uninstall nbstripout and install it again, the changes are gone. Besides general performance issue, this prevents us from adopting nbstripout as a team.
----(after uninstall nbstripout and install it again)---