kynan / nbstripout

strip output from Jupyter and IPython notebooks
Other
1.19k stars 94 forks source link

Should be agnosting on trailing blank lines #160

Closed joaonc closed 2 years ago

joaonc commented 2 years ago

nbstripout is opinionated on not having a trailing blank line, so if the only difference is that there's a blank line at the end, nbstripout will remove it.

This causes conflicts on editors that have options to add/remove blank lines at the end of files (almost all nowadays) and causes the pre-commit action to always fail b/c my editor adds a line on auto-save and then nbstripout removes it.

nbstripout should either have an option to add / not add a blank line at the end of the file or detect what's there before running and leave the same way.

image

kynan commented 2 years ago

It's not nbstripout being opinionated here, it's nbformat, as that's what we're using to write the notebook to file. Scanning the nbformat implementation I can't find a flag that would append a newline. I'd happily accept a pull request if you'd like to work on this @joaonc ?

kynan commented 2 years ago

This is unfortunately also not as simple as it may sound: nbstripout tries it utmost to not modify anything about the notebook except the output. In particular it tries to preserve system-specific newline characters. Thus when appending an extra newline at the end you need to make sure to pick the "right one".

maciejb commented 2 years ago

This issue seems to pose an additional challenge with PyCharm when nbstripout is used as a pre-commit hook. The two fight back and forth adding and removing a blank line, making it impossible to do the commit within their GUI.

kynan commented 2 years ago

That is annoying :( @maciejb is PyCharm insisting on a trailing newline with no option to turn this off?

joaonc commented 2 years ago

PyCharm does have an option to turn on/off EOF cleanup (see image in the description of this issue). However, other tools such as black, some pre-commit hooks and other linters also add the new line at EOF and that's pretty standard.

Would be great if nbstribpout would do one of:

On another note, I would love to work on this, but alas don't have the time 😞. Will keep it in the back of my mind in case time comes up.

kynan commented 2 years ago

I understand the pain, however as explained this really is an nbformat issue, not an nbstripout issue and as such I have no intention to address this in nbstripout. Sorry!

maciejb commented 2 years ago

PyCharm is actually replacing the ending with NUL for *.ipynb no matter what the editor settings are. See this issue with JetBrains. Definitely their problem and not nbstripout or nbformat's; it equally causes unnecessary diffs between editing the file in Jupyter vs in PyCharm.

I don't think it's fair to expect nbformat to change anything either. Fundamentally, I think the issue goes all the way to how standard json library works, and that those newlines at EOF are gone the moment nbformat loads the json object. You'd have to make a very special effort to preserve anything outside of the json object contained in the ipynb file.

>>> import json
>>> x = json.loads('{"hello":"world"}\n\n\n\n\n')
>>> json.dumps(x)
'{"hello": "world"}'
kynan commented 2 years ago

Thanks for adding that perspective @maciejb, agree 💯

kynan commented 2 years ago

Based on @maciejb last comment I'll close this as wontfix. Please reopen if new information emerges.