Closed Skylion007 closed 4 years ago
Hi @Skylion007 , that is an interesting report! Do we agree that already, running black
on the first input gives you the second?
I see that issues about comment indentation have been reported (and fixed) at https://github.com/psf/black/issues/16 and https://github.com/psf/black/issues/262, but I think here we are in a different case. For instance the cell-marker comment # %%
between print(1)
and print(2)
does not match neither the previous nor the next code paragraph...
Also, I have another question for you: with that code in the Python script, what do you expect to get in Jupyter? AFAIK you cannot break down the if
instruction into multiple code cells and still get it runnable, right?
@mwouts You actually can break down an if statement instruction into multiple code cells and still have it be runnable! Try it, it works in Google Collab at the very least. It's also runnable directly in VSCode.
I've edited the minimal example to more closely mirror my actual code. In the updated code, the first cell should print "1" and the 2nd cell should print "2".
Here is a minimal collab @mwouts
@mwouts I have a workaround for it now, but it's not very elegant.
Also, another issue to raise is that's impossible to escape spaces within a command in --pipe which created a lot of trouble writing the proper sed expression and having it run:
jupytext --pipe-fmt py:percent --pipe black --pipe "sed s/[[:space:]]*\#[[:space:]]\%\%/\#\%\%/g"
Hi @Skylion007 , thanks for the details, I didn't know about these partial cells, that's interesting!
Do you think that allowing indented # %%
markers could help? We could try to indent the cell markers to match the indentation of the first non-empty line in the code cell... Do you think it would solve this issue?
It would help absolutely, but I am not sure if it's part of the 'standard' per say so it should probably be behind a flag/option maybe?
After all, the py:percent format isn't JuPyText specific, not sure if Spyder or other IDEs respect the indented comments. Also, this could be an issue with round trip conversions with other formats, no? What about indented doc strings as text markers etc...
Also, doing it creates some ambiguity of what the 'right' indentation should be. Like if I move the comment further right along with the code block, does all the code respect it's indentation or not? That could be another way you could 'guard' code in a main method and have it be the same between a notebook and a script.
It would help absolutely, but I am not sure if it's part of the 'standard' per say so it should probably be behind a flag/option maybe?
Well, my assumption being that most code cells start with non-indented code, this would concern very few notebooks/cells, so maybe making this optional is not really required. But sure, if you think otherwise, or if we find any occurrence in representative repos like e.g. Python for Data Science, we can make this optional.
the py:percent format isn't JuPyText specific, not sure if Spyder or other IDEs respect the indented comments
I agree. Still, maybe that is not a blocker? In the worst case, having an indented cell marker for the second cell in your example will just mean that some IDE will run the two cells as just one. Maybe the reasons for doing this, which are 1) with the indented cell marker, the jupytext file looks more Pythonic, and 2) it works better with black
, outweight the inconvenience of bluring the cells for the IDE?
what the 'right' indentation should be
Here we are only discussing the indentation for the # %%
marker, right? The content of the code cell should always match that of the Python script, shouldn'it?
I agree. Still, maybe that is not a blocker? In the worst case, having an indented cell marker for the second cell in your example will just mean that some IDE will run the two cells as just one. Maybe the reasons for doing this, which are 1) with the indented cell marker, the jupytext file looks more Pythonic, and 2) it works better with
black
, outweight the inconvenience of bluring the cells for the IDE?
That's a fair argument, actually it probably should be the default. I just tested it and VSCode respects indented comments at the very least so that make me feel better it. The concern is with point 2 though as all my break points are under my if statement so blurring all the cells together make them function as one giant cell.
As for the 'right' indentation, I think you are right as that is a pretty odd an edge case if that's not the case. I can't think of any practical use case where you would want to change the indentation of a cell unless you want to do it for the entire script.
TLDR: On second thought I agree, make it a default option and that should hopefully solve this issue for the Py:Percent format at least.
I found no indented code cell in https://github.com/jakevdp/PythonDataScienceHandbook:
git clone https://github.com/jakevdp/PythonDataScienceHandbook.git
cd PythonDataScienceHandbook
import os
import nbformat
for nb_file in sorted(os.listdir('notebooks')):
if not nb_file.endswith('.ipynb'):
continue
nb = nbformat.read(os.path.join('notebooks', nb_file), as_version=4)
for cell in nb.cells:
if cell.cell_type!="code":
continue
for line in cell.source.splitlines():
# is the first non-empty line indented?
if line.strip():
if line.startswith(' '):
print(cell.source)
break
so changing the indent of the # %%
marker for these indented cells looks reasonable indeed... I'll propose a PR for that soon.
Found another interesting edge case to keep in mind. The following cell works when transferred to colab.
# %%
if __name__ == '__main__':
print(1)
# %%
# INDENTED COMMENT
print(2)
but the following does not:
# %%
if __name__ == '__main__':
print(1)
# %%
# INDENTED COMMENT
print(2)
The extra newline at the beginning of the cell seems to reset the indentation of the entire cell. Just something to keep in mind.
JuPyTer and VSCode however don't seem to care and both cells work fine in both of them. Colab gives an indentation error though.
Hi @Skylion007 , would you mind giving it a try to the branch indented_markers_percent? I think you can simply do
pip install git+https://github.com/mwouts/jupytext.git@indented_markers_percent
@mwouts Yeah, that seems to fix it on my end. At least presuming it rounds trips with --test locally and the outputted notebook appears to be identical.
Here is the example colab btw: https://github.com/facebookresearch/habitat-sim/blob/master/examples/tutorials/colabs/rigid_object_tutorial.ipynb
Also, another issue to raise is that's impossible to escape spaces within a command in --pipe which created a lot of trouble writing the proper sed expression and having it run
Sorry, it took me weeks to reproduce this one... the good news is that I just hit the same issue when working on #553. It should be fixed on the branch pipe_isort
, which I plan to merge soon.
I am having an issue where py:percent notebooks that have cells divided within an if statement get destroyed by black when using --pipe black
gets reformatted into the following, which is not equivalent
For example, when running the following.
jupytext --to notebook test.py jupytext --to py:percent --pipe black test.ipynb
Will cause all cells that contain only indented code to be merged into one giant cell. Are they any formats / workarounds to ensure black does not destroy the scripts like so? Or are there any relevant settings that jupytext can set that will resolve this issue?