azavea / noaa-hydro-data

NOAA Phase 2 Hydrological Data Processing

Fix problem with exporting cells with %%time to script format #88

Closed lewfish closed 2 years ago

lewfish commented 2 years ago

To make it easier to review notebooks, we export them to Python scripts by running nbautoexport export . in /opt/src/notebooks. This produces a parallel directory structure containing the script version of each notebook. For example, compare the notebooks version and the script version. The export breaks down when a cell contains the %%time magic command. For example, see how this line is not exported in a readable format, while the other lines in the file are. We should try to fix this so that those cells are exported readably. Perhaps nbautoexport has a setting to deal with this.

vlulla commented 2 years ago

There doesn't seem to be any config setting that makes cell magics export correctly. I skimmed the nbautoexport repo and didn't find any special handling for cell magics. So I think it would be best to raise this as an issue on the nbautoexport GitHub page. Is it alright to quote/cite our notebooks, and the generated Python scripts, in the issue that I will create on that project? Or do you want me to create a small reproducible example?

lewfish commented 2 years ago

I would report it as an issue. It would be easy to make a small example like the following, and it'll make the issue cleaner.

%%time
x = 1

vlulla commented 2 years ago

Filed...issue 101!

vlulla commented 2 years ago

As I mentioned in the comments on that issue, we ought to refrain from fixing this ourselves because it is more involved than it appears. The complication is that the cell containing the cell magic has actual Python code which needs to be run in order to be timed. Jupyter/IPython accomplish this by passing the cell code as a Python string which is then evaluated. The author/maintainer of the project pointed to a potential workaround on the IPython issues page which we could use. While it is a neat workaround, it makes the notebook and the exported Python script incongruent. Specifically, the cell will run, with timing, in the notebook version (i.e., when the notebook is run interactively), but it will not run in the exported Python script, where the timed cell code becomes commented-out lines. That is why I believe we ought to let them fix this issue. I have mentioned this observation in the issue comments.
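To make the "cell code as a Python string" point concrete, here is a rough sketch of the transformation. The export_cell function is a hypothetical helper written only for illustration; it is not the nbautoexport or IPython implementation, but as far as I understand the exported line has roughly this shape.

```python
def export_cell(source: str) -> str:
    """Mimic, in simplified form, how a notebook cell is rendered in a script.

    Hypothetical helper for illustration only; not the real exporter code.
    """
    lines = source.splitlines(keepends=True)
    if not lines or not lines[0].startswith("%%"):
        return source  # ordinary cell: exported unchanged

    # The first line is "%%name args"; everything after it is the cell body.
    name, _, args = lines[0][2:].strip().partition(" ")
    body = "".join(lines[1:])

    # The whole body is packed into one string literal, so a long cell
    # turns into a single, very long line in the exported script.
    return f"get_ipython().run_cell_magic({name!r}, {args!r}, {body!r})"


print(export_cell("%%time\nx = 1\n"))
# prints: get_ipython().run_cell_magic('time', '', 'x = 1\n')
```

With a one-line body this is still readable, but every extra line in the cell is appended to the same string literal, which is what makes the exported benchmarking cells hard to review.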

For now, I like @lewfish's idea of just not including any cell magics in our notebooks. Including a cell magic is a non-issue if the cell's code snippet is small. But in our benchmarking case there was no way around having a sizable chunk of code in the cell, which yields a very long line of Python code in the exported script, so it makes sense not to have cell magics in these notebooks.
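If we still want timings in those notebooks without a cell magic, one option is a small context manager built on the standard library. This is only a sketch under my own assumptions: the name timed and its label parameter are made up for illustration, and it reports wall time only, whereas %%time also reports CPU time.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str = "block"):
    """Print the wall-clock time of the enclosed block (hypothetical helper)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so a failed run is still timed.
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.3f} s wall time")


# Plain Python, so it exports to the script version line for line.
with timed("benchmark"):
    total = sum(i * i for i in range(100_000))
```

Because this is ordinary Python, the notebook and the exported script stay congruent: the timed code runs, and is readable, in both.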