kynan / nbstripout

strip output from Jupyter and IPython notebooks

Support only stripping outputs larger than a given size #135

Closed cjblocker closed 3 years ago

cjblocker commented 4 years ago

I need to be able to strip embedded images and videos from a notebook while keeping the smaller text output (similar to #58). This PR adds a --max-size option that removes any output larger than the given number of bytes.

$ nbstripout --max-size 5K test1.ipynb
$ nbstripout --max-size 100 test2.ipynb
$ nbstripout --max-size 1M test3.ipynb

This makes it easy to separate large binary outputs from small textual ones.
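
For readers wondering how size arguments such as 5K, 100, and 1M might map to byte counts, here is a minimal sketch. The function name parse_size and the decimal multipliers (K = 1000, M = 1000000) are assumptions made for illustration; the code in this PR may name or scale things differently.

```python
def parse_size(text):
    """Convert a size string such as '100', '5K' or '1M' to a number of bytes.

    Hypothetical helper: the suffixes use decimal multipliers, which is an
    assumption and not confirmed by the PR description.
    """
    multipliers = {"K": 10**3, "M": 10**6, "G": 10**9}
    text = text.strip().upper()
    if not text:
        raise ValueError("empty size string")
    if text[-1] in multipliers:
        # Suffix present: scale the numeric part by the matching multiplier.
        return int(text[:-1]) * multipliers[text[-1]]
    # No suffix: the value is already a plain byte count.
    return int(text)
```

A bare number is read as bytes, so the three commands above would correspond to thresholds of 5000, 100, and 1000000 bytes under these assumptions.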

I did not want to just filter on keys (an alternative) because a matplotlib animation video may be embedded in a text/html key, but I don't want to remove every text/html key. I'm also not sure if embedded images are always image/png in all kernels and plotting backends.
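
To make the difference concrete, here is a rough sketch of size-based stripping applied to a notebook loaded as a plain dict. The helper names output_size and strip_large_outputs are hypothetical, and measuring an output by the length of its JSON serialization is an assumption; the PR may measure size differently. The point is only that two text/html outputs can be told apart by size, which a key filter cannot do.

```python
import json


def output_size(output):
    """Approximate an output's size as the byte length of its JSON form (assumed metric)."""
    return len(json.dumps(output).encode("utf-8"))


def strip_large_outputs(notebook, max_size):
    """Remove outputs larger than max_size bytes from every code cell, in place."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells have no outputs
        cell["outputs"] = [
            out for out in cell.get("outputs", []) if output_size(out) <= max_size
        ]
    return notebook


# Two outputs share the text/html key: a small snippet and a large embedded video.
small = {"output_type": "display_data", "metadata": {},
         "data": {"text/html": "<b>ok</b>"}}
large = {"output_type": "display_data", "metadata": {},
         "data": {"text/html": "<video>" + "A" * 5000 + "</video>"}}
notebook = {"cells": [{"cell_type": "code", "source": "", "outputs": [small, large]}]}

strip_large_outputs(notebook, max_size=1000)
print(len(notebook["cells"][0]["outputs"]))  # prints 1: only the small output remains
```

Filtering on the text/html key would have removed both outputs or neither, while the size threshold keeps the small one.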

I've run some ad-hoc benchmarks, and when the option is enabled, computing the size of the outputs adds a negligible amount to the overall run time (~2%-4%).

Thanks!

cjblocker commented 3 years ago

@kynan Great, I have added your suggestions. Let me know if there is anything else!