Open cmungall opened 3 months ago
I added support for converting Jupyter Notebooks on-the-fly to my Bun/TypeScript port. I'd love to hear if this is useful.
Update: I added a crazy fast (100x faster than nbconvert) internal parser to the script (use with --nbconvert internal). I tested it with a medium-sized folder of notebook files and could not spot obvious problems in the output. But please report any problems with this new feature.
Background: Turns out Jupyter Notebooks are just JSON, and if Bun/Node are good at anything, then it's munching through this. The complete, minified script is still ~6.1k bytes btw.
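To illustrate the "just JSON" point, here is a minimal sketch of how a notebook could be turned into Markdown without nbconvert. It is not the actual parser from the script: the function name notebookToMarkdown and the interfaces are hypothetical, and it assumes the nbformat v4 layout (a top-level cells array whose entries have cell_type and source, with source stored either as a string or as a list of lines). All outputs are dropped.

```typescript
// Minimal subset of the nbformat v4 JSON structure; only the fields used here.
interface NotebookCell {
  cell_type: "markdown" | "code" | "raw";
  source: string | string[];
}

interface Notebook {
  cells: NotebookCell[];
  metadata?: { language_info?: { name?: string } };
}

// `source` may be stored as a single string or as a list of lines.
function joinSource(source: string | string[]): string {
  return Array.isArray(source) ? source.join("") : source;
}

// Hypothetical helper (not from the original script): render a notebook's
// cells as Markdown and ignore every output.
function notebookToMarkdown(raw: string): string {
  const nb = JSON.parse(raw) as Notebook;
  // Assumption: fall back to "python" when the kernel language is not recorded.
  const lang = nb.metadata?.language_info?.name ?? "python";
  const fence = "`".repeat(3);
  const parts: string[] = [];
  for (const cell of nb.cells) {
    const text = joinSource(cell.source);
    if (text.trim() === "") continue; // skip empty cells
    if (cell.cell_type === "code") {
      parts.push(`${fence}${lang}\n${text}\n${fence}`);
    } else {
      // Markdown and raw cells are passed through unchanged.
      parts.push(text);
    }
  }
  return parts.join("\n\n");
}
```

With Bun, something like notebookToMarkdown(await Bun.file("example.ipynb").text()) would produce the Markdown for one file; example.ipynb is a placeholder name.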
Some numbers:
Directory size:
$ du -sh ~/jupyter/
5,9M /Users/fry/jupyter/
Convert with external nbconvert:
real 0m10,365s
user 0m6,246s
sys 0m0,836s
Convert with internal parser:
real 0m0,111s
user 0m0,067s
sys 0m0,029s
I frequently want to show LLMs my notebooks as examples of working code. My favorite, claude-3-opus, seems to have no issue with the .ipynb format (though I haven't rigorously investigated), but this can still be a waste of tokens, especially if there are a lot of extraneous images and formatting info.
One option would be to use https://nbconvert.readthedocs.io/ to convert to a format like markdown (https://nbconvert.readthedocs.io/en/latest/config_options.html#exporter-options), which is presumably more digestible, at least for LLMs with shorter context lengths.
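As a rough illustration of the token-waste concern, the sketch below removes embedded image payloads from a notebook while keeping text outputs. It is only a sketch under stated assumptions: stripImageOutputs is a hypothetical name, and the code assumes the nbformat v4 convention that rich outputs carry a data object keyed by MIME type (for example image/png holding base64 data).

```typescript
// Only the fields this sketch touches; everything else is passed through.
interface CellOutput {
  output_type: string;
  data?: Record<string, unknown>;
  [key: string]: unknown;
}

interface Cell {
  cell_type: string;
  source: string | string[];
  outputs?: CellOutput[];
  [key: string]: unknown;
}

// Hypothetical helper: drop every image/* entry from output MIME bundles and
// remove outputs that have nothing left afterwards.
function stripImageOutputs(raw: string): string {
  const nb = JSON.parse(raw) as { cells: Cell[]; [key: string]: unknown };
  for (const cell of nb.cells) {
    if (!cell.outputs) continue; // markdown/raw cells have no outputs
    cell.outputs = cell.outputs
      .map((out) => {
        if (!out.data) return out; // stream and error outputs have no MIME bundle
        const kept = Object.fromEntries(
          Object.entries(out.data).filter(([mime]) => !mime.startsWith("image/")),
        );
        return { ...out, data: kept };
      })
      .filter((out) => !out.data || Object.keys(out.data).length > 0);
  }
  return JSON.stringify(nb);
}
```

Text representations such as text/plain survive, so an LLM still sees that a figure was produced without paying for the base64 payload.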
This would probably be an extra/plugin, to keep the core files-to-prompt lean.
A future extension would be to also feed any generated image links into multimodal LLMs: https://github.com/simonw/llm/issues/331