krinsman closed 5 years ago
I'm sure this is not the best way to do it, but e.g. something like:
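import nbformat
from nbconvert.exporters.base import export
from nbconvert.exporters.markdown import MarkdownExporter
from nbconvert.filters import convert_pandoc
# Read the notebook as-is, without upgrading its format version
nbnode = nbformat.read('notebook.ipynb', as_version=nbformat.NO_CONVERT)
# export() returns a (body, resources) tuple; only the Markdown body is needed here
markdown_string, resources = export(MarkdownExporter, nbnode)
# ODT is a binary (zip) format, so let Pandoc write it to a file with -o instead of
# returning it as a string; 'notebook.odt' is just a placeholder output name
convert_pandoc(markdown_string, from_format='markdown', to_format='odt', extra_args=['-o', 'notebook.odt'])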
Something similar to this seemingly should be enough for a "minimum viable product".
Thank you for the feedback and the information you have provided. Our goal for this was to create an export option in the JupyterLab menu for DOCX/ODT. We have taken into account that NBConvert / Pandoc does work for conversion, and we want to actually include that as part of the process in this overall extension. We were thinking about hooking into the Contents API, and handling the conversion there. Thank you for the initial code provided and the information regarding the possible conversion process with NBConvert. We will look into it further and keep you updated!
Oh OK that makes sense.
Yeah, in that case the above probably isn't helpful. To the best of my knowledge, the export options currently built into Lab/Notebook use NBConvert, although I don't quite understand the code that does this (which seems to be here, as you probably already know):
As far as I can tell, the built-in Service Manager, which appears to be what connects to nbconvert in order to export notebooks from the File menu, talks to nbconvert via a REST API:
https://github.com/jupyterlab/jupyterlab/blob/master/packages/services/src/nbconvert/index.ts
Which makes sense actually, since then no server extension is required to run custom Python code (unlike what I suggested above).
The REST API for NBConvert seems undocumented though (as far as I can tell). Also, NBConvert doesn't convert directly to DOCX/ODT. For the Markdown to DOCX/ODT step, its main use seems to be the Pandoc wrapper function, which lets one call Pandoc to do that conversion.
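To make the REST idea a bit more concrete, here is a minimal sketch of the kind of URL I would expect an export request to use. The `/nbconvert/<format>/<path>` layout and the `download` query parameter are assumptions based on reading the server handlers, not documented API, and `nbconvert_export_url` is a hypothetical helper name.

```python
from urllib.parse import quote, urlencode


def nbconvert_export_url(base_url, export_format, notebook_path, download=True):
    """Build an export URL for a notebook (hypothetical helper).

    Assumes the server exposes /nbconvert/<format>/<path>; this layout is an
    assumption, since the REST API does not appear to be documented.
    """
    base = base_url.rstrip('/')
    # Percent-encode the path but keep the directory separators
    path = quote(notebook_path.lstrip('/'))
    url = f"{base}/nbconvert/{quote(export_format, safe='')}/{path}"
    if download:
        # Assumed flag asking the server to send the result as an attachment
        url += '?' + urlencode({'download': 'true'})
    return url


print(nbconvert_export_url('http://localhost:8888/', 'markdown', 'work/my notebook.ipynb'))
```

Since DOCX/ODT are not among NBConvert's built-in formats, a URL like this would only cover the IPYNB to Markdown half; the Pandoc step would still need to happen somewhere else.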
This extension is really useful! It should make Jupyter a lot easier to use for a lot of people.
I vaguely remember you stating during your presentation at UC Berkeley/BIDS that longer term you were looking for how one might convert from notebooks to DOCX and ODT (as well as vice versa).
Is it not already possible to do this though using NBConvert? https://github.com/jupyter/nbconvert/blob/master/nbconvert/filters/pandoc.py
Seemingly the main issue is when either the Markdown or the word processor documents have embedded documents, but apparently this also has a solution.
The function also allows one to add the --extract-media flag using extra_args, e.g.:
pandoc(source, fmt, to, extra_args=['--extract-media=.'])
or using the other convenience method:
convert_pandoc(source, from_format, to_format, extra_args=['--extract-media=.'])
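For illustration, here is a rough sketch of what I understand `extra_args` to do: the extra flags are simply appended to the Pandoc command line. `build_pandoc_command` is a hypothetical helper written for this comment, not part of NBConvert, and the real wrapper may differ in its details.

```python
def build_pandoc_command(fmt, to, extra_args=None):
    """Return the argv list for a Pandoc call.

    Hypothetical helper, not NBConvert's API: it only mirrors the idea that
    extra_args are passed through to Pandoc unchanged.
    """
    cmd = ['pandoc', '-f', fmt, '-t', to]
    if extra_args:
        # Extra flags such as --extract-media go after the format options
        cmd.extend(extra_args)
    return cmd


print(build_pandoc_command('docx', 'markdown', ['--extract-media=.']))
# ['pandoc', '-f', 'docx', '-t', 'markdown', '--extract-media=.']
```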
But anyway, it should then be fairly straightforward to convert:
IPYNB --> Markdown
and then, using NBConvert's API for Pandoc:
Markdown --> ODT/DOCX
Pandoc can also convert ODT or DOCX to Markdown, so it should be possible to go at least halfway in the other direction. According to example 15 here, it is apparently also possible to convert Markdown to IPYNB, but I'm skeptical. At least if one converts from IPYNB to Markdown and then back again, I expect that the resulting notebook will not be the same as the original and will have lost several things (e.g. code cells), even when using --extract-media. But I haven't had the chance to test this yet, so I don't actually know.
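A rough way to check the round-trip concern above, once someone has both notebooks on disk, would be to compare cell counts by type before and after. This is only a sketch: it assumes the nbformat 4 JSON layout (a top-level `cells` list whose entries have a `cell_type`), and `cell_type_counts`, `lost_cells` and `load_notebook` are hypothetical helper names. It says nothing about whether the cell contents survived.

```python
import json
from collections import Counter


def cell_type_counts(notebook):
    """Count cells by type in a notebook given as a parsed-JSON dict."""
    return Counter(cell.get('cell_type', 'unknown') for cell in notebook.get('cells', []))


def lost_cells(original, roundtripped):
    """Return, per cell type, how many cells are missing after the round trip."""
    before = cell_type_counts(original)
    after = cell_type_counts(roundtripped)
    return {kind: n - after[kind] for kind, n in before.items() if n > after[kind]}


def load_notebook(path):
    """Load an .ipynb file as plain JSON (no nbformat validation)."""
    with open(path, encoding='utf-8') as handle:
        return json.load(handle)


# Tiny in-memory example: the code cell was flattened into the Markdown cell
original = {'cells': [{'cell_type': 'markdown', 'source': '# Title'},
                      {'cell_type': 'code', 'source': 'print(1)'}]}
roundtripped = {'cells': [{'cell_type': 'markdown', 'source': '# Title\n\n    print(1)'}]}
print(lost_cells(original, roundtripped))  # {'code': 1}
```

With real files, one would call `lost_cells(load_notebook('original.ipynb'), load_notebook('roundtripped.ipynb'))`; both file names are placeholders.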