cholmcc commented 4 years ago

The Feature

I think it would be great if nbconvert could export notebooks to EPUB or a similar format. That would allow a more "book"-like export and make the result friendlier to mobile devices.

The current hack

I've hacked around a bit, and the solution I've come up with goes like this:

1. Use nbconvert --to html with a custom template
2. Run ebook-convert (from Calibre) on the generated HTML to make the EPUB

MathJax

First of all, ebook-convert does not like MathJax being loaded through a script tag, i.e., the line

    <script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/latest.js?config=TeX-AMS_HTML"></script>

put in by the mathjax macro in mathjax.tpl. If that line is present in the converted HTML, we get tons of lines like

INFO: blob:clbr://internal.sandbox/a1fe84df-3dfe-49a2-a50b-215d02400257:43: WARNING: Failed to resolve MathJax file: [MathJax]/jax/input/TeX/jax.js
INFO: blob:clbr://internal.sandbox/a1fe84df-3dfe-49a2-a50b-215d02400257:43: WARNING: Failed to resolve MathJax file: [MathJax]/jax/output/HTML-CSS/jax.js
INFO: blob:clbr://internal.sandbox/a1fe84df-3dfe-49a2-a50b-215d02400257:43: WARNING: Failed to resolve MathJax file: [MathJax]/jax/element/mml/jax.js

when viewing the EPUB with ebook-viewer. Thus, I override the macro by placing the following in a mathjax.tpl somewhere on the template load path

{%- macro mathjax(url='') -%}
<script type="text/x-mathjax-config">
    MathJax.Hub.Config({
        tex2jax: {
            inlineMath: [ ['$','$'], ["\\(","\\)"] ],
            displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
            processEscapes: true,
            processEnvironments: true
        },
        displayAlign: 'center',
        "HTML-CSS": {
            styles: {'.MathJax_Display': {"margin": 0} },
            linebreaks: { automatic: true }
        },
        TeX: { equationNumbers: { autoNumber: "AMS" } }
    });
</script>   
{%- endmacro %}

Page layout

nbconvert fixes up the style emitted so that it looks similar to the notebook. This causes some issues with the generated EPUB. I define a custom template epub.tpl with

{%- extends 'full.tpl' -%}

{%- block html_head -%}
{{ super() }}
<style>
  @media (min-width: 768px) {
    .container {
      width: initial;
    }
  }
  @media (min-width: 992px) {
    .container {
      width: initial;
    }
  }
  @media (min-width: 1200px) {
    .container {
      width: initial;
    }
  }
  body {
    overflow: initial;
    position: initial;
    top: initial;
    left: initial;
    right: initial;
    bottom: initial;
  }
  #notebook-container {
    overflow: initial;
    box-shadow: none;
    padding: initial;
  }
</style>
{%- endblock html_head -%}

Conversion

I first generate the HTML file with the above templates

jupyter nbconvert --to html --template epub.tpl Test.ipynb  --output=Test.epub.html

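and then I use ebook-convert to generate the EPUB. The command below is a Makefile recipe, so $< stands for the input HTML file and $@ for the output EPUB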

ebook-convert $< $@ \
--embed-all-fonts \
--change-justification justify \
--chapter '//h:h1' \
--page-breaks-before '//h:h1' \
--no-default-epub-cover \
--level1-toc '//h:h1' \
--level2-toc '//h:h2' \
--toc-threshold 0 \
--use-auto-toc \
--epub-version 3
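
For anyone who prefers a script over a Makefile, the two steps above could be driven from Python roughly as follows. This is only a sketch: the function names build_commands and convert are made up for illustration, it assumes that jupyter and ebook-convert are on the PATH and that nbconvert can find epub.tpl, and it passes only a subset of the ebook-convert options listed above.

```python
import subprocess
from pathlib import Path


def build_commands(notebook, template="epub.tpl"):
    """Return the two command lines: nbconvert first, then ebook-convert.

    File names are relative to the notebook's directory, so the commands
    are meant to be run with that directory as the working directory.
    """
    nb = Path(notebook)
    html = nb.with_suffix(".epub.html").name   # Test.ipynb -> Test.epub.html
    epub = nb.with_suffix(".epub").name        # Test.ipynb -> Test.epub
    to_html = [
        "jupyter", "nbconvert", "--to", "html",
        "--template", template, nb.name, f"--output={html}",
    ]
    to_epub = [
        "ebook-convert", html, epub,
        "--chapter", "//h:h1",
        "--page-breaks-before", "//h:h1",
        "--level1-toc", "//h:h1",
        "--level2-toc", "//h:h2",
        "--no-default-epub-cover",
        "--epub-version", "3",
    ]
    return to_html, to_epub


def convert(notebook):
    """Run both steps; raises CalledProcessError if either tool fails."""
    workdir = Path(notebook).resolve().parent
    for command in build_commands(notebook):
        subprocess.run(command, check=True, cwd=workdir)
```

Calling convert("Test.ipynb") would then be the rough equivalent of the Makefile rule. Passing the commands as lists avoids any shell quoting issues with the XPath expressions.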

The final EPUB is reasonable. Note that the auto-numbering of equations is broken, as is the use of \ref. Both are most likely because ebook-convert generates MathML and embeds that in the generated EPUB.

I attach an example Test.zip

Unpack and do

make view

Yours,

Christian

mgeier commented 4 years ago

As an alternative you can try nbsphinx, but I guess it will have many similar problems. Here's an example EPUB you can check out: https://nbsphinx.readthedocs.io/_/downloads/en/latest/epub/

MSeal commented 4 years ago

If you wanted to put together a PR for the feature, even if it's limited support, I'd be ok with helping review / merge it. This would end up looking a lot like the PDF export, where we use LaTeX programs to do the heavy lifting. They also come with some constraints, so some export patterns don't work, but it's better than not having the option at all. I don't see any problem with adding it as an export option in a similar way. If I or other maintainers can help fix small issues along the way, we can at least have a base to start from.

cholmcc commented 2 years ago

Hi all,

Coming back to this issue, I decided to go another way.

What I did was to implement an Exporter class which does everything in one pass and uses the standard HTML templates. I attach the code below.

The strategy is as follows.

EPUBExporter derives from HTMLExporter. So the first thing is to convert the NB into HTML using that base class and with the configuration for that base class. That gives us an HTML page to work with.
Next, in particular because EPUB does not like JavaScript, we need to render the page in a headless browser. I use selenium for that as it is readily available on many platforms (unlike pyppeteer). But we also need to
- Turn all MathJax into SVG
- Wait until MathJax has rendered all math in the page

So I fix up the MathJax configuration and ask it to make a tag in the page once it is done rendering. We can then wait for that tag to appear before we grab the page from the headless browser.
That gives us a new HTML page. This page is not yet suitable for further processing. In particular, calibre - the EPUB encoder used - does not really like CSS variables, and some of the styling added by the templates breaks formatting in EPUB readers (see also above). We therefore go through the HTML page and remove external JS links, collect all styles, expand variables, convert all SVGs to embedded images (either SVG or PNG) and so on.
That gives us yet another version of the HTML page, which is suitable to pass on to calibre.
For converting to the final EPUB, we load the calibre modules we need and execute those on the third version of the page. That gives us the final EPUB output, which we pass on to the writer in nbconvert.
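
To illustrate the "wait for a tag" step, here is a rough sketch of how it could look. It is not the code from the attached exporter: the marker id mathjax-done and the function names add_done_marker and render are made up, and it assumes that the page uses MathJax v2 (whose "End" startup hook runs after the initial typesetting pass) and that a Chrome or Chromium browser usable by selenium is installed. Switching the MathJax output to SVG is left out.

```python
import re

DONE_ID = "mathjax-done"  # hypothetical marker id, chosen for this sketch

# MathJax v2 executes "text/x-mathjax-config" blocks at startup. The "End"
# startup hook is called once the initial typesetting pass has finished.
MARKER_SCRIPT = """<script type="text/x-mathjax-config">
MathJax.Hub.Register.StartupHook("End", function () {
    var done = document.createElement("div");
    done.id = "%s";
    document.body.appendChild(done);
});
</script>""" % DONE_ID


def add_done_marker(html):
    """Insert the marker script right after the opening <head> tag."""
    new_html, count = re.subn(
        r"(<head(?:\s[^>]*)?>)",
        lambda match: match.group(1) + "\n" + MARKER_SCRIPT,
        html, count=1, flags=re.IGNORECASE,
    )
    if count == 0:
        raise ValueError("no <head> tag found")
    return new_html


def render(html_path, timeout=60):
    """Load the page in headless Chrome and return the DOM after MathJax ran."""
    # Imported here so add_done_marker() can be used without selenium installed.
    from pathlib import Path
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(Path(html_path).resolve().as_uri())
        # Block until the marker element shows up, or time out.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.ID, DONE_ID))
        )
        return driver.page_source
    finally:
        driver.quit()
```

The idea is to write the output of add_done_marker to a temporary file and pass that file to render. Waiting on an explicit marker is more reliable than sleeping for a fixed time, since rendering time grows with the amount of math in the notebook.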

The exporter tries to follow the ideas of other exporters. For example, if the title or authors are set in the metadata of the NB, then that is propagated to the EPUB. The exporter adds a number of specific configurations as well as a more generic configuration which allows the user to configure calibre (see the ebook-convert documentation).
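
As an illustration of the metadata propagation, the sketch below maps notebook metadata to ebook-convert metadata options. It is not taken from the attached code: the function name metadata_to_calibre_args is made up, and the metadata layout (a "title" string and an "authors" list of objects with a "name" key) is an assumption based on the convention used for nbconvert's LaTeX export.

```python
import json


def metadata_to_calibre_args(notebook_json, publisher=None):
    """Build ebook-convert metadata options from a notebook's JSON text."""
    notebook = json.loads(notebook_json)
    metadata = notebook.get("metadata", {})
    args = []

    title = metadata.get("title")
    if title:
        args += ["--title", title]

    # Assumed layout: "authors": [{"name": "..."}, ...]
    names = [
        author["name"]
        for author in metadata.get("authors", [])
        if isinstance(author, dict) and author.get("name")
    ]
    if names:
        # ebook-convert expects multiple authors separated by ampersands.
        args += ["--authors", " & ".join(names)]

    if publisher:
        args += ["--publisher", publisher]
    return args
```

Notebooks without a title or authors simply produce no options, so calibre falls back to its own defaults.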

Apart from the code of EPUBExporter, I also attach an example NB and the resulting EPUB generated with

jupyter nbconvert --to path.to.epub.EPUBExporter --EPUBExporter.publisher="My publishing, Ltd." --no-input NewtonSecondLaw.ipynb

(note: to execute the NB you need to pip install nbi-stat).

The example and code are in epub.zip, since I think the interface does not allow ipynb and py files.

The exporter requires

BeautifulSoup
selenium
cairosvg
calibre
chromium

The code checks up front that all of these are present before executing.
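
An up-front check of this kind could look roughly like the sketch below. It is not the check from the attached code: the function name missing_dependencies is made up, and the module and program names are assumptions (BeautifulSoup is imported as bs4, calibre is detected here through its ebook-convert program, and the browser executable may be called something other than chromium on a given system).

```python
import importlib.util
import shutil

# Assumed names; adjust for the local installation.
REQUIRED_MODULES = ("bs4", "selenium", "cairosvg")
REQUIRED_PROGRAMS = ("ebook-convert", "chromium")


def missing_dependencies(modules=REQUIRED_MODULES, programs=REQUIRED_PROGRAMS):
    """Return a description of every dependency that cannot be found."""
    # find_spec() returns None for a top-level module that is not installed.
    missing = [
        f"Python module '{name}'"
        for name in modules
        if importlib.util.find_spec(name) is None
    ]
    # shutil.which() returns None if the program is not on the PATH.
    missing += [
        f"program '{name}'"
        for name in programs
        if shutil.which(name) is None
    ]
    return missing
```

An exporter could call this before doing any work and raise an error listing everything returned, so the user sees all missing pieces at once rather than one failure per run.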

You are welcome to use this if you want.