Empty anchor_link_text breaks html exporter

jupyter / nbconvert

Jupyter Notebook Conversion

https://nbconvert.readthedocs.io/

BSD 3-Clause "New" or "Revised" License


Empty anchor_link_text breaks html exporter #935

Open akhmerov opened 5 years ago

akhmerov commented 5 years ago

To reproduce on nbconvert 5.4, create a notebook with a single markdown cell that has the following contents:

# header

main text

Then execute jupyter nbconvert --to html test.ipynb --HTMLExporter.anchor_link_text=""

Inspect the output and observe that "main text" is not visible (see image below). The resulting HTML is:

<h1 id="header">header<a class="anchor-link" href="#header" /></h1><p>main text</p>

I believe the self-closing short form of the <a> tag is causing the problem.

NB: Is there an alternative way to turn off the anchor links?

jaypeedevlin commented 5 years ago

I have this issue too. A bit of research led me to discover that only void elements are allowed to be in this short form, and <a> is not a void element, which means this is not producing valid HTML.
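The short form can be reproduced outside nbconvert. The sketch below assumes (this is not confirmed in the thread) that the anchor is built and serialized as XML with Python's xml.etree.ElementTree; render_header is a hypothetical helper written only for this illustration. An element with empty text is written as <a ... />, and passing short_empty_elements=False forces an explicit closing tag instead.

```python
from xml.etree import ElementTree
from xml.etree.ElementTree import Element


def render_header(anchor_link_text, short_empty_elements=True):
    """Hypothetical helper: build <h1>header<a .../></h1> and serialize it."""
    h = Element("h1", {"id": "header"})
    h.text = "header"

    a = Element("a", {"class": "anchor-link", "href": "#header"})
    a.text = anchor_link_text  # an empty string counts as "no text"
    h.append(a)

    # The default XML serializer collapses empty elements to the short form.
    return ElementTree.tostring(
        h, encoding="unicode", short_empty_elements=short_empty_elements
    )


print(render_header(""))                              # short form, invalid for <a> in HTML
print(render_header("", short_empty_elements=False))  # explicit <a ...></a>
```

In the HTML syntax a trailing slash on a non-void element such as <a> is ignored, so the element stays open and can swallow the content that follows it, which would be consistent with the missing "main text".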

Looking at https://github.com/jupyter/nbconvert/blob/275f7f909a803560e2459642a539874113fba959/nbconvert/filters/strings.py#L94 I wonder whether checking for an empty string and returning the unmodified html would work here — I don't fully understand the format of what's returned by this function to know whether that would break things or not.
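A rough sketch of that idea follows. It is not the actual nbconvert code: add_anchor_sketch is a hypothetical, simplified stand-in, and the id generation (spaces replaced by hyphens) is an assumption made only to keep the example self-contained.

```python
from xml.etree import ElementTree
from xml.etree.ElementTree import Element


def add_anchor_sketch(html, anchor_link_text="¶"):
    """Hypothetical, simplified stand-in for the header-anchor filter."""
    # Proposed guard: with no link text, return the header unmodified.
    if not anchor_link_text:
        return html

    try:
        h = ElementTree.fromstring(html)
    except ElementTree.ParseError:
        # Input that is not well-formed XML is passed through unchanged.
        return html

    # Simplified id generation (assumption): header text, spaces -> hyphens.
    link = "".join(h.itertext()).strip().replace(" ", "-")
    h.set("id", link)

    a = Element("a", {"class": "anchor-link", "href": "#" + link})
    a.text = anchor_link_text
    h.append(a)
    return ElementTree.tostring(h, encoding="unicode")


print(add_anchor_sketch("<h1>header</h1>", ""))  # unchanged, no anchor
print(add_anchor_sketch("<h1>header</h1>"))      # anchor with the default pilcrow
```

One trade-off of returning early in this sketch: the header also loses its id attribute, so in-page links to it would stop working. Placing the guard just before the anchor is appended would keep the id.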

soutogustavo commented 2 years ago

I've got the same problem. In my case, after converting the Jupyter notebook to HTML, I convert the HTML file to PDF, and that is where I run into the problem with the ¶ (pilcrow).

I created a config.py file that contains the following code:

c = get_config()
c.HTMLExporter.anchor_link_text = ' ' # not empty, but a simple space!

Convert the notebook to HTML: jupyter nbconvert --no-input my_notebook.ipynb --to html --config config.py

Then convert the HTML to PDF: pandoc my_notebook.html -o my_pdf_notebook.pdf -V fontsize=12pt -V geometry:a3paper -V geometry:margin=1in --metadata=title="" --toc
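Another possible workaround, not taken from this thread, is to strip the anchor links from the generated HTML before handing it to pandoc. The sketch below is a minimal example under stated assumptions: strip_anchor_links is a hypothetical helper, and the regular expression assumes the anchors look like the one in the original report (class="anchor-link" as the first attribute, no nested tags inside the link).

```python
import re

# Matches both the self-closing form <a class="anchor-link" ... />
# and the regular form <a class="anchor-link" ...>text</a>.
ANCHOR_RE = re.compile(
    r'<a\s+class="anchor-link"[^>]*?(?:/>|>.*?</a>)',
    flags=re.DOTALL,
)


def strip_anchor_links(html: str) -> str:
    """Hypothetical helper: remove header anchor links from exported HTML."""
    return ANCHOR_RE.sub("", html)


broken = '<h1 id="header">header<a class="anchor-link" href="#header" /></h1><p>main text</p>'
print(strip_anchor_links(broken))
```

To use it, read my_notebook.html, pass the text through strip_anchor_links, write the result back, and then run the pandoc command above. A regular expression is fragile against changes in the exported markup, so treat this as a stopgap, not a fix.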