jupyter / nbconvert

Jupyter Notebook Conversion
https://nbconvert.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
html tables in markdown are not rendered in latex #241

Open amueller opened 8 years ago

amueller commented 8 years ago

All <table> formatting seems completely lost, and each cell is surrounded by newlines.

amueller commented 8 years ago

Hm, I guess HTML is generally punted? Converting to rst inserts a whole bunch of .. raw:: html directives (maybe it should just add one?). Is this out of scope? Should I use markdown instead?

takluyver commented 8 years ago

I think it's whatever pandoc does. I'm a bit surprised that it doesn't make an effort to convert HTML tables to Latex, though.

amueller commented 8 years ago

I had a pretty trivial table. Using markdown it creates a nice booktabs thing in the pdf, using html it creates the content of each cell one by one in a line. So maybe this is a pandoc issue?

This was not my main issue, but it is a bit inconvenient for pandas dataframes, which are output as HTML by default. I just tried

from IPython.display import Latex
Latex(data.to_latex())

but that didn't result in anything useful (the table was not rendered)

amueller commented 8 years ago

From the pandoc docs:

The raw HTML is passed through unchanged in HTML, S5, Slidy, Slideous, DZSlides, EPUB, Markdown, and Textile output, and suppressed in other formats.

this seems relevant
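Since pandoc drops raw HTML for LaTeX output but does convert markdown tables, one possible workaround is to rewrite simple HTML tables as markdown pipe tables before the cell reaches pandoc. The following is only a minimal sketch using the standard library; `TableToMarkdown` and `html_table_to_markdown` are hypothetical names and not part of nbconvert or pandoc. It assumes a flat table with no nested tables, no colspan/rowspan, and no `|` characters inside cells.

```python
from html.parser import HTMLParser


class TableToMarkdown(HTMLParser):
    """Collect the cell text of a simple, non-nested HTML table."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse internal whitespace so each cell fits on one line.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_markdown(html):
    """Return a pipe table; the first row is treated as the header."""
    parser = TableToMarkdown()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return ""
    width = max(len(row) for row in parser.rows)
    # Pad short rows so every row has the same number of columns.
    rows = [row + [""] * (width - len(row)) for row in parser.rows]
    lines = [
        "| " + " | ".join(rows[0]) + " |",
        "|" + "|".join([" --- "] * width) + "|",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in rows[1:]]
    return "\n".join(lines)


example = (
    "<table><tr><th>a</th><th>b</th></tr>"
    "<tr><td>1</td><td>2</td></tr></table>"
)
print(html_table_to_markdown(example))
```

Anything beyond the cell text (styling, alignment, nested markup) is discarded, so this only suits tables as trivial as the one described above.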

amueller commented 8 years ago

I guess this is somehow related to the possible nesting of markdown and html?

amueller commented 8 years ago

Can you point me to where pandoc is invoked?

takluyver commented 8 years ago

Have a look here: https://github.com/jupyter/nbconvert/blob/master/nbconvert/filters/markdown.py and https://github.com/jupyter/nbconvert/blob/master/nbconvert/utils/pandoc.py

jakobgager commented 8 years ago

Some time ago, we tried to get both raw HTML and raw LaTeX to be processed when converting to LaTeX, see ipython/ipython#3503. However, this got rather complicated and in the end was not included.

amueller commented 8 years ago

Hm, good to know. It seems the old conversation was mostly about formulas, not tables, though, right? Not sure if that makes any difference. I feel that pandas dataframes are a really common object, and not being able to display them seems not great. Also, I was just really, really surprised that there was a difference in rendering an HTML table and a markdown table. They look the same in the notebook (I guess because the markdown is converted to HTML), but markdown is exported sensibly, while HTML is not (the rst generated for a simple HTML table is not great).

amueller commented 8 years ago

Oh, I just realized something else happens to pandas dataframes. Do they output different stuff depending on the backend?

amueller commented 8 years ago

Also, according to the docs, images included via HTML are not rendered because the img tag is stripped. Is there a way to specify the size of an image using markdown [I often use width=100% for example]? It looks to me at the moment that I can either use HTML and specify the size but not have it render in the PDF, or use markdown and not be able to specify the size. Or should the default way to include an image be LaTeX? That also seems odd...

jankatins commented 8 years ago

Oh, I just realized something else happens to pandas dataframes. Do they output different stuff depending on the backend?

Without diving too deeply in here: pandoc has some funny ideas about what is code and what is HTML, and pandas pretty-printed tables were one of the cases where pandoc assumed it was code and not a table. I ended up removing all spaces at the front of each line: https://github.com/JanSchulz/knitpy/blob/master/knitpy/documents.py#L337
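The idea of removing the leading spaces can be sketched as follows. In markdown, a line indented by four or more spaces starts an indented code block, which would explain why indented HTML gets treated as code. This is an illustration only, not the knitpy code linked above; `strip_leading_whitespace` is a hypothetical name.

```python
def strip_leading_whitespace(html):
    """Remove the indentation from every line of an HTML fragment.

    Caveat: this also changes whitespace-sensitive content such as
    the body of a <pre> element.
    """
    return "\n".join(line.lstrip() for line in html.splitlines())


indented = "<table>\n    <tr>\n        <td>1</td>\n    </tr>\n</table>"
print(strip_leading_whitespace(indented))
```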

amueller commented 7 years ago

For dataframes in particular, it would be helpful to embed the latex data in the notebook, which currently doesn't happen. It looks like _repr_latex_ exists but returns None?

Solution: pd.set_option("display.latex.repr", True)
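For context on why a LaTeX repr helps: IPython's rich display looks for methods such as `_repr_html_` and `_repr_latex_` on an object and stores each non-None result in the cell output under its MIME type (text/html, text/latex). As far as I understand, the LaTeX exporter can then use the text/latex entry and never has to convert the HTML. The sketch below shows that protocol on a hypothetical `SmallTable` class (not a pandas type); it does no escaping of LaTeX special characters.

```python
class SmallTable:
    """Hypothetical stand-in for a DataFrame-like object."""

    def __init__(self, header, rows):
        self.header = header
        self.rows = rows

    def _repr_html_(self):
        # What the notebook front end shows in the browser.
        head = "".join(f"<th>{h}</th>" for h in self.header)
        body = "".join(
            "<tr>" + "".join(f"<td>{c}</td>" for c in row) + "</tr>"
            for row in self.rows
        )
        return f"<table><tr>{head}</tr>{body}</table>"

    def _repr_latex_(self):
        # Stored next to the HTML; returning None here would mean
        # no text/latex entry is saved for the output.
        lines = [r"\begin{tabular}{" + "l" * len(self.header) + "}"]
        lines.append(" & ".join(map(str, self.header)) + r" \\")
        lines.append(r"\hline")
        for row in self.rows:
            lines.append(" & ".join(map(str, row)) + r" \\")
        lines.append(r"\end{tabular}")
        return "\n".join(lines)


table = SmallTable(["a", "b"], [[1, 2], [3, 4]])
print(table._repr_latex_())
```

In a notebook, leaving `table` as the last expression of a cell would store both representations in the output.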

kenaycock commented 6 years ago

For dataframes in particular, it would be helpful to embed the latex data in the notebook, which currently doesn't happen. It looks like _repr_latex_ exists but returns None?

Solution: pd.set_option("display.latex.repr", True)

Any idea why display.latex.repr is not True by default?

t-makaro commented 6 years ago

@kenaycock It conflicted with qtconsole. See https://github.com/pandas-dev/pandas/issues/12182

Also: pd.set_option("display.latex.longtable", True) may be useful to people. It'll make sure that pandas tables page break properly in the conversion.

kkmann commented 5 years ago

It would be great to have something like knitr's results='asis' option; tables could then be output as markdown and be truly generic.

firasm commented 4 years ago

I'm having trouble with this solution here, the styled data frame shows up as:

<pandas.io.formats.style.Styler at 0x1202eed90>

I set the option as: pd.set_option("display.latex.repr", True)

Any advice?

m2rik commented 4 years ago

I'm having trouble with this solution here, the styled data frame shows up as:

<pandas.io.formats.style.Styler at 0x1202eed90>

I set the option as: pd.set_option("display.latex.repr", True)

Any advice?

Hey did you solve this by any chance?

timkpaine commented 4 years ago

Not a solution by any means, but some folks might find it helpful: https://github.com/timkpaine/nbconvert_templates/blob/formalize/nbcx_templates/utils/utils.py#L34

ghuname commented 4 years ago

Hey did you solve this by any chance?

I have the same problem. Is there a known workaround?