jupyter / jupyter_markdown

Documentation and tests related to Jupyter's Markdown syntax
BSD 3-Clause "New" or "Revised" License

Automatically obtain the notebook html as it would be rendered in browser with marked #8

Open mpacer opened 6 years ago

mpacer commented 6 years ago

This may require the use of a browser emulating library (such as selenium) or actually creating a Chrome instance that will save the resulting file to disk.
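A rough sketch of what the Selenium route might look like in Python. It assumes the third-party `selenium` package, a local Chrome install, and a notebook server that is already running. It also assumes that rendered Markdown cells carry the `text_cell_render` CSS class, which should be checked against the notebook release under test. The function names and the URL in the usage note are hypothetical.

```python
def make_chrome_driver():
    """Start a headless Chrome session (requires the selenium package)."""
    # Imported lazily so the helper below can be used with any driver-like object.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def rendered_markdown_html(driver, notebook_url, selector=".text_cell_render"):
    """Return the innerHTML of every rendered Markdown cell on the page.

    `selector` is an assumption about the classic notebook's markup.
    No waiting is done here: a real test would need to wait until the
    notebook has finished loading and rendering before reading the DOM.
    """
    driver.get(notebook_url)
    # "css selector" is the value of selenium's By.CSS_SELECTOR constant.
    cells = driver.find_elements("css selector", selector)
    return [cell.get_attribute("innerHTML") for cell in cells]
```

Usage would be something like `rendered_markdown_html(make_chrome_driver(), "http://localhost:8888/notebooks/example.ipynb")`, with whatever authentication token the server requires added to the URL.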

This is going to be hard and I'm not sure how to implement it off the top of my head, but it would be great if we can figure it out.

Related to #7

blink1073 commented 6 years ago

Obtaining the raw HTML (not rendered) should be doable by installing marked and using a highlighter: https://github.com/chjj/marked#highlight, with a setup like this.

blink1073 commented 6 years ago

(from node)

mpacer commented 6 years ago

My concern is that I want to use the exact notebook code paths and not marked directly. In order to have mathjax work nicely, we do some weird yanking and replacing of text in classic notebook. That shouldn't mess with whitespace, but it might. It's also not the only thing that we do (e.g., we preprocess headers to add standardised ids).
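To illustrate the kind of yank-and-replace step meant here, a simplified Python sketch follows. It is not the notebook's actual code: the regular expression, the `@@N@@` placeholder format, and the function names `remove_math` and `replace_math` are all assumptions made for the example. The idea is only that math spans are pulled out before the Markdown renderer runs, so that characters such as `_` are not read as emphasis, and are put back into the resulting HTML afterwards.

```python
import re

# Display math ($$...$$) is tried first, then inline math ($...$).
# This pattern is a simplification chosen for the example.
MATH_RE = re.compile(r"\$\$.+?\$\$|\$[^$\n]+?\$", re.DOTALL)
PLACEHOLDER_RE = re.compile(r"@@(\d+)@@")


def remove_math(text):
    """Swap each math span for a numbered placeholder; return (text, stash)."""
    stash = []

    def _stash(match):
        stash.append(match.group(0))
        return "@@{}@@".format(len(stash) - 1)

    return MATH_RE.sub(_stash, text), stash


def replace_math(html, stash):
    """Put the stashed math spans back into the rendered HTML."""
    return PLACEHOLDER_RE.sub(lambda m: stash[int(m.group(1))], html)
```

Any text outside the math spans passes through both functions unchanged, whitespace included. Whether the real notebook code path preserves whitespace in the same way is exactly what would need testing.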

Also, by using the notebook code directly we will be setting ourselves up for being consistently "in sync" with whatever the latest release of the notebook is and will allow us to see how notebook changes will affect these tests without needing to call marked separately.

Also, I'm not sure what you mean by getting the not rendered html.

We want the html generated from rendering markdown cells. It is actually better if it is the rendered DOM object rather than the source text since if the rendered DOM object is different from the source raw html, we're going to run into inconsistencies between how the notebook looks to users and what our tests think it looks like to users. I may misunderstand your point though… so all that I'm saying there might be irrelevant.
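One hedged way to check whether two HTML strings (say, the serialised rendered DOM and the source HTML) differ in structure rather than only in formatting is to reduce both to a normalised event stream and compare those. The sketch below uses only Python's standard `html.parser`; the `normalize` helper is hypothetical, and collapsing whitespace is a design choice that a stricter test might not want. It is not a real DOM comparison, since a browser may also repair or restructure markup.

```python
from html.parser import HTMLParser


class _Normalizer(HTMLParser):
    """Collect tags, attributes, and text as a list of comparable events."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Sort attributes by name so that attribute order does not matter.
        ordered = tuple(sorted(attrs, key=lambda pair: pair[0]))
        self.events.append(("start", tag, ordered))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Collapse runs of whitespace and drop whitespace-only text.
        text = " ".join(data.split())
        if text:
            self.events.append(("text", text))


def normalize(html):
    """Return the normalised event list for an HTML fragment."""
    parser = _Normalizer()
    parser.feed(html)
    parser.close()
    return parser.events
```

Two fragments can then be compared with `normalize(a) == normalize(b)`, which ignores attribute order and whitespace differences but flags added, removed, or reordered elements.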

blink1073 commented 6 years ago

Fair enough, I thought you might be looking for an HTML document as an artifact, and not necessarily a DOM structure.