HTML gets double-parsed

urbit / urbit.org

The source for urbit.org

https://urbit.org

MIT License

HTML gets double-parsed #1358

Closed tinnus-napbus closed 2 years ago

tinnus-napbus commented 2 years ago

either just HTML or all markdown in general seems to get double-parsed.

For example, in this doc I had to entity-encode the HTML entities themselves, so `&lt;` is written as `&amp;lt;`, in order for it to render correctly. Additionally, HTML gets parsed as HTML even when it's in fenced codeblocks.

tinnus-napbus commented 2 years ago

ok in preformatted codeblocks it only happens when the language is specified as html, so there's something going on with the syntax highlighter maybe

matildepark commented 2 years ago

@tinnus-napbus Curious why you wrote this section in straight DOM; you can write a table in GitHub Flavored Markdown and it should avoid this?

tinnus-napbus commented 2 years ago

you cannot include multiple lines in a single cell in a markdown table unless you do `...<br>...<br>...<br>...`, which is impractical if it's large.
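As a rough sketch of why this gets unwieldy, the snippet below collapses a multi-line value into one table cell. It assumes the line-break workaround is an inline `<br>` tag, as is usual for GitHub-style tables; `gfm_cell` and `gfm_row` are hypothetical helpers, not anything from the site's build.

```python
def gfm_cell(text: str) -> str:
    """Collapse a multi-line string into a single table cell (hypothetical helper)."""
    # A literal pipe would end the cell, so escape it first.
    escaped = text.replace("|", "\\|")
    # A table row must stay on one source line, so join lines with <br>.
    return "<br>".join(line.strip() for line in escaped.splitlines())


def gfm_row(cells: list[str]) -> str:
    """Build one table row from raw cell strings."""
    return "| " + " | ".join(gfm_cell(cell) for cell in cells) + " |"


row = gfm_row(["step one\nstep two\nstep three", "a | b"])
print(row)
```

Every line of a cell ends up on the same source line, so a large cell turns into one very long row that is hard to read and edit.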

This problem also occurs in fenced codeblocks with an html language specification. In my example above I have to double entity-encode the contents of the HTML table, but even if it were a fenced codeblock I'd still have to entity-encode the HTML.

tinnus-napbus commented 2 years ago

also `<br><br><br>` wouldn't work for a fenced codeblock, I don't think

matildepark commented 2 years ago

I think we're literally using two different parsers in our build for some reason. Need to play around with markdown pipeline here. Can you give me original input for those HTML entities to try out?

matildepark commented 2 years ago

As per #1540 note that Markdoc has its own table spec on top of GitHub tables that was written specifically for rich content like code samples — see here. I hand-ported all our tables and it looks fine on that branch.