commonmark / commonmark-spec

CommonMark spec, with reference implementations in C and JavaScript
http://commonmark.org

Inline HTML elements are not wrapped with `<p/>` #492

Closed gfx closed 6 years ago

gfx commented 7 years ago

Problem

There are some inline-level elements, such as <img/>, that are not wrapped with <p/>.

Reproducible Code

input:

<img src="https://example.com/foo.png"/>

got:

<img src="https://example.com/foo.png"/>

expected:

<p><img src="https://example.com/foo.png"/></p>

Comparison: https://johnmacfarlane.net/babelmark2/?normalize=1&text=%3Cimg+src%3D%22https%3A%2F%2Fexample.com%2Ffoo.png%22%2F%3E

Some major Markdown parsers, such as markdown.pl, pandoc, and redcarpet, wrap the image tag with <p/>, but commonmark.js (and libcmark) do not. I think this is a bug.

(p.s.)

Interestingly, <span>foo</span> is wrapped with <p/> even in commonmark.js:

https://johnmacfarlane.net/babelmark2/?normalize=1&text=%3Cspan%3Efoo%3C%2Fspan%3E%0A

aidantwoods commented 7 years ago

The given HTML is interpreted as an HTML block (of type 7), defined here: http://spec.commonmark.org/0.28/#html-blocks

Since it is a block, it doesn't get wrapped in <p> tags. This is mainly because the tag is at the start of the line and is also the only thing on the line. If text had been included before or after it, it would be interpreted as inline (raw) HTML inside a paragraph.

If you wanted to force it, you could include something else on the line, e.g. <img src="foo" />&nbsp;, or any other text/tags.
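To make the rule above concrete, here is a rough Python sketch of the type 7 start condition as I read it in spec 0.28: the line holds one complete open or closing tag (other than script, style, or pre) followed only by whitespace. The function name `starts_type7_html_block` and the regexes are my own illustration, not code from commonmark.js or libcmark, and the sketch ignores the rule that a type 7 block cannot interrupt a paragraph.

```python
import re

# Simplified building blocks, loosely following the spec's "raw HTML" grammar.
TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
ATTR_VALUE = r"""(?:[^\s"'=<>`]+|'[^']*'|"[^"]*")"""
ATTRIBUTE = rf"(?:\s+{ATTR_NAME}(?:\s*=\s*{ATTR_VALUE})?)"
OPEN_TAG = rf"<(?P<open>{TAG_NAME}){ATTRIBUTE}*\s*/?>"
CLOSE_TAG = rf"</(?P<close>{TAG_NAME})\s*>"

# Up to three spaces of indentation, one complete tag, then only whitespace.
TYPE7_LINE = re.compile(rf" {{0,3}}(?:{OPEN_TAG}|{CLOSE_TAG})\s*$")

# Tag names excluded from type 7 in spec 0.28 (they start type 1 blocks instead).
EXCLUDED = {"script", "style", "pre"}


def starts_type7_html_block(line: str) -> bool:
    """Hypothetical helper: True if `line` alone satisfies the type 7 start condition."""
    m = TYPE7_LINE.match(line)
    if m is None:
        return False
    name = (m.group("open") or m.group("close")).lower()
    return name not in EXCLUDED


if __name__ == "__main__":
    for sample in [
        '<img src="https://example.com/foo.png"/>',  # tag alone on the line
        '<img src="foo" />&nbsp;',                   # trailing content after the tag
        "<span>foo</span>",                          # text follows the open tag
    ]:
        print(sample, "->", starts_type7_html_block(sample))
```

Under these assumptions the bare <img/> line counts as an HTML block start, while the &nbsp; variant and <span>foo</span> do not, so the latter two fall through to paragraph parsing and get wrapped in <p>.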

gfx commented 7 years ago

Even though this is what spec v0.28 says, I think the spec is wrong, because:

In addition, because I run a CMS, I cannot impose rules on what users write.

jgm commented 7 years ago

+++ FUJI Goro [Aug 24 17 08:16 ]:

In addition, because I run a CMS, I cannot impose rules on what users write.

If you allow your users to input raw HTML on the CMS, you're leaving them the possibility of writing bad HTML.

Note also that nothing in the HTML spec requires that an img tag be wrapped in a p.

img can be either "flow content" or "phrasing content."

meteorlxy commented 3 years ago

This looks so inconsistent:

image

https://spec.commonmark.org/dingus/?text=%3Cimg%20src%3D%22foo.png%22%3E%0A%0A%3Cimg%0Asrc%3D%22foo.png%22%3E%0A