gettalong / kramdown

kramdown is a fast, pure Ruby Markdown superset converter, using a strict syntax definition and supporting several common extensions.
http://kramdown.gettalong.org

Anchor tags around block elements cause mangled HTML #579

Closed by rbuchberger 5 years ago

rbuchberger commented 5 years ago

Problem: Wrapping a block level element in an anchor tag results in mangled HTML.

This came to light in an issue reported against jekyll-picture-tag.

I've created a small test project to replicate it:

# Gemfile

source 'https://rubygems.org'
gem 'kramdown', '~>2'

# test.rb
require 'kramdown'

puts Kramdown::Document.new(File.read('./test.md')).to_html

When test.md looks like this:

<a href="example.com" markdown="0"><div markdown="0"></div></a>

The output looks like this:

<p><a href="example.com">&lt;div markdown="0"&gt;&lt;/div&gt;</a></p>

Adding or removing markdown="0" doesn't seem to have any effect. This problem shows up in a number of cases:

<!-- input: -->
<a href="example.com">
  <picture>
    <source srcset="image.webp 1.0x, image2.webp 1.5x" type="image/webp" />
    <source srcset="image.jpg 1.0x, image2.jpg 1.5x" type="image/jpeg" />
    <img alt="alt text" src="fallback.jpg" />
  </picture>
</a>

<!-- output: -->
<p><a href="example.com"></a></p>
<picture>
    <source srcset="image.webp 1.0x, image2.webp 1.5x" type="image/webp" />
    <source srcset="image.jpg 1.0x, image2.jpg 1.5x" type="image/jpeg" />
    <img alt="alt text" src="fallback.jpg" />
  </picture>
<p>&lt;/a&gt;</p>

<!-- input: -->
[<div markdown="0"></div>](example.com)

<!-- output: -->
<p><a href="example.com">&lt;div markdown="0"&gt;&lt;/div&gt;</a></p>

If you wrap the whole thing in a div, you don't get the same issue:

<!-- input: -->
<div><a href="example.com"><div></div></a></div>

<!-- output: -->
<div><a href="example.com"><div></div></a></div>

gettalong commented 5 years ago

This happens because <div> is a block level tag, not an inline/span level tag. In kramdown's model, a block level tag can't be nested inside span level elements like anchors.

Also, kramdown has to decide what to do with HTML elements. If it encounters a span level HTML tag like <a>, it assumes that the tag starts a paragraph. Therefore the <a>...</a> line gets wrapped in a <p> tag.

So if you want to use HTML elements in a way that contradicts the conceptual model of kramdown, you have to tell kramdown to ignore those lines using the {::nomarkdown} extension:

{::nomarkdown}
<a href="example.com" markdown="0"><div markdown="0"></div></a>
{:/}
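
If the HTML is generated by a plugin rather than written by hand, the wrapping above could be done in code. The following is only a minimal sketch: `nomarkdown_block` is a hypothetical helper name, not part of kramdown, and the surrounding Markdown text is made up for illustration. It assumes the HTML does not itself contain the closing `{:/}` marker.

```ruby
# Hypothetical helper (not part of kramdown): wraps raw HTML in the
# {::nomarkdown} extension so that kramdown leaves those lines alone.
# Assumption: the HTML does not itself contain the closing marker {:/}.
def nomarkdown_block(html)
  "{::nomarkdown}\n#{html.strip}\n{:/}\n"
end

snippet = '<a href="example.com"><div></div></a>'

# Blank lines keep the wrapped block separate from neighbouring paragraphs.
source = "Some *Markdown* text.\n\n#{nomarkdown_block(snippet)}\nMore text.\n"
puts source

# Rendering needs the kramdown gem; this step is skipped if it is missing.
begin
  require 'kramdown'
  puts Kramdown::Document.new(source).to_html
rescue LoadError
  warn 'kramdown gem not available; printed the wrapped source only'
end
```

The helper only builds the wrapped string, so it can be used wherever the HTML is produced, before the text reaches kramdown.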

rbuchberger commented 5 years ago

This is the solution I was looking for, thanks a ton. Sorry I couldn't find it in the docs.