gettalong / kramdown

kramdown is a fast, pure Ruby Markdown superset converter, using a strict syntax definition and supporting several common extensions.
http://kramdown.gettalong.org

Not all HTML entities are supported #114

Closed tomer closed 10 years ago

tomer commented 10 years ago

I am using Jekyll with kramdown. While redcarpet supported all the HTML entities I've used, I've found that kramdown doesn't support some of them and instead just escapes the ampersand character.

I'm using known and documented entities, which should be supported. I guess that if I replace the entity name with its numeric code point it will work, but that is far less readable for anyone reading the unparsed document.

test.md


---

---
" Hello World π &rlm; & & "

_config.yml

markdown: kramdown

result:

<p>" Hello World π &amp;rlm; &amp; &amp; "</p>

Expected result:

<p>" Hello World π &rlm; &amp; &amp; "</p>

(or replace &rlm; with U+200F, which I'm not really a fan of…)

https://gist.github.com/tomer/9703168
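The numeric-reference workaround guessed at above could be sketched as a small preprocessing step run on the Markdown source before it reaches kramdown. This is only an illustration: `NAMED_TO_CODEPOINT` and `numeric_entities` are hypothetical names, the table holds just a few entries, and it assumes that numeric references pass through the converter untouched.

```ruby
# Hypothetical lookup table (not part of kramdown): a few entity names
# mapped to their Unicode code points.
NAMED_TO_CODEPOINT = {
  "rlm" => 0x200F, # right-to-left mark
  "lrm" => 0x200E, # left-to-right mark
  "pi"  => 0x03C0  # Greek small letter pi
}.freeze

# Rewrite named entities found in the table as hexadecimal numeric
# references; names not in the table are left exactly as written.
def numeric_entities(text)
  text.gsub(/&([A-Za-z][A-Za-z0-9]*);/) do |match|
    codepoint = NAMED_TO_CODEPOINT[Regexp.last_match(1)]
    codepoint ? "&#x#{codepoint.to_s(16).upcase};" : match
  end
end

puts numeric_entities("Hello World &rlm; &amp; &unknown;")
```

The source file stays readable because the named form is kept on disk and only the text handed to the converter is rewritten.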

gettalong commented 10 years ago

Thanks for reporting this! I will look through the list and add the missing entities for the next release.

siman-man commented 7 years ago

I think there are a lot of entities not yet supported.

The HTML specifications refer to https://html.spec.whatwg.org/entities.json for the list of named entities.

However, the ENTITY_TABLE defined in entities.rb is missing many of the entities listed there.


For example

require 'kramdown'

text =<<-HTML
&Abreve; &Acy; &Afr; &Amacr; &And;
HTML

puts Kramdown::Document.new(text).to_html

expect

<p>Ă А 𝔄 Ā ⩓</p>

actual

<p>&amp;Abreve; &amp;Acy; &amp;Afr; &amp;Amacr; &amp;And;</p>
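One way to list the gap described above is to diff the names in entities.json against the names a converter already knows. The sketch below is not kramdown code: `missing_entities` and `SPEC_SAMPLE` are hypothetical, the sample is a three-entry excerpt in the same shape as entities.json, and the supported list is passed in as a plain array of names.

```ruby
require "json"

# Tiny excerpt in the same shape as entities.json (hypothetical sample,
# the real file has far more entries).
SPEC_SAMPLE = <<~'JSON'
  {
    "&Abreve;": { "codepoints": [258], "characters": "\u0102" },
    "&Acy;": { "codepoints": [1040], "characters": "\u0410" },
    "&amp;": { "codepoints": [38], "characters": "&" }
  }
JSON

# Return the entity names present in the spec JSON but absent from
# supported_names. Only keys ending in ";" are considered; the leading
# "&" and trailing ";" are stripped before comparing.
def missing_entities(spec_json, supported_names)
  spec_names = JSON.parse(spec_json).keys
                   .select { |key| key.end_with?(";") }
                   .map { |key| key[1..-2] }
  spec_names - supported_names
end

puts missing_entities(SPEC_SAMPLE, ["amp", "quot"]).inspect
```

Running it against the full entities.json file, with the names from the converter's own table as the second argument, would give the complete list of unsupported names.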

gettalong commented 7 years ago

@siman-man All HTML4 entity references should be supported. However, I didn't find any W3C HTML5 entity reference table. The WhatWG spec you referenced doesn't seem to be official.