Closed sebastianseilund closed 1 year ago
@sebastianseilund Thank you for your contribution.
HTML support is quite clumsy anyway, so I am not too shocked by the idea of accepting bad HTML. The problem, however, is that I would not like to introduce a behaviour that would go away again in 1.5, potentially breaking ex_doc for some projects if they rely on such bad HTML (probably more by chance than by design, but these things happen).
Right now I feel I should reject your PR, but I will give it some thought.
It is indeed sad to refuse a PR that would allow parsing more, but I feel the philosophy "Be lenient about what you accept, and strict about what you emit" has not worked out very well for HTML, of all things.
Thanks for taking a look so quickly :)
Was it clear from my original message that Earmark crashes with a “no case clause matching” error when encountering this HTML? It’s not that it returns something incorrect, it’s that it doesn’t return at all. I can’t think of a way anyone relies on it to crash.
When converting user provided Markdown to HTML, the only other alternative I would have is to wrap Earmark.as_html in try/rescue, which doesn’t seem right to me.
Let me know what you think. Thanks!
Sorry, I did not catch that it crashed. I'll definitely fix the crash, and I'll take another look at how you did it :)
I have another PR that I merged and still need to verify, and then you will be next.
Thank you for your work!
OK, I finally got it: you just fixed the crash, and we already accepted that kind of attribute.
Great job, gonna release tonight!!!
Thanks! :)
I had a real-world Markdown document that looked like this:
Note that the meta tag's attributes are not using regular quotes, but left/right double quotes (&ldquo; and &rdquo; in HTML speak). The attributes are therefore unquoted.
Earmark would raise an error:
This is bad Markdown/HTML, but I think Earmark should handle it anyway.
I narrowed the issue down, and found out that it would fail without a space before the ending > in the tag. Something as simple as this would fail:
I attempted to fix the issue here by making the unquoted attribute scanning regex stop when encountering a >.
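To illustrate the idea behind the fix, here is a minimal sketch in Elixir. It assumes the problem comes from an unquoted-value pattern that keeps consuming characters until it hits whitespace. The module name, both patterns, and the sample tag are hypothetical and written only for this illustration; they are not Earmark's actual regexes, nor the original failing input.

```elixir
defmodule UnquotedAttrSketch do
  # Hypothetical patterns for illustration only; these are not Earmark's actual regexes.

  # The value runs until whitespace or a quote, so a directly following ">" is swallowed.
  def greedy(tag) do
    Regex.scan(~r/(\w+)=([^\s"']+)/, tag, capture: :all_but_first)
  end

  # The value additionally stops at ">", leaving the tag's closing bracket alone.
  def bounded(tag) do
    Regex.scan(~r/(\w+)=([^\s"'>]+)/, tag, capture: :all_but_first)
  end
end

IO.inspect(UnquotedAttrSketch.greedy("<meta name=x>"))   # [["name", "x>"]]
IO.inspect(UnquotedAttrSketch.bounded("<meta name=x>"))  # [["name", "x"]]
IO.inspect(UnquotedAttrSketch.bounded("<meta name=x >")) # [["name", "x"]]
```

With the bounded pattern, the result is the same whether or not a space precedes the closing >, which is the behaviour the change aims for.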