mathiasbynens / he

A robust HTML entity encoder/decoder written in JavaScript.
https://mths.be/he
MIT License
3.43k stars, 255 forks

HTML Entity Decoder should require trailing semicolon #8

Closed: FremyCompany closed this issue 11 years ago

FremyCompany commented 11 years ago

Is it by design that '&ampersand' is converted to '&ersand'? If my intuition is correct, it should not be.

mathiasbynens commented 11 years ago

Thanks for taking the time to file issues! I really appreciate it.

The example you give is correct as per the HTML algorithm. Try it in your browser: data:text/html;charset=utf-8,&ampersand.

Some named character references don’t require a trailing semicolon. Such HTML is invalid, but browsers must parse it this way as per the HTML spec, and he aims to be fully spec-compliant and compatible with browsers.

For more info, read about ambiguous ampersands in HTML: http://mathiasbynens.be/notes/ambiguous-ampersands
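
To make the rule above concrete, here is a minimal sketch of how a decoder can end up turning '&ampersand' into '&ersand'. It is not he's actual implementation: the `LEGACY` table and the `decodeLegacy` function are hypothetical names made up for illustration, the table holds only four of the named references that may omit the semicolon, and the sketch only covers text content (attribute values follow different rules).

```javascript
// Hypothetical, heavily reduced table: a few of the named character
// references that the HTML spec lets parsers accept without a trailing ";".
const LEGACY = { amp: '&', lt: '<', gt: '>', quot: '"' };

// Sketch of text-content decoding: match one of the legacy names directly
// after "&", consuming the ";" only if it happens to be present.
function decodeLegacy(text) {
  return text.replace(/&(amp|lt|gt|quot);?/g, (match, name) => LEGACY[name]);
}

console.log(decodeLegacy('&ampersand'));  // '&ersand'
console.log(decodeLegacy('&amp;ersand')); // '&ersand'
```

Both inputs decode to the same string because the parser recognizes `&amp` as soon as it sees it, whether or not a semicolon follows. With the library itself, the corresponding call would be `he.decode('&ampersand')`, which produces the '&ersand' result reported in this issue.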

FremyCompany commented 11 years ago

Wow. This is crazy. Thanks for the notes, I learnt a few things reading them!