mathiasbynens / he

A robust HTML entity encoder/decoder written in JavaScript.
https://mths.be/he
MIT License

Should \u2026 be encoded as hellip, not mldr? #38

revelt closed this issue 9 years ago

revelt commented 9 years ago

Gentlemen, \u2026 is decoded correctly from both the hellip and mldr named entities. However, when encoding, \u2026 is encoded as mldr. I went through quite a few Unicode reference pages, and everywhere the default named entity is given as hellip. Please consider switching the encoding to the hellip named entity, because it is easier to remember and matches the majority of reference sources on the Internet. Thank you.

mathiasbynens commented 9 years ago

he prefers short named references: https://github.com/mathiasbynens/he/blob/86eff45a3e1a337c75950cc430da3e9918e06165/scripts/process-data.js#L24 This is by design.
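To illustrate why a "prefer short names" rule yields mldr, here is a minimal sketch. It is not the code from process-data.js; the helper name pickShortestName is made up for this example, and the only fact it relies on is that both hellip and mldr name U+2026.

```javascript
// Hypothetical helper, not part of he: pick the shortest alias for a symbol.
// On a length tie the first name in the list wins (an assumption of this sketch).
function pickShortestName(names) {
  return names.reduce((best, name) => (name.length < best.length ? name : best));
}

// Both names refer to U+2026 HORIZONTAL ELLIPSIS.
const aliases = ['hellip', 'mldr'];

console.log(`&${pickShortestName(aliases)};`); // prints "&mldr;"
```

Since mldr has four characters and hellip has six, any shortest-name rule picks mldr regardless of which name is more widely known.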

revelt commented 8 years ago

@mathiasbynens — a quick update, fyi

Apparently, unlike &hellip;, &mldr; is not supported by quite a few email clients, including Outlook on Windows and various third-party mail apps such as Airmail for Mac. I'm using he.js as a dependency of detergent.js, where we encode all special characters as part of text preparation. I recently did a new detergent.js release that swaps &mldr; with &hellip; after he.js encoding. We were getting rendering defects from &mldr;.
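A post-encoding swap of this kind could look roughly like the sketch below. This is not the actual detergent.js code; the function name preferHellip is hypothetical, and the input string stands in for output that was already encoded with named references.

```javascript
// Hypothetical post-processing step: rewrite the short alias to the
// more widely supported one after the text has been encoded.
function preferHellip(encoded) {
  return encoded.replace(/&mldr;/g, '&hellip;');
}

console.log(preferHellip('Wait&mldr; what?')); // prints "Wait&hellip; what?"
```

The replacement is safe to run on already-encoded text because a literal ampersand in the source would have been encoded as &amp; first, so the sequence &mldr; can only come from the encoder.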

mathiasbynens commented 8 years ago

@revelt Thanks for the info!

I’d recommend leaving the useNamedReferences setting turned off: https://github.com/mathiasbynens/he#usenamedreferences The README mentions this:

Note that if compatibility with older browsers is a concern, this option should remain disabled.
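For comparison, numeric references sidestep the naming question entirely, because the client only needs to understand the code point. The sketch below is a simplified stand-in, not he's implementation: encodeNumeric is a hypothetical name, and it only handles non-ASCII symbols, leaving characters such as & and < untouched.

```javascript
// Hypothetical, simplified encoder: turn every non-ASCII code point
// into a hexadecimal numeric character reference.
function encodeNumeric(text) {
  // Array.from iterates by code point, so astral symbols stay in one piece.
  return Array.from(text, (ch) => {
    const codePoint = ch.codePointAt(0);
    return codePoint > 0x7f
      ? `&#x${codePoint.toString(16).toUpperCase()};`
      : ch;
  }).join('');
}

console.log(encodeNumeric('Wait\u2026')); // prints "Wait&#x2026;"
```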

revelt commented 8 years ago

@mathiasbynens

That's good in theory, but in practice email developers are often responsible for the final delivery, including the copy, which means they have to read the encoded text in the HTML and understand it. Numeric entities are unrecognisable; that's why I convert to named ones as much as possible...