aredridel / html5

Event-driven HTML5 Parser in Javascript
http://dinhe.net/~aredridel/projects/js/html5/
MIT License
590 stars, 168 forks

Parsing of   entities (and others)? #6

Closed: EmilStenstrom closed this issue 13 years ago

EmilStenstrom commented 13 years ago

var html5 = require('html5');
var parser = new html5.Parser();
parser.parse("<p>&ndash;&thinsp;Om inget görs åt utsläppen...</p>");
console.log(parser.document.innerHTML);
// Expected: <html><head></head><body><p>– Om inget görs åt utsläppen...</p></body></html>
// Actual:   <html><head></head><body><p>– Om inget görs åt utsläppen...</p></body></html>

EmilStenstrom commented 13 years ago

Hm... This looks good when pasted into a browser like this. Maybe it was just an error of my terminal, that it couldn't show that character?

EmilStenstrom commented 13 years ago

Yeah, it was. That character doesn't show in the Cygwin terminal, I take everything back :)

aredridel commented 13 years ago

Indeed. UTF-8 in terminals can be dodgy -- MacOS gets it right, GNOME too mostly. Everything else, be careful.
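
For anyone hitting the same confusion: one way to take the terminal out of the picture is to print the non-ASCII characters as escapes and compare code points instead of glyphs. This is only a rough sketch. `escapeNonAscii` is a made-up helper, not part of html5, and the sample string is typed in by hand rather than taken from the parser, on the assumption that `&ndash;` decodes to U+2013 and `&thinsp;` to U+2009.

```javascript
// Hypothetical helper (not part of the html5 module): replace every
// non-ASCII UTF-16 code unit with a \uXXXX escape, so the result can be
// printed on a terminal that cannot render those characters.
function escapeNonAscii(str) {
  return str.replace(/[^\x00-\x7f]/g, function (ch) {
    // Left-pad the hex code to four digits.
    return '\\u' + ('0000' + ch.charCodeAt(0).toString(16)).slice(-4);
  });
}

// Hand-written stand-in for the parsed output (an assumption, see above):
// U+2013 is the en dash, U+2009 is the thin space.
var sample = '<p>\u2013\u2009Om inget g\u00f6rs \u00e5t utsl\u00e4ppen...</p>';

console.log(escapeNonAscii(sample));
// prints: <p>\u2013\u2009Om inget g\u00f6rs \u00e5t utsl\u00e4ppen...</p>
```

If the escaped output shows the code points you expect, the string is fine and any odd glyphs are down to the terminal or its font.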