twitter / hogan.js

A compiler for the Mustache templating language
http://twitter.github.io/hogan.js
Apache License 2.0
5.14k stars, 431 forks

Use a more exhaustive encoder #193

Closed by tandrewnichols 10 years ago

tandrewnichols commented 10 years ago

We've had trouble with some unusual UTF-8 characters causing parse failures. Would you consider using something like the he library to encode strings?

sayrer commented 10 years ago

There aren't any known bugs with this. If you provide a test case, we can look into it.

tandrewnichols commented 10 years ago

Here are some special characters that we suspect are failing (the page fails to render, but hogan doesn't report which character, so it's hard to know which one is the problem if a page contains more than one unusual character): an invisible line-break character, ', ’, a second invisible character, and &

tandrewnichols commented 10 years ago

Usually, we get these when people have pasted text in from an external application (e.g. Word). They also tend to be spacing characters. One invisible character, some sort of line break, is one we know for sure causes problems. Since a JavaScript string literal can't span multiple lines, a raw line break inside one breaks the parse.

tandrewnichols commented 10 years ago

After some testing: the line separator U+2028 (UTF-8 bytes e2 80 a8) is the problem. The other characters listed above work fine once this one is removed.
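
Since hogan doesn't say which character tripped it up, a small scan of the input can help narrow things down. This is only a rough sketch: `findLineSeparators` is a hypothetical helper name, not part of hogan.js, and it assumes the text is already available as a JavaScript string. It reports the position of every U+2028 (line separator) and U+2029 (paragraph separator).

```javascript
// Hypothetical helper (not part of hogan.js): report where U+2028 and
// U+2029 occur in a string so the offending input can be located.
function findLineSeparators(text) {
  var hits = [];
  var re = /[\u2028\u2029]/g;
  var match;
  while ((match = re.exec(text)) !== null) {
    hits.push({
      index: match.index, // offset in UTF-16 code units
      codePoint: 'U+' + match[0].charCodeAt(0).toString(16).toUpperCase()
    });
  }
  return hits;
}
```

Calling `findLineSeparators(someString)` returns an empty array when the string is clean, so it can also serve as a quick pre-render check.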

tandrewnichols commented 10 years ago

Here's a workaround we're using that works:

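    // Round-trip through JSON so every string in the data is covered at once.
    var str = JSON.stringify(template.data);
    // JSON.stringify emits U+2028 and U+2029 raw; swap each for an escaped "\n".
    str = str.replace(/[\u2028\u2029]/g, '\\n');
    template.data = JSON.parse(str);

sayrer commented 10 years ago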

We just fixed this in issue #185. It's already in version 3.0.2.

tandrewnichols commented 10 years ago

Ok, thanks.

NatoBoram commented 6 years ago

Is it normal that Hulk eats my "Catégorie" in UTF-8 and spits out "Cat├®gorie" in UCS-2 LE BOM or UTF-16 or whatever, which my computer can't read?