CyberCRI / learn-ext

WeLearn Browser Extension
https://welearn.cri-paris.org
MIT License

Text encoding mismatch -- random accented characters #36

Open prashnts opened 5 years ago

prashnts commented 5 years ago

Unfortunately, and inevitably, we've finally reached the text encoding bugs! Yay!

In a report from Didier, the single quote character ' was rendered as &#039 (the HTML numeric character reference for ', i.e. decimal code point 39).
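To make the symptom concrete, here is a minimal Python sketch of what that character reference decodes to. The sample string and the helper name `decode_references` are made up for illustration, and this is not taken from the extension's code; it only shows that `&#039;` is an escaped form of the plain apostrophe.

```python
import html


def decode_references(text: str) -> str:
    """Turn HTML character references such as &#039; back into characters."""
    return html.unescape(text)


# Hypothetical sample resembling the reported rendering.
rendered = "l&#039;école"
print(decode_references(rendered))

# The reference is just the decimal code point of the apostrophe.
print(ord("'"))
```

If the text shows up with the reference still visible, one possibility worth checking is that it was HTML-escaped somewhere and never unescaped (or escaped twice) before display.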

This is a bug: somewhere in the pipeline or in transit we lost the original text encoding. Note that Windows does not default to UTF-8 but to a legacy code page (Windows-1252 in Western European locales, which is close to ISO 8859-1).

In the ASCII range this is not an issue, and since the accented characters are rendered correctly, I can't pinpoint where exactly we lost the encoding.
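For comparison, this is roughly what a UTF-8 versus Windows-1252 mix-up looks like, sketched in Python. The sample text and the function name `misread_utf8_as_cp1252` are hypothetical; the point is only that ASCII characters such as ' survive the round trip while accented characters do not.

```python
def misread_utf8_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as if they were Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


sample = "d'été"
garbled = misread_utf8_as_cp1252(sample)
print(garbled)  # d'Ã©tÃ©

# ASCII-only text is byte-identical in both encodings, so it is unaffected.
print(misread_utf8_as_cp1252("plain ascii") == "plain ascii")  # True
```

A mismatch of this kind would typically garble the accents and leave the quote alone, which is the opposite of the reported symptom, so it may help to compare the raw bytes at each stage of the pipeline.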