wingman-jr-addon closed this 4 weeks ago
Ok, so I think I figured out what triggered this. LinkedIn declares itself as an HTML5 document but does not set a character encoding via charset, meta, etc. In that case I believe the encoding is generally chosen from the locale plus heuristics, which I believe would usually fall back to iso-8859-1/Windows-1252. However, decoding with that encoding fails here and causes mojibake. Catching that specific scenario and only temporarily falling back to utf-8 on a per-chunk basis looks like it resolves the issue.
Test code is on branch https://github.com/wingman-jr-addon/wingman_jr/tree/fallback-to-utf8
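For anyone skimming, here is a rough sketch of the per-chunk idea in Python. It is not the code on the branch above (the addon itself is a browser extension). The function name `decode_chunks`, its parameters, and the detection rule (treat a chunk as UTF-8 whenever it decodes cleanly as strict UTF-8) are all assumptions made for illustration.

```python
import codecs


def decode_chunks(chunks, declared_charset=None, default="cp1252"):
    """Decode an iterable of byte chunks into one string.

    Hypothetical helper: if a charset was declared, use it. Otherwise try
    strict UTF-8 on each chunk and fall back to the default single-byte
    encoding only for chunks that are not valid UTF-8.
    """
    if declared_charset:
        dec = codecs.getincrementaldecoder(declared_charset)(errors="replace")
        text = "".join(dec.decode(c) for c in chunks)
        return text + dec.decode(b"", final=True)

    utf8 = codecs.getincrementaldecoder("utf-8")(errors="strict")
    out = []
    for chunk in chunks:
        try:
            # The incremental decoder buffers a multi-byte sequence that is
            # split across two chunks instead of failing on it.
            out.append(utf8.decode(chunk))
        except UnicodeDecodeError:
            # Not UTF-8: recover any bytes buffered from the previous chunk
            # and decode them together with this chunk using the default.
            pending = utf8.getstate()[0]
            utf8.reset()
            out.append((pending + chunk).decode(default, errors="replace"))

    # Bytes still buffered at the end never completed a UTF-8 sequence.
    leftover = utf8.getstate()[0]
    if leftover:
        out.append(leftover.decode(default, errors="replace"))
    return "".join(out)


# A middle dot (U+00B7) is two bytes in UTF-8; here it is split across chunks.
page = [b"Jobs \xc2", b"\xb7 Messaging"]
print(decode_chunks(page))                 # Jobs · Messaging
print(b"".join(page).decode("cp1252"))     # Jobs Â· Messaging (mojibake)
```

Falling back a whole chunk at a time is coarse: a chunk that mixes valid UTF-8 with stray single-byte characters will be decoded entirely with the default encoding. That trade-off is part of the sketch, not a claim about how the addon behaves.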
Fixed somewhat by #207, at least enough for now.
LinkedIn on 3.3.6: [screenshot]
LinkedIn on 3.4.0: [screenshot]
Note that the dot no longer translates. This seems similar to #199 but that specific case didn't seem to have regressed.