wingman-jr-addon / wingman_jr

This is the official repository (https://github.com/wingman-jr-addon/wingman_jr) for the Wingman Jr. Firefox addon, which filters NSFW images in the browser fully client-side: https://addons.mozilla.org/en-US/firefox/addon/wingman-jr-filter/ Optional DNS-blocking using Cloudflare's 1.1.1.1 for families! Also, check out the blog!
https://wingman-jr.blogspot.com/

Add fallback to utf-8 when character encoding is unspecified and iso-8859-1 encoding fails on a chunk #207

Closed wingman-jr-addon closed 4 weeks ago

wingman-jr-addon commented 4 weeks ago

Note that this is still less than ideal. I looked into the matter further, and it appears that when no character encoding is specified you are essentially supposed to default to one based on the locale. Here the locale is en, so the default should be iso-8859-1. However, even in the bug report, LinkedIn usually appears to default to utf-8, although I don't see any of the usual markers indicating that it should switch. Not doing so eventually leads to a 0x92 getting misencoded if one is present on the page. The expected default also seems to have changed over the years. For example, consider test 4 here: https://www.w3.org/2006/11/mwbp-tests/test-encoding-4.html This seems to indicate that utf-8 might be the default, but interestingly the test "fails" with vanilla Firefox and Chrome. Wingman Jr., however, "passes" it, though perhaps it should not.

I'll continue to watch the situation, as it may indeed be better to default to utf-8, but for now this fallback still fixes a couple of edge cases.
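
For readers who want a concrete picture of the fallback, here is a minimal sketch of one way it could be structured. It is not the addon's actual code: `encodeLatin1Strict` and `encodeChunk` are hypothetical names, and the sketch assumes that "iso-8859-1 encoding fails" means the chunk contains a character with no single-byte representation. It only handles an unspecified charset, `utf-8`, and `iso-8859-1`.

```javascript
// Hypothetical helper: strict iso-8859-1 encoder. Throws when a character
// cannot be represented in a single byte (code unit above 0xFF).
function encodeLatin1Strict(text) {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code > 0xff) {
      throw new RangeError('character at index ' + i + ' is not representable in iso-8859-1');
    }
    bytes[i] = code;
  }
  return bytes;
}

// Hypothetical helper: encode one chunk of text.
// declaredCharset is undefined (unspecified), 'utf-8', or 'iso-8859-1'.
function encodeChunk(text, declaredCharset) {
  const asUtf8 = () => ({ encoding: 'utf-8', bytes: new TextEncoder().encode(text) });

  if (declaredCharset && declaredCharset.toLowerCase() === 'utf-8') {
    return asUtf8();
  }
  try {
    // Locale-based default (assumed here to be iso-8859-1 for "en").
    return { encoding: 'iso-8859-1', bytes: encodeLatin1Strict(text) };
  } catch (err) {
    // A declared charset is respected, so the failure is surfaced as-is.
    if (declaredCharset) {
      throw err;
    }
    // Charset was unspecified: fall back to utf-8 for this chunk.
    return asUtf8();
  }
}

// Usage: plain text stays single-byte, a curly apostrophe triggers the fallback.
console.log(encodeChunk('abc').encoding);
console.log(encodeChunk('it\u2019s').encoding);
```

In this sketch the fallback is decided per chunk, so earlier chunks may already have been emitted as iso-8859-1 by the time a later chunk switches to utf-8. That is one reason a fallback like this is less than ideal compared with choosing the right default up front.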