Encoding then Decoding to ISO-8859-15 error

sigoofballde commented 6 years ago

I am using the encoding/decoding to test that input data conforms to a set encoding. I have success for the most part, but when I encode a string using 8859-15 and then decode the output of that, the decoded string does not match the original, which it should.

The specific example is the "Latin Capital Ligature OE" Œ (Alt+0140), which is a valid character in the ISO-8859-15 set. The decoded output is "Å".

My code is as follows:

inexorabletash commented 6 years ago

Are you using the NONSTANDARD_allowLegacyEncoding option?

If not, you should be seeing a warning logged to the console that specifying anything other than utf-8 for the encoding will be ignored.

(Unless, of course, you're running in a browser that natively supports TextEncoder in which case the argument will be ignored entirely.)
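
For readers following along, here is a rough sketch of why the symptom would be "Å" if a native TextEncoder is in play. It assumes a runtime with a global TextEncoder (a modern browser or Node.js). `decodeLatin9` and `LATIN9_OVERRIDES` are hypothetical helpers written for this illustration, not part of the polyfill; they decode by hand so the result does not depend on which TextDecoder is present.

```javascript
// The eight byte values where ISO-8859-15 differs from ISO-8859-1.
const LATIN9_OVERRIDES = new Map([
  [0xa4, 0x20ac], // €
  [0xa6, 0x0160], // Š
  [0xa8, 0x0161], // š
  [0xb4, 0x017d], // Ž
  [0xb8, 0x017e], // ž
  [0xbc, 0x0152], // Œ
  [0xbd, 0x0153], // œ
  [0xbe, 0x0178], // Ÿ
]);

// Hypothetical helper: decode bytes as ISO-8859-15 without using TextDecoder.
function decodeLatin9(bytes) {
  let out = "";
  for (const b of bytes) {
    // Every other byte maps to the code point with the same value.
    out += String.fromCharCode(LATIN9_OVERRIDES.get(b) ?? b);
  }
  return out;
}

// A native TextEncoder ignores its constructor arguments and emits UTF-8.
const utf8Bytes = new TextEncoder("iso-8859-15").encode("\u0152"); // Œ
const hex = Array.from(utf8Bytes, (b) => b.toString(16));
const decoded = decodeLatin9(utf8Bytes);

console.log(hex);                     // the UTF-8 bytes of Œ
console.log(JSON.stringify(decoded)); // what an ISO-8859-15 decoder makes of them
```

With a UTF-8 encoder, Œ becomes the two bytes 0xC5 0x92. Read as ISO-8859-15, 0xC5 is "Å" and 0x92 is an invisible control character, which matches the reported output. A real ISO-8859-15 encoder would produce the single byte 0xBC instead.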

sigoofballde commented 6 years ago

Great catch, I was not. Unfortunately, when I added it, the result was the same. Without it, I also didn't get a console warning about the encoding being ignored.

New code:

    var encodedStr = new TextEncoder(htmlEncoding, { NONSTANDARD_allowLegacyEncoding: true }).encode(mStringToValidate);
    var decodedStr = new TextDecoder(htmlEncoding, { NONSTANDARD_allowLegacyEncoding: true }).decode(encodedStr);

sigoofballde commented 6 years ago

Sorry, I thought it was fixed earlier, but it's not. It doesn't look like adding the NONSTANDARD_allowLegacyEncoding option works.

inexorabletash commented 5 years ago

Are you sure the polyfill is being used? If your environment has a native TextEncoder implementation then you'll see that, not the polyfill.
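
One way to check this from the console is to test behaviour, not identity. This is only a sketch: `encodesLatin9` is a hypothetical helper, and it assumes the polyfill honours the label when NONSTANDARD_allowLegacyEncoding is set, as described earlier in this thread.

```javascript
// Hypothetical helper: true only if the constructor really encodes to ISO-8859-15.
function encodesLatin9(EncoderCtor) {
  const encoder = new EncoderCtor("iso-8859-15", {
    NONSTANDARD_allowLegacyEncoding: true,
  });
  const bytes = encoder.encode("\u0152"); // Œ
  // ISO-8859-15 stores Œ as the single byte 0xBC; UTF-8 needs two bytes (0xC5 0x92).
  return bytes.length === 1 && bytes[0] === 0xbc;
}

const legacyCapable = encodesLatin9(TextEncoder);
console.log(
  legacyCapable
    ? "TextEncoder honours the label (probably the polyfill)"
    : "TextEncoder emits UTF-8 only (probably native)"
);
```

If this reports UTF-8 only, the global TextEncoder is not the polyfill's, and the encode step in the round trip is producing UTF-8 no matter what label is passed.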

inexorabletash / text-encoding