inexorabletash / text-encoding

Polyfill for the Encoding Living Standard's API

Encoding then Decoding to ISO-8859-15 error #77

Open sigoofballde opened 6 years ago

sigoofballde commented 6 years ago

I am using the encoder and decoder to test that input data conforms to a set encoding. It works for the most part, but when I encode a string as ISO-8859-15 and then decode the output, the decoded string does not match the original, which it should.

The specific example is the "Latin Capital Ligature OE" Œ (alt +0140), which is a valid character in the 8859-15 set. The decoded output is "Œ".

My code is as follows:

    function validateInputText(mStringToValidate, unicode) {
        try {
            var htmlEncoding = appx_session.getProp("rawEncoding");
            if (unicode) {
                htmlEncoding = "utf-8";
            }
            var encodedStr = new TextEncoder(htmlEncoding).encode(mStringToValidate);
            var decodedStr = new TextDecoder(htmlEncoding).decode(encodedStr);
            if (decodedStr !== mStringToValidate) {
                alert("Text not valid for " + htmlEncoding + " encoding. Proceeding could result in data loss.");
            }
        } catch (e) {
            console.log(e);
            console.log(e.stack);
        }
    }

inexorabletash commented 6 years ago

Are you using the NONSTANDARD_allowLegacyEncoding option?

If not, you should be seeing a warning logged to the console that specifying anything other than utf-8 for the encoding will be ignored.

(Unless, of course, you're running in a browser that natively supports TextEncoder in which case the argument will be ignored entirely.)
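To make the native case concrete, here is a minimal sketch that assumes the global TextEncoder is a native implementation (a modern browser or Node.js) rather than the polyfill. The `toHex` helper is hypothetical and only there for display.

```javascript
// Sketch: assumes the global TextEncoder is native, not the polyfill.

// Hypothetical helper: format a byte sequence as space-separated hex.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");
}

// A native TextEncoder takes no arguments, so the label passed here is ignored
// and the output is always UTF-8.
const nativeBytes = new TextEncoder("iso-8859-15").encode("\u0152"); // "Œ"

// UTF-8 encodes U+0152 as the two bytes c5 92.
// ISO-8859-15 would encode it as the single byte bc.
console.log(toHex(nativeBytes)); // "c5 92" with a native encoder
```

If the decoder does honor the "iso-8859-15" label, it reads those two bytes as two separate characters, so the round-trip comparison cannot succeed for any non-ASCII input.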

sigoofballde commented 6 years ago

Great catch, I was not. Unfortunately, when I added it the result was the same, and without it I didn't get a console warning about the encoding being ignored.

New code:

    var encodedStr = new TextEncoder(htmlEncoding, { NONSTANDARD_allowLegacyEncoding: true }).encode(mStringToValidate);
    var decodedStr = new TextDecoder(htmlEncoding, { NONSTANDARD_allowLegacyEncoding: true }).decode(encodedStr);

sigoofballde commented 6 years ago

Sorry, I thought it was fixed earlier, but it's not. It doesn't look like adding the NONSTANDARD_allowLegacyEncoding option works.

inexorabletash commented 5 years ago

Are you sure the polyfill is being used? If your environment has a native TextEncoder implementation then you'll see that, not the polyfill.
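One rough way to check which implementation is in scope is to look at the `encoding` property of a freshly constructed encoder. The sketch below uses a hypothetical helper, `encoderHonorsLabel`, which is not part of the polyfill. It assumes that the polyfill, when given NONSTANDARD_allowLegacyEncoding, reports the lowercased name of the requested encoding, whereas a native TextEncoder always reports "utf-8".

```javascript
// Hypothetical helper (not part of the polyfill): reports whether the
// TextEncoder currently in scope honors a legacy encoding label.
// Pass the canonical encoding name (e.g. "iso-8859-15"), not an alias,
// because the comparison is a plain string match.
function encoderHonorsLabel(label) {
  const encoder = new TextEncoder(label, { NONSTANDARD_allowLegacyEncoding: true });
  // Native implementations ignore both arguments and always report "utf-8".
  return encoder.encoding === label.toLowerCase();
}

if (!encoderHonorsLabel("iso-8859-15")) {
  console.warn("TextEncoder ignored the label; the polyfill is probably not in use.");
}
```

If the warning fires, the round-trip check in validateInputText is encoding as UTF-8 no matter what htmlEncoding says.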