endojs / endo

Endo is a distributed secure JavaScript sandbox, based on SES
Apache License 2.0

Canonicalization of UTF-16 with TextEncoder/TextDecoder #691

Open kriskowal opened 3 years ago

kriskowal commented 3 years ago

I encountered a strange behavior in syrup, where fuzzing the codec with random strings did not round-trip the data. I started with strings of random length in which every code unit was a random number in the full two-byte range (0x0000 to 0xFFFF). Encoding these strings with TextEncoder, then decoding the resulting bytes with TextDecoder, produced a string different from the original. This might be expected behavior, but it is not behavior I understand.

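It is likely this occurs when the randomly generated string contains unpaired surrogates, that is, a high surrogate (0xD800 to 0xDBFF) not followed by a low surrogate (0xDC00 to 0xDFFF), or a low surrogate not preceded by a high one. Such strings are not well-formed UTF-16 and have no UTF-8 representation.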
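A minimal sketch of the suspected cause, assuming an environment where `TextEncoder` and `TextDecoder` are globals (browsers, recent Node.js). The helpers `roundTrip` and `hasLoneSurrogate` are hypothetical names for illustration and are not part of syrup or Endo. `TextEncoder` replaces each unpaired surrogate with U+FFFD before encoding, so such strings cannot survive the round trip.

```javascript
// Hypothetical helper: encode a string to UTF-8 bytes and decode it back.
function roundTrip(text) {
  const bytes = new TextEncoder().encode(text);
  return new TextDecoder().decode(bytes);
}

// Hypothetical helper: report whether a string contains an unpaired surrogate.
function hasLoneSurrogate(text) {
  for (let i = 0; i < text.length; i += 1) {
    const unit = text.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: must be followed by a low surrogate.
      const next = text.charCodeAt(i + 1); // NaN past the end of the string
      if (next >= 0xdc00 && next <= 0xdfff) {
        i += 1; // skip the low half of a valid pair
      } else {
        return true;
      }
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      // Low surrogate with no preceding high surrogate.
      return true;
    }
  }
  return false;
}

const wellFormed = 'a\u{1F600}b'; // valid surrogate pair
const loneHigh = 'a\uD800b'; // unpaired high surrogate

console.log(roundTrip(wellFormed) === wellFormed); // well-formed input survives
console.log(roundTrip(loneHigh) === loneHigh); // lone surrogate does not
console.log(roundTrip(loneHigh) === 'a\uFFFDb'); // it becomes U+FFFD
```

Under this assumption, a fuzzer could either skip generated strings for which `hasLoneSurrogate` returns true, or compare the decoded result against the input with lone surrogates already replaced by U+FFFD.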

kriskowal commented 3 years ago

Ref #684