Currently, Encoding::JS.unescape raises an EncodingError when it tries to parse a Unicode surrogate pair, which often occurs in JavaScript strings containing emoji characters. The StringScanner algorithm should be adjusted to identify when the first escaped UTF-16 code unit is a high surrogate (starts with \uD8.., \uD9.., \uDA.., or \uDB..) and the second is a low surrogate (starts with \uDC.., \uDD.., \uDE.., or \uDF..), and to combine the two into a single code point.
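To illustrate the adjustment, here is a minimal sketch of surrogate-pair handling with StringScanner. It is not the library's actual code: the method name `unescape_js_unicode` is hypothetical, and it only handles `\uXXXX` escapes, ignoring every other JavaScript escape sequence.

```ruby
require 'strscan'

# Hypothetical helper for illustration only; not the real Encoding::JS.unescape.
# Decodes \uXXXX escapes and combines high/low surrogate pairs.
def unescape_js_unicode(string)
  scanner = StringScanner.new(string)
  result  = String.new(encoding: Encoding::UTF_8)

  until scanner.eos?
    # High surrogate (D800..DBFF) immediately followed by low surrogate (DC00..DFFF).
    if scanner.scan(/\\u([dD][89abAB]\h{2})\\u([dD][c-fC-F]\h{2})/)
      high = scanner[1].hex
      low  = scanner[2].hex

      # Standard UTF-16 decoding of a surrogate pair into one code point.
      codepoint = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
      result << [codepoint].pack('U')
    elsif scanner.scan(/\\u(\h{4})/)
      # Ordinary BMP escape.
      result << [scanner[1].hex].pack('U')
    else
      # Any other character is copied through unchanged.
      result << scanner.getch
    end
  end

  result
end
```

With this sketch, `unescape_js_unicode('\uD83D\uDE80')` yields the single code point U+1F680. Trying the pair pattern before the single-escape pattern is the key design choice: otherwise each half would be decoded on its own as an invalid lone surrogate.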
Example
"\uD83D\uDE80" aka '🚀'
Example Solution
https://ruby.social/@nick_evans/112776837324476279