encodeURIComponent throws an exception

When the user enters invalid Unicode characters (such as the unpaired surrogate U+DFFF), encodeURIComponent throws an exception (a URIError) with a message such as the following:

string contained an illegal UTF-16 sequence

A programmatic search for the failing inputs turned up only one problematic range: \ud800-\udfff, the range of high and low surrogates.
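The search described above can be sketched roughly as follows. This is only an illustration of the idea, not necessarily the exact test that was run; the helper name findFailingCodeUnits is made up for this example. It encodes every UTF-16 code unit on its own and records the ones that throw.

```javascript
// Hypothetical helper: collect every UTF-16 code unit that makes
// encodeURIComponent throw when it is encoded on its own.
function findFailingCodeUnits() {
    const failing = [];
    for (let code = 0; code <= 0xFFFF; code++) {
        try {
            encodeURIComponent(String.fromCharCode(code));
        } catch (e) {
            // A lone surrogate cannot be converted to UTF-8, so encoding fails.
            failing.push(code);
        }
    }
    return failing;
}

const failing = findFailingCodeUnits();
const first = failing[0].toString(16);
const last = failing[failing.length - 1].toString(16);
console.log(first, last, failing.length);
```

Note that this only tests code units in isolation. A high surrogate immediately followed by a low surrogate forms a valid pair and encodes without error.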

So, if you want to take the easy route and block surrogates, it is just a matter of:

urlPart = urlPart.replace(/[\ud800-\udfff]/g, '');
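As a quick illustration of what this blanket replacement does, here is a small sketch (the name stripAllSurrogates and the sample strings are assumptions made for the example). It shows that the lone surrogate is removed, but so is a valid surrogate pair.

```javascript
// Hypothetical wrapper around the one-line replacement above.
const stripAllSurrogates = (s) => s.replace(/[\ud800-\udfff]/g, '');

// A lone low surrogate is removed, so encoding no longer throws.
const lone = stripAllSurrogates('a\udfffb');

// A valid pair (U+1F600, written as \ud83d\ude00) is removed as well.
const emoji = stripAllSurrogates('a\ud83d\ude00b');

console.log(encodeURIComponent(lone), encodeURIComponent(emoji));
```

Both results come out as "ab", which is the trade-off of the easy route: every character outside the Basic Multilingual Plane is dropped along with the invalid input.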

If you want to strip out only unmatched (invalid) surrogates while keeping surrogate pairs (which are legitimate sequences, used for emoji and other characters outside the Basic Multilingual Plane), you can do the following:

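function stripUnmatchedSurrogates (str) {
    // Drop high surrogates that are not followed by a low surrogate.
    str = str.replace(/[\uD800-\uDBFF](?![\uDC00-\uDFFF])/g, '');
    // Reverse so "not preceded by a high surrogate" becomes a lookahead, drop lone low surrogates, then restore the order.
    return str.split('').reverse().join('').replace(/[\uDC00-\uDFFF](?![\uD800-\uDBFF])/g, '').split('').reverse().join('');
}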
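The double reversal in the function above stands in for a lookbehind. In engines that support regular expression lookbehind assertions (an assumption about your target environment), the same idea can be sketched in a single pass. The name stripUnmatchedSurrogatesLookbehind is made up for this example and is not part of the original answer.

```javascript
// Hypothetical single-pass variant; requires lookbehind support in the regex engine.
function stripUnmatchedSurrogatesLookbehind(str) {
    // Alternative 1: a high surrogate not followed by a low surrogate.
    // Alternative 2: a low surrogate not preceded by a high surrogate.
    return str.replace(
        /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/g,
        ''
    );
}

// The valid pair \ud83d\ude00 (U+1F600) is kept; the trailing lone \udfff is dropped.
const cleaned = stripUnmatchedSurrogatesLookbehind('a\ud83d\ude00\udfffb');
console.log(encodeURIComponent(cleaned)); // a%F0%9F%98%80b
```

The regular expression is written without the u flag on purpose, so that it matches individual UTF-16 code units rather than whole code points.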

reference

jingxinxin / tiankeng

encodeURIComponent throws an exception #15