[EXPIRED] Cross-cultural compatibility from the box

Dear Isaac, console.log(base32.decode(base32.encode("love"))); works as expected: it returns love. Hooray.

But let's step outside the Anglo-Saxon world and type something in Cyrillic. For example, the Russian equivalent of the word love is любовь, yet console.log(base32.decode(base32.encode("любовь"))); returns a garbled piece of text: ?OªµÏ \Õ?Uk¹²L.
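A likely cause, sketched below under the assumption that the encoder treats every character of its input as a single byte (I have not checked the library's internals): the UTF-16 code units of ASCII text all fit in one byte, while those of Cyrillic text do not. The `codes` helper is made up for this illustration.

```javascript
const ascii = "love";
const cyrillic = "любовь";

// Hypothetical helper: list the UTF-16 code unit of every character.
const codes = (s) => Array.from(s, (ch) => ch.charCodeAt(0));

console.log(codes(ascii));    // [ 108, 111, 118, 101 ]
console.log(codes(cyrillic)); // [ 1083, 1102, 1073, 1086, 1074, 1100 ]

// A byte holds values 0..255 only.
const fitsInBytes = (s) => codes(s).every((c) => c <= 255);

console.log(fitsInBytes(ascii));    // true
console.log(fitsInBytes(cyrillic)); // false
```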

There are various workarounds for this issue. I like the following two (the second one uses the UTF8 helper defined below):

console.log(decodeURIComponent(escape(base32.decode(base32.encode(unescape(encodeURIComponent("любовь"))))))); // returns любовь
console.log(UTF8.decode(base32.decode(base32.encode(UTF8.encode("любовь"))))); // returns любовь

var UTF8 = {
    encode: function(string) {
        string = string.replace(/\r\n/g, "\n");
        var utftext = "";
        for (var n = 0; n < string.length; n++) {
            var c = string.charCodeAt(n);
            if (c < 128) {
                utftext += String.fromCharCode(c);
            } else if ((c > 127) && (c < 2048)) {
                utftext += String.fromCharCode((c >> 6) | 192);
                utftext += String.fromCharCode((c & 63) | 128);
            } else {
                utftext += String.fromCharCode((c >> 12) | 224);
                utftext += String.fromCharCode(((c >> 6) & 63) | 128);
                utftext += String.fromCharCode((c & 63) | 128);
            }
        }
        return utftext;
    },
    decode: function(utftext) {
        var string = "",
            i = 0,
            c = 0,
            c2 = 0,
            c3 = 0; // declared here so it does not leak as an implicit global
        while (i < utftext.length) {
            c = utftext.charCodeAt(i);
            if (c < 128) {
                string += String.fromCharCode(c);
                i++;
            } else if ((c > 191) && (c < 224)) {
                c2 = utftext.charCodeAt(i + 1);
                string += String.fromCharCode(((c & 31) << 6) | (c2 & 63));
                i += 2;
            } else {
                c2 = utftext.charCodeAt(i + 1);
                c3 = utftext.charCodeAt(i + 2);
                string += String.fromCharCode(((c & 15) << 12) | ((c2 & 63) << 6) | (c3 & 63));
                i += 3;
            }
        }
        return string;
    }
};

Also, there's the Encoding API.

const txtencoder = new TextEncoder();
const message = "любовь";
txtencoder.encode(message); // returns a Uint8Array of UTF-8 bytes
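For completeness, here is a minimal sketch of how the Encoding API could stand in for the UTF8 helper. It assumes, as in the calls above, that base32.encode and base32.decode work on strings in which each character stands for one byte. The names toBinaryString and fromBinaryString are mine, not part of the library.

```javascript
// Hypothetical helpers (not part of base32-js): convert between a JavaScript
// string and a "binary string" in which every character is one UTF-8 byte.
function toBinaryString(text) {
    const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
    let binary = "";
    for (const byte of bytes) {
        binary += String.fromCharCode(byte);
    }
    return binary;
}

function fromBinaryString(binary) {
    const bytes = new Uint8Array(binary.length);
    for (let i = 0; i < binary.length; i++) {
        bytes[i] = binary.charCodeAt(i); // each char code is expected to be 0..255
    }
    return new TextDecoder().decode(bytes); // TextDecoder defaults to UTF-8
}

const text = "любовь";
const binary = toBinaryString(text);

console.log(binary.length);                     // 12: six letters, two bytes each
console.log(fromBinaryString(binary) === text); // true

// Assumed usage with the library, mirroring the workarounds above:
// fromBinaryString(base32.decode(base32.encode(toBinaryString(text))));
```

Unlike the hand-written UTF8 helper, TextEncoder and TextDecoder also handle characters outside the Basic Multilingual Plane, such as emoji, which take four bytes in UTF-8.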

Could you be so kind as to make it work out of the box, without those additional helpers, so that users around the globe can enjoy your library without any hassle? The developer of a Base91 implementation has already done it.

agnoster / base32-js
