Open yhojann-cl opened 4 years ago
I debugged the source code of CryptoJS.enc.Utf8 in code.js:
var Utf8 = C_enc.Utf8 = {
stringify: function (wordArray) {
try {
return decodeURIComponent(escape(Latin1.stringify(wordArray)));
OK, let's trace this using the debugger tab in the Firefox developer tools:
stringify: function (wordArray) {
...
return latin1Chars.join('');
The value of latin1Chars is \u0080. That part works fine; the problem is the conversion to UTF-8 using decodeURIComponent:
decodeURIComponent('%7F')
// "\u007f"
decodeURIComponent('%80')
// URIError: malformed URI sequence
encodeURIComponent('\x80')
// "%C2%80"
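To make the failure concrete: escape() turns each Latin-1 character into a %XX sequence, and decodeURIComponent() then reads those sequences as UTF-8 bytes, so a lone %80 (a continuation byte with no lead byte) is rejected. Below is a minimal sketch of a decoder that falls back to the raw Latin-1 text when the bytes are not valid UTF-8. The name utf8OrLatin1 is hypothetical and not part of CryptoJS, and whether such a fallback is desirable is an assumption, since it hides invalid input.

```javascript
// Hypothetical helper, not part of CryptoJS: decode a "binary string"
// (one character per byte) as UTF-8, and fall back to the raw Latin-1
// text when the bytes do not form valid UTF-8.
function utf8OrLatin1(binaryString) {
    try {
        // escape() is deprecated, but it is what the quoted library code uses.
        return decodeURIComponent(escape(binaryString));
    } catch (e) {
        if (e instanceof URIError) {
            return binaryString; // not valid UTF-8: keep the 1:1 byte mapping
        }
        throw e;
    }
}

console.log(escape('\x80'));                        // "%80"
console.log(utf8OrLatin1('\x7f') === '\x7f');       // true: plain ASCII
console.log(utf8OrLatin1('\xc2\x80') === '\u0080'); // true: valid two-byte sequence
console.log(utf8OrLatin1('\x80') === '\x80');       // true: fell back to Latin-1
```

Note that the fallback is ambiguous by design: the caller can no longer tell whether the result came from UTF-8 decoding or from the byte-for-byte mapping.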
Could you write a function that returns the plain text without going through decodeURIComponent(escape())?
I worked around this using this.CryptoJS.enc.Latin1, but Latin1 is not Unicode: it maps each character to exactly one byte. You could still add an equivalent alias to the library, such as var Unicode = C_enc.Unicode = C_enc.Latin1;. However, Latin1 cannot, for example, encode and decode the character with decimal code 300, Ĭ (0xc4ac as UTF-8 bytes, or 0x012c as a code point):
escape(String.fromCharCode(300));
// "%u012c"
CryptoJS.enc.Hex.stringify(CryptoJS.enc.Latin1.parse('Ĭ'));
// 2c
CryptoJS.enc.Hex.stringify(CryptoJS.enc.Latin1.parse(String.fromCharCode(300)));
// 2c
console.log(this.CryptoJS.enc.Hex.parse('2c').toString(this.CryptoJS.enc.Latin1).charCodeAt());
// 44 (,)
console.log('\u012c');
// Ĭ
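The 2c result above is what you get when only the low 8 bits of each character code are kept. Here is a minimal sketch of that behaviour, independent of CryptoJS. The function name latin1Bytes is hypothetical, and the masking is an assumption inferred from the outputs above.

```javascript
// Sketch of a Latin-1 style encoder: one byte per character, keeping only
// the low 8 bits of each UTF-16 code unit (assumed behaviour, inferred
// from the outputs shown above).
function latin1Bytes(str) {
    const bytes = [];
    for (let i = 0; i < str.length; i++) {
        bytes.push(str.charCodeAt(i) & 0xff);
    }
    return bytes;
}

const bytes = latin1Bytes(String.fromCharCode(300)); // 300 = 0x012c
console.log(bytes.map((b) => b.toString(16)));       // [ '2c' ]
console.log(String.fromCharCode(bytes[0]));          // ","
```

The high byte (0x01) is dropped, so the original character cannot be recovered from the encoded bytes.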
UTF-8 supports values from 00 to FF as Unicode values, but CryptoJS.enc.Utf8 supports only valid ASCII characters from 00 to 7F.
FF is 255.
The first 128 characters of Unicode, which correspond one-to-one with ASCII, are encoded using a single byte with the same binary value as ASCII, so that valid ASCII text is valid UTF-8-encoded Unicode as well.
https://en.wikipedia.org/wiki/UTF-8
The Unicode character with code 128 is 2 bytes in UTF-8: 194, 128. A lone 128 byte is not valid UTF-8.
new TextEncoder("utf8").encode(String.fromCharCode(128)) // Uint8Array(2) [194, 128]
let a = CryptoJS.enc.Hex.parse("0080")
console.log(a); // WordArray { words: [ 8388608 ], sigBytes: 2 }
a = CryptoJS.enc.Utf16.stringify(a)
console.log(a); //
console.log(a.charCodeAt(0)); // 128
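To show where the bytes 194, 128 come from, here is a minimal sketch of the UTF-8 bit layout written by hand. The function name utf8Bytes is hypothetical, and the sketch assumes a valid code point as input (it does not reject surrogates or values above 0x10FFFF).

```javascript
// Hypothetical helper: encode a single Unicode code point as UTF-8 bytes.
// Assumes the input is a valid code point; surrogates are not checked.
function utf8Bytes(codePoint) {
    if (codePoint < 0x80) {
        return [codePoint]; // 0xxxxxxx: same as ASCII
    }
    if (codePoint < 0x800) {
        return [0xc0 | (codePoint >> 6), 0x80 | (codePoint & 0x3f)];
    }
    if (codePoint < 0x10000) {
        return [
            0xe0 | (codePoint >> 12),
            0x80 | ((codePoint >> 6) & 0x3f),
            0x80 | (codePoint & 0x3f),
        ];
    }
    return [
        0xf0 | (codePoint >> 18),
        0x80 | ((codePoint >> 12) & 0x3f),
        0x80 | ((codePoint >> 6) & 0x3f),
        0x80 | (codePoint & 0x3f),
    ];
}

console.log(utf8Bytes(127)); // [ 127 ]
console.log(utf8Bytes(128)); // [ 194, 128 ]
console.log(utf8Bytes(300)); // [ 196, 172 ], i.e. 0xc4 0xac
```

Every code point from 128 upward needs at least two bytes, which is why a single 0x80 byte cannot be decoded as UTF-8.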
For example, from a hex representation to a Unicode string:
But in native JavaScript code I can represent \x80 in Unicode string format.
Please add a Unicode option for decoding plain strings, like: