ashtuchkin / iconv-lite

Convert character encodings in pure javascript.
MIT License

Encoding inaccurate #269

Open underdofg opened 2 years ago

underdofg commented 2 years ago

So I've encoded this string

"%0E%04%0E%38%0E%13%0E%04%0E%37%0E%2D%0E%25%0E%39%0E%01%0E%04%0E%49%0E%32%00%20%00%22%00%41%00%49%00%53%00%20%0E%40%0E%0B%0E%40%0E%23%0E%40%0E%19%0E%14%00%20%00%47%00%6F%00%6C%00%64%00%22%00%20%0E%2B%0E%21%0E%32%0E%22%0E%40%0E%25%0E%02%00%20%00%30%00%38%00%39%00%31%00%31%00%39%00%31%00%39%00%39%00%31"

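[image: https://user-images.githubusercontent.com/38998878/126289015-6a883288-fe1a-43be-a116-fefe03c2ee5c.png]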

This is what I should get from the decode function.

But this is what I got, and the source of the decode function has changed too.

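[image: https://user-images.githubusercontent.com/38998878/126289275-8b630dee-6156-4cc3-906a-9b76b7cfdbb5.png]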

Example code :

const decodeBody = iconv.decode(utf16String, 'utf16')

ashtuchkin commented 2 years ago

You need to decode the URL-encoding first: iconv.decode(decodeURIComponent(utf16String), 'utf16')


ashtuchkin commented 2 years ago

Even better: use this code to convert the string to a buffer (https://stackoverflow.com/a/57876338/325300), then call iconv.decode on the result. decodeURIComponent might not always work if the encoded string is not UTF-8.
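
For readers following along, here is a minimal sketch of that approach. It is not the code from the linked answer: `percentDecodeToBuffer` is a hypothetical helper written for illustration, and the sample is a shortened prefix-style excerpt of the bytes from the issue. To keep the sketch free of dependencies it decodes big-endian UTF-16 with Node's built-in Buffer (byte swap, then 'utf16le'); with iconv-lite you would pass the buffer to iconv.decode instead, and the explicit 'utf16-be' encoding name is an assumption based on the byte order seen in the sample.

```javascript
// Hypothetical helper (not part of iconv-lite): turn a percent-encoded
// string into the raw bytes it represents, without assuming UTF-8.
function percentDecodeToBuffer(encoded) {
  const bytes = [];
  for (let i = 0; i < encoded.length; i++) {
    const hex = encoded.slice(i + 1, i + 3);
    if (encoded[i] === '%' && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 2; // skip the two hex digits just consumed
    } else {
      // Unescaped characters are assumed to be single-byte (ASCII) here.
      bytes.push(encoded.charCodeAt(i) & 0xff);
    }
  }
  return Buffer.from(bytes);
}

// Shortened sample in the same style as the string from the issue.
const sample = '%0E%04%0E%38%0E%13%00%20%00%41%00%49%00%53';
const buf = percentDecodeToBuffer(sample);

// With iconv-lite, the next step would be something like:
//   const text = iconv.decode(buf, 'utf16-be');
// Node's Buffer only decodes UTF-16LE natively, so swap each byte pair
// on a copy of the buffer and decode that.
const text = Buffer.from(buf).swap16().toString('utf16le');
console.log(text);

// Why decodeURIComponent is fragile: it treats the escaped bytes as UTF-8
// and throws when they are not a valid UTF-8 sequence.
let threw = false;
try {
  decodeURIComponent('%D8%3D'); // 0xD8 must be followed by a continuation byte
} catch (err) {
  threw = err instanceof URIError;
}
console.log('decodeURIComponent threw URIError:', threw);
```

The string in this issue happens to contain only bytes below 0x80, so decodeURIComponent does not throw on it, but going through a buffer avoids relying on that.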


underdofg commented 2 years ago

Thank you, my problem is solved.
