Closed themrxcm closed 4 years ago
Both win1251 and utf8 are supported in iconv-lite. To convert binary data in win1251 encoding to a js string, use iconv.decode(buf, "win1251"). To convert a js string to utf8 data, either use iconv.encode(str, "utf8"), or just use the string directly (utf8 is the default encoding and node.js will convert it for you).
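The two steps described above (bytes to js string, then js string to utf8 bytes) can be sketched as follows. This is only an illustration: it uses Node's built-in TextDecoder and Buffer.from so that it runs without installing anything, and the sample bytes are an assumption (the word "Тест" in windows-1251). With iconv-lite, the same steps would be the iconv.decode and iconv.encode calls named above.

```javascript
// Assumed sample input: the word "Тест" as windows-1251 bytes (one byte per letter).
const win1251Bytes = Buffer.from([0xd2, 0xe5, 0xf1, 0xf2]);

// Step 1: bytes -> js string. TextDecoder is a Node.js global; the
// 'windows-1251' label needs a Node build with full ICU (the default
// for official builds).
const str = new TextDecoder('windows-1251').decode(win1251Bytes);

// Step 2: js string -> utf8 bytes.
const utf8Bytes = Buffer.from(str, 'utf8');

console.log(str);
console.log(utf8Bytes.length); // each Cyrillic letter takes 2 bytes in utf8
```

Note that the string itself has no encoding attached; only the byte buffers before and after it do.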
On Tue, May 19, 2020, 07:53 Aleshkovskiy notifications@github.com wrote:
Hi! Tell me, please, is it possible to translate from windows-1251 to utf-8 such JSON TEST": { "CODE": 1, "NAME": "����", }, The Atom text editor uses your package and they can do it, but I couldn't find an implementation in their code. Thanks! Sorry for my bad English.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/ashtuchkin/iconv-lite/issues/233, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEZKHPNDJULY6EE2BYX3UDRSJXJ3ANCNFSM4NE5ZZGQ .
My example:
let str = iconv.decode(buff, "windows-1251");
let utf8 = iconv.encode(str, "utf8");
console.log(utf8.toString())
In console:
"TEST": { "CODE": 1, "NAME": "пїЅпїЅпїЅпїЅ", }
It should be:
"TEST": { "CODE": 1, "NAME": "Тест", // Russian language }
Also note that you likely need to decode data before parsing json. If you have already parsed it, it's too late to do decoding.
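A small sketch of why the order matters, assuming a hypothetical input buffer that holds the JSON text {"NAME":"Тест"} in windows-1251. It uses Node's built-in TextDecoder for the decoding step so it can run on its own; iconv.decode(buf, "win1251") would play the same role.

```javascript
// Hypothetical input: {"NAME":"Тест"} stored as windows-1251 bytes.
const raw = Buffer.concat([
  Buffer.from('{"NAME":"', 'ascii'),
  Buffer.from([0xd2, 0xe5, 0xf1, 0xf2]), // "Тест" in windows-1251
  Buffer.from('"}', 'ascii'),
]);

// Wrong order: reading the bytes as utf8 first turns the invalid bytes
// into U+FFFD replacement characters, and the original letters are lost.
const parsedTooEarly = JSON.parse(raw.toString('utf8'));

// Right order: decode with the real encoding, then parse.
const parsed = JSON.parse(new TextDecoder('windows-1251').decode(raw));

console.log(parsedTooEarly.NAME); // replacement characters
console.log(parsed.NAME);
```

Once the replacement characters are in the parsed string, no later call to a decoder can bring the original text back.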
On Tue, May 19, 2020, 08:24 Alexander Shtuchkin ashtuchkin@gmail.com wrote:
Both win1251 and utf8 are supported in iconv-lite. To convert binary data in win1251 encoding to a js string, use iconv.decode(buf, "win1251"). To convert a js string to utf8 data, either use iconv.encode(str, "utf8"), or just use the string directly (utf8 is the default encoding and node.js will convert it for you).
In your example you're going through utf8 twice: once when encoding with iconv-lite, and a second time when you call toString() on the resulting buffer. Try console.log(str)?
Yes, you are right! I may need to specify the encoding when using fs.readFile(). But none of the available options suits my case ):
I will keep looking for a solution. The problem occurs at an earlier step; your package is working correctly.
console.log(str)
"TEST": { "CODE": 1, "NAME": "пїЅпїЅпїЅпїЅ", }
Once I find a solution, I will post a comment.
Thanks!
Sorry, this was my own oversight: I was converting a corrupted file that had already been saved in the wrong encoding. If you need an example, here it is:
const autoenc = require('node-autodetect-utf8-cp1251-cp866');
const fs = require('fs');
const iconv = require('iconv-lite');

convert('./test.json');

// Detect the source encoding, decode to a js string, and write the result as utf8.
function convert(path) {
  // Without an encoding argument, readFileSync returns the raw bytes as a Buffer.
  const buff = fs.readFileSync(path);
  const encoding = autoenc.detectEncoding(buff).encoding;
  // Fall back to utf8 if no encoding was detected.
  const str = iconv.decode(buff, encoding || 'utf8');
  const utf8 = iconv.encode(str, 'utf8');
  fs.writeFileSync('./testw.json', utf8);
  return utf8;
}
Your suggestion about re-encoding helped me a lot! Thanks!
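For anyone hitting the same symptom, here is one way to read the "пїЅ" pattern from the console output earlier in the thread: it is what the three utf8 bytes of U+FFFD (the replacement character) look like when decoded as windows-1251. The sketch below shows that, plus a hypothetical helper, looksAlreadyCorrupted, that is not part of iconv-lite or of the code above. It uses Node's built-in TextDecoder so it runs without dependencies.

```javascript
// U+FFFD (the replacement character) is the bytes EF BF BD in utf8.
const replacementUtf8 = Buffer.from('\uFFFD', 'utf8');

// Reading those three bytes as windows-1251 gives three Cyrillic letters.
const misread = new TextDecoder('windows-1251').decode(replacementUtf8);
console.log(misread);

// Hypothetical helper: true if the buffer already contains utf8-encoded
// U+FFFD, which suggests the original bytes were lost before this point.
function looksAlreadyCorrupted(buf) {
  return buf.includes(replacementUtf8);
}
```

If such a check returns true for the file on disk, no choice of decoder will restore the text, and the original file has to be obtained again.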
Awesome, glad you resolved the issue!
Hi! Could you tell me whether it is possible to convert JSON like this from windows-1251 to utf-8?
"TEST": { "CODE": 1, "NAME": "����", },
The Atom text editor uses your package and they can do it, but I couldn't find an implementation in their code. Thanks! Sorry for my bad English.