nodejs / nan

Native Abstractions for Node.js
MIT License

Nan::Utf8String encoding problem #927

Closed by gaetandezeiraud 2 years ago

gaetandezeiraud commented 2 years ago

I currently have an encoding problem in an open source project (see https://github.com/Brouilles/bsdiff-node/issues/21). It affects Chinese characters as well as "é", "ë" and similar.

After getting the argument as a Nan::Utf8String and converting it to a std::string, logging the std::string (for example, the oldFile variable) shows C:\Users\Gaëtan Dezeiraud as C:\Users\Ga├½tan Dezeiraud, and I don't understand why. This results in an invalid path for open/fopen.

In JS, I define the path with const oldFile = path.join(__dirname, 'resources/react-0.3-stable.zip'); If I log the variable in the JS code, there is no encoding problem.

The code is here, in case you have an idea: https://github.com/Brouilles/bsdiff-node/blob/ebed4bec871d140576f96c8e975e3a9615f53f36/src/Main.cpp#L24

kkoopa commented 2 years ago

The referenced issue specifically mentions Windows. I would guess that you probably have to convert the path to utf-16 and use the appropriate wide variant to open the file referenced by the path. In the end, it all comes down to the OS locale.
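The suggestion above could be sketched roughly as follows. This is a minimal sketch, not code from the project: OpenUtf8Path and Utf8ToWide are hypothetical helper names, and it assumes the path arrives as UTF-8 (which is what Nan::Utf8String yields). On Windows it converts to UTF-16 with MultiByteToWideChar and opens the file with _wfopen; elsewhere it falls back to plain fopen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <windows.h>

// Hypothetical helper: convert a UTF-8 string to UTF-16 via the Win32 API.
// Returns an empty string if the input is empty or is not valid UTF-8.
static std::wstring Utf8ToWide(const std::string& utf8) {
  if (utf8.empty()) return std::wstring();
  const int size = static_cast<int>(utf8.size());
  // First call: ask how many UTF-16 code units are needed.
  const int len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                      utf8.data(), size, nullptr, 0);
  if (len <= 0) return std::wstring();
  std::wstring wide(static_cast<std::size_t>(len), L'\0');
  // Second call: perform the conversion into the allocated buffer.
  MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                      utf8.data(), size, &wide[0], len);
  return wide;
}
#endif

// Hypothetical helper: open a file whose path is encoded as UTF-8.
// Returns nullptr on failure, like fopen.
std::FILE* OpenUtf8Path(const std::string& path, const char* mode) {
  assert(mode != nullptr);
#ifdef _WIN32
  // Use the wide-character variant so the path is not reinterpreted
  // through the process's ANSI code page.
  const std::wstring wpath = Utf8ToWide(path);
  const std::wstring wmode = Utf8ToWide(mode);
  if (wpath.empty() || wmode.empty()) return nullptr;
  return _wfopen(wpath.c_str(), wmode.c_str());
#else
  // On the non-Windows branch the narrow path is passed through unchanged.
  return std::fopen(path.c_str(), mode);
#endif
}
```

With something like this, a call such as OpenUtf8Path(oldFile, "rb") would take the place of a direct fopen on the std::string obtained from Nan::Utf8String.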

gaetandezeiraud commented 2 years ago

Thanks for the answer. I just noticed that I get the same problem when simply printing a Unicode character, so it doesn't seem to be related to Nan::Utf8String, but rather to something on the node-gyp side. I tried adding UNICODE and _UNICODE in binding.gyp, but without success.
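One way to check whether the std::string itself is intact, and only its display is off, is to dump its bytes instead of printing it as text. A minimal sketch, with HexDump as a hypothetical helper name: if "ë" shows up as the two bytes c3 ab, the string holds valid UTF-8, and "├½" is simply how those two bytes render in an OEM code page such as CP437 or CP850.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical diagnostic helper: render each byte of s as two lowercase
// hex digits, separated by single spaces.
std::string HexDump(const std::string& s) {
  std::string out;
  char buf[4];
  for (std::size_t i = 0; i < s.size(); ++i) {
    // Go through unsigned char so bytes >= 0x80 are not sign-extended.
    const unsigned value = static_cast<unsigned char>(s[i]);
    std::snprintf(buf, sizeof(buf), "%02x", value);
    if (i != 0) out += ' ';
    out += buf;
  }
  return out;
}
```

For the UTF-8 string "Gaëtan" this yields 47 61 c3 ab 74 61 6e.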