nlohmann / json

JSON for Modern C++
https://json.nlohmann.me
MIT License

Parsing the unicode string got the wrong result #4272

Open pigLoveRabbit520 opened 5 months ago

pigLoveRabbit520 commented 5 months ago

Description

Parsing the Unicode escape "\u7ec4" should yield "组", but I got "缁".

Reproduction steps

Copy the latest json.hpp into a Visual Studio C++ project and compile the code below.

Expected vs. actual results

Should get "组" but got "缁".

Minimal code example

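#include <fstream>
#include <iostream>
#include <string>
#include "json.hpp"

using json = nlohmann::json;

int main()
{
    // Parse data.json from the working directory.
    std::ifstream f("data.json");
    json data = json::parse(f);

    // The library stores strings as UTF-8; how they are displayed depends on the console.
    std::string name = data.at("name");
    std::cout << name << std::endl;
    return 0;
}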

JSON data (data.json)

{
  "name" : "\u7ec4"
}

Error messages

no error

Compiler and operating system

vs 2022 and windows 11

Library version

version 3.11.3

nlohmann commented 5 months ago

I cannot reproduce this.

pigLoveRabbit520 commented 5 months ago

On Linux I also get the right answer, but I am using Windows and Visual Studio, and on Windows I get the wrong answer.

nlohmann commented 5 months ago

Can you show the output of data.dump(-1, ' ', true)? This should show \u7ec4, but I want to be sure.
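The idea behind this request is to look at the parsed value without depending on how the console renders non-ASCII text. A library-independent way to run a similar check is to print the raw bytes of the string. The following is only a sketch; `to_hex` is a hypothetical helper, not part of nlohmann/json.

```cpp
#include <string>

// Hypothetical helper (not part of nlohmann/json): render every byte of a
// string as two lowercase hex digits, separated by spaces.
std::string to_hex(const std::string& s)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned char byte = static_cast<unsigned char>(s[i]);
        if (i != 0) {
            out += ' ';
        }
        out += digits[byte >> 4];
        out += digits[byte & 0x0F];
    }
    return out;
}
```

In the example program, `std::cout << to_hex(name)` printing `e7 bb 84` would mean the string holds the UTF-8 encoding of U+7EC4 and only the display step is at fault.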

syoyo commented 5 months ago

@pigLoveRabbit520 Confirmed: it's an issue with your console code page. You are using cp936 (Simplified Chinese), which prints 组 as 缁. With chcp 65001 (UTF-8), 组 should print as 组. So it's not an issue with nlohmann json.
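To make the byte-level picture concrete: U+7EC4 is three bytes in UTF-8, E7 BB 84. A cp936 console reads bytes in pairs, so it takes E7 BB as one double-byte character (displayed as 缁) and is left with a dangling 84. The sketch below computes those bytes; `encode_utf8` is a hypothetical helper written for illustration and does not reject surrogates or out-of-range values.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: encode a single Unicode code point as UTF-8.
// Sketch only: no validation of surrogates or values above U+10FFFF.
std::string encode_utf8(std::uint32_t cp)
{
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // U+7EC4 takes this branch and becomes E7 BB 84.
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Calling `encode_utf8(0x7EC4)` gives the same three bytes a correct parse of "\u7ec4" stores in the string, so the mismatch appears only when the console decodes them with a different code page.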

pigLoveRabbit520 commented 5 months ago

@syoyo I have changed the code page to 65001, but I still get the wrong result.
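One thing that may be worth ruling out here: chcp only changes the console session it is run in, and a program launched from Visual Studio may open a new console that uses the system default code page. A minimal sketch that sets the output code page from inside the program, assuming the Windows API call `SetConsoleOutputCP`; the wrapper name `use_utf8_console_output` is hypothetical, and on other platforms it does nothing.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical wrapper: ask the attached Windows console to interpret program
// output as UTF-8 (code page 65001). Returns true on success.
// On non-Windows platforms this is a no-op that reports success.
bool use_utf8_console_output()
{
#ifdef _WIN32
    return SetConsoleOutputCP(CP_UTF8) != 0;
#else
    return true;
#endif
}
```

Calling it at the top of `main`, before anything is printed, takes the console session's code page out of the equation. The console font still needs a glyph for 组, so this alone may not be enough.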

nlohmann commented 5 months ago

Please check my previous comment and post the output of dump.

nlohmann commented 5 months ago

(And since I'm familiar with neither Chinese nor Windows: could there be an issue with the font?)