simdjson / simdjson-java

A Java version of simdjson, a high-performance JSON parser utilizing SIMD instructions
Apache License 2.0

Structure validation: duplicate key #37

Open GeTOUO opened 10 months ago

GeTOUO commented 10 months ago

If a JSON object contains duplicate keys, simdjson-java currently reports a size of 2 and returns the value of the first occurrence of the key, whereas Jackson keeps the last value and Gson rejects the input:

Test code:

import com.fasterxml.jackson.databind.ObjectMapper;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import org.simdjson.JsonValue;
import org.simdjson.SimdJsonParser;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;

public class SimdJsonTest {

    static final SimdJsonParser SIMD_PARSER = new SimdJsonParser();
    static final ObjectMapper JACKSON = new ObjectMapper();
    static final Gson GSON = new GsonBuilder().create();

    public static void main(String[] args) throws IOException {

        // The same key appears twice with different values.
        String json = "{\"num\": 2, \"num\": 123}";
        byte[] bytes = json.getBytes(StandardCharsets.UTF_8);
        JsonValue value = SIMD_PARSER.parse(bytes, bytes.length);

        System.err.println("[simd].size = " + value.getSize());   // 2
        System.err.println("[simd].num = " + value.get("num"));  // 2

        Map<?, ?> jcsObj = JACKSON.readValue(bytes, Map.class);
        System.err.println("[jackson].size = " + jcsObj.size());   // 1
        System.err.println("[jackson].num = " + jcsObj.get("num"));   // 123

        // will throw exception: com.google.gson.JsonSyntaxException: duplicate key: num
        Map<?, ?> gsonObj = GSON.fromJson(json, Map.class);
        System.err.println("[gson].size = " + gsonObj.size());
        System.err.println("[gson].num = " + gsonObj.get("num"));
    }
}

The content of the console output:

[simd].size = 2
[simd].num = 2
[jackson].size = 1
[jackson].num = 123
Exception in thread "main" com.google.gson.JsonSyntaxException: duplicate key: num ...

piotrrzysko commented 9 months ago

Thanks for reporting this. At a minimum, this needs to be documented. I'm not sure, though, how this situation should be handled. The JSON RFC doesn't mandate a behavior; it only says that the names within an object SHOULD be unique: https://datatracker.ietf.org/doc/html/rfc7159#section-4.
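
For discussion, the three behaviors seen in this thread (first value wins, last value wins, reject) can be sketched as merge policies over an object's members in document order. This is only an illustration using the standard library: the class `DuplicateKeyPolicies`, the enum `Policy`, and the method `merge` are hypothetical names and are not part of simdjson-java, Jackson, or Gson, and the sketch does not claim to mirror how any of those parsers is implemented.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateKeyPolicies {

    /** Hypothetical policy names, chosen only for this illustration. */
    enum Policy { FIRST_WINS, LAST_WINS, REJECT }

    /**
     * Folds object members (in document order) into a map,
     * resolving duplicate names according to the given policy.
     */
    static <V> Map<String, V> merge(List<Map.Entry<String, V>> members, Policy policy) {
        Map<String, V> result = new LinkedHashMap<>();
        for (Map.Entry<String, V> member : members) {
            String name = member.getKey();
            if (!result.containsKey(name)) {
                result.put(name, member.getValue());
            } else if (policy == Policy.LAST_WINS) {
                result.put(name, member.getValue());   // overwrite the earlier value
            } else if (policy == Policy.REJECT) {
                throw new IllegalArgumentException("duplicate key: " + name);
            }
            // FIRST_WINS: keep the value that is already stored
        }
        return result;
    }

    public static void main(String[] args) {
        // Members of {"num": 2, "num": 123}, in document order.
        List<Map.Entry<String, Integer>> members =
                List.of(Map.entry("num", 2), Map.entry("num", 123));

        System.out.println(merge(members, Policy.FIRST_WINS)); // {num=2}
        System.out.println(merge(members, Policy.LAST_WINS));  // {num=123}
        try {
            merge(members, Policy.REJECT);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());                // duplicate key: num
        }
    }
}
```

Whichever policy is chosen, documenting it would let callers know what `get("num")` returns for input like the one above. Note that none of the three policies reproduces the size of 2 reported by simdjson-java, since each stores at most one value per name.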