google / gson

A Java serialization/deserialization library to convert Java Objects into JSON and back
Apache License 2.0

TypeAdapter to read "raw" json value #1368

Open tmm1 opened 6 years ago

tmm1 commented 6 years ago

Since #667 JsonWriter#jsonValue() allows emitting a raw json blob when generating json.
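
For reference, here is a minimal sketch of what jsonValue() does on the writing side. The helper name, the field names and the payload string are made up for illustration; only the JsonWriter calls come from Gson.

```kotlin
import com.google.gson.stream.JsonWriter
import java.io.StringWriter

// Hypothetical helper: writes an object whose "payload" member is an
// already-serialized JSON string.
fun writeWithRawPayload(rawPayload: String): String {
    val out = StringWriter()
    JsonWriter(out).use { writer ->
        writer.beginObject()
        writer.name("id").value("e1")
        // jsonValue() appends the string as-is, without quoting or escaping it.
        writer.name("payload").jsonValue(rawPayload)
        writer.endObject()
    }
    return out.toString()
}

fun main() {
    println(writeWithRawPayload("""{"a":[1,2,3]}"""))
    // prints {"id":"e1","payload":{"a":[1,2,3]}}
}
```

Note that jsonValue() does not validate its argument, so the caller is responsible for passing well-formed JSON.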

I'd like the equivalent for parsing JSON. In my case, I have large, complex JSON sub-trees that I would like to avoid parsing, to save CPU time and allocations. But I still need the value so I can re-create the original JSON if needed.

Currently I use a TypeAdapter like this:

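class RawJsonAdapter: TypeAdapter<String>() {
    override fun write(out: JsonWriter?, value: String?) {
        // Emit the stored string verbatim as raw JSON.
        out?.jsonValue(value)
    }
    override fun read(reader: JsonReader?): String {
        // Parse the subtree into a JsonElement, then serialize it back to a string.
        return JsonParser().parse(reader).toString()
    }
}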

However, the entire subtree first has to be parsed and then re-serialized to JSON before I end up with the raw value.

Is there a more performant way to implement TypeAdapter#read here?

tmm1 commented 6 years ago

I took another stab at my TypeAdapter:

import com.google.gson.TypeAdapter
import com.google.gson.stream.JsonReader
import com.google.gson.stream.JsonToken.*
import com.google.gson.stream.JsonWriter
import java.io.StringWriter

class RawJsonAdapter: TypeAdapter<String>() {
    override fun write(out: JsonWriter?, value: String?) {
        out?.jsonValue(value)
    }
    override fun read(jsonReader: JsonReader?): String? {
        jsonReader ?: return null
        // Re-emit the tokens of the current value into a string buffer.
        val writer = StringWriter()
        val jsonWriter = JsonWriter(writer)
        copy(jsonReader, jsonWriter)
        writer.close()
        val buf = writer.toString()
        if (buf.isEmpty())
            return null
        return buf
    }
    // Copies one token; arrays and objects recurse until their closing token.
    private fun copy(reader: JsonReader, writer: JsonWriter) {
        when (reader.peek()) {
            STRING ->
                writer.value(reader.nextString())
            // nextString() returns the numeric literal as text, so write it unquoted.
            NUMBER ->
                writer.jsonValue(reader.nextString())
            BOOLEAN ->
                writer.value(reader.nextBoolean())
            NULL -> {
                reader.nextNull()
                writer.nullValue()
            }
            BEGIN_ARRAY -> {
                reader.beginArray()
                writer.beginArray()
                while (reader.hasNext()) {
                    copy(reader, writer)
                }
                reader.endArray()
                writer.endArray()
            }
            BEGIN_OBJECT -> {
                reader.beginObject()
                writer.beginObject()
                // Names and values are copied by alternating calls to copy().
                while (reader.hasNext()) {
                    copy(reader, writer)
                }
                reader.endObject()
                writer.endObject()
            }
            NAME ->
                writer.name(reader.nextName())
            // END_DOCUMENT, END_OBJECT and END_ARRAY are never expected here.
            else ->
                throw IllegalArgumentException("Unexpected token at " + reader.path)
        }
    }
}

TonyTangAndroid commented 5 years ago

@tmm1 Could you please present a full sample? I tried to copy your code into my test code and it did not work.
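
A fuller sample along the lines requested here might look like the following sketch. It condenses the token-copying adapter from the previous comment and wires it to a field with @JsonAdapter. The Event class, its field names, the parseEvent helper and the sample JSON are hypothetical and only serve as illustration; the code assumes a reasonably recent Gson 2.x on the classpath.

```kotlin
import com.google.gson.Gson
import com.google.gson.TypeAdapter
import com.google.gson.annotations.JsonAdapter
import com.google.gson.stream.JsonReader
import com.google.gson.stream.JsonToken
import com.google.gson.stream.JsonWriter
import java.io.StringWriter

// Condensed variant of the token-copying adapter from this thread.
class RawJsonAdapter : TypeAdapter<String>() {
    override fun write(out: JsonWriter, value: String?) {
        out.jsonValue(value)
    }

    override fun read(reader: JsonReader): String {
        val buffer = StringWriter()
        copy(reader, JsonWriter(buffer))
        return buffer.toString()
    }

    // Copies exactly one JSON value (scalar, array or object) to the writer.
    private fun copy(reader: JsonReader, writer: JsonWriter) {
        when (reader.peek()) {
            JsonToken.BEGIN_ARRAY -> {
                reader.beginArray(); writer.beginArray()
                while (reader.hasNext()) copy(reader, writer)
                reader.endArray(); writer.endArray()
            }
            JsonToken.BEGIN_OBJECT -> {
                reader.beginObject(); writer.beginObject()
                while (reader.hasNext()) {
                    writer.name(reader.nextName())
                    copy(reader, writer)
                }
                reader.endObject(); writer.endObject()
            }
            JsonToken.STRING -> writer.value(reader.nextString())
            // Numbers are read as text and written unquoted.
            JsonToken.NUMBER -> writer.jsonValue(reader.nextString())
            JsonToken.BOOLEAN -> writer.value(reader.nextBoolean())
            JsonToken.NULL -> { reader.nextNull(); writer.nullValue() }
            else -> throw IllegalStateException("Unexpected token at ${reader.path}")
        }
    }
}

// Hypothetical model class: "payload" keeps its JSON subtree as a string.
class Event {
    var id: String? = null

    @field:JsonAdapter(RawJsonAdapter::class)
    var payload: String? = null
}

// Hypothetical helper, for illustration only.
fun parseEvent(json: String): Event = Gson().fromJson(json, Event::class.java)

fun main() {
    val event = parseEvent("""{"id":"e1","payload":{"a":[1,2.50,"x"],"b":null}}""")
    println(event.payload)       // the subtree, re-emitted as compact JSON text
    println(Gson().toJson(event)) // payload is written back as raw JSON, not as a quoted string
}
```

The subtree is still tokenized and re-emitted, so whitespace and string escapes of the input are not preserved; only the JsonElement tree is avoided.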

oldshensheep commented 1 week ago

This is not only a performance problem!

JsonReader.nextString() returns an unescaped string, and in JSON both \/ and / are treated as /, so there is no way to get the original value if ...

f* json

Marcono1234 commented 5 days ago

JsonReader.nextString() returns an unescaped string, and in JSON both \/ and / are treated as /, so there is no way to get the original value

@oldshensheep, why do you need to know whether / was escaped in the JSON data? The data represented by "/", "\/" and even "\u002F" is identical: they all represent a String containing /. In most use cases it should not matter whether or how the text was escaped, and any compliant JSON library should return / as the unescaped value.
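
As a small illustration of that point, the sketch below decodes the three spellings with Gson and then writes the decoded value back out. The decode helper is a made-up name; JsonParser.parseString is assumed to be available, which requires Gson 2.8.6 or later.

```kotlin
import com.google.gson.Gson
import com.google.gson.JsonParser

// Hypothetical helper: decodes a JSON string literal into the text it represents.
fun decode(jsonLiteral: String): String = JsonParser.parseString(jsonLiteral).asString

// Three different JSON spellings of the same one-character string.
val spellings = listOf("\"/\"", "\"\\/\"", "\"\\u002F\"")

fun main() {
    for (s in spellings) {
        println("$s decodes to ${decode(s)}")
    }
    // Serializing the decoded value uses a single spelling, whatever the input used.
    println(Gson().toJson("/"))
}
```

All three inputs decode to the same value, which is why the original escaping cannot be recovered after parsing.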

oldshensheep commented 4 days ago

@Marcono1234 I don't need to get the raw value anymore. I'm implementing a JSON function in another language, and when the input contains a string like "\/", after deserialization it's equal to the string "\/" in that language. So, I thought the language must somehow obtain the raw JSON value, but it turns out it also allows invalid escapes, like JavaScript. 🤣