minio / simdjson-go

Golang port of simdjson: parsing gigabytes of JSON per second
Apache License 2.0

Modifying fields #45

Closed stokito closed 2 years ago

stokito commented 2 years ago

Can I modify a field value in the underlying byte slice if I know that it will have the same length or shorter? I have a field in JSON:

"tmax": 120,

and I need to reduce it by 30 ms before sending the changed JSON onward:

"tmax": 90,

So here the value is shorter, taking two digits instead of three, and I can just add a space after the comma to keep the JSON valid without allocating a new buffer and copying. As far as I understand, I might be able to do this with the raw tape, but I'm not sure.
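
For illustration, the in-place patch described in this question can be sketched with the standard library alone. This is a minimal sketch, not part of simdjson-go: `patchNumber` is a hypothetical helper that does a naive byte search for the key, assumes an integer value, and pads before the comma rather than after it (either placement is valid JSON whitespace).

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
)

// patchNumber overwrites the integer value of the first occurrence of key in
// buf with newVal, padding with spaces if the new value is shorter.
// It reports false if the key is missing, the value is not a plain integer,
// or newVal needs more bytes than the old value occupied.
//
// Hypothetical helper: it does a naive byte search for `"key":`, so it
// ignores nesting and would also match the same text inside a string value.
func patchNumber(buf []byte, key string, newVal int64) bool {
	needle := []byte(`"` + key + `":`)
	i := bytes.Index(buf, needle)
	if i < 0 {
		return false
	}
	start := i + len(needle)
	for start < len(buf) && buf[start] == ' ' {
		start++ // skip spaces between the colon and the value
	}
	end := start
	for end < len(buf) && (buf[end] == '-' || (buf[end] >= '0' && buf[end] <= '9')) {
		end++
	}
	if end == start {
		return false // no digits found
	}
	if end < len(buf) && (buf[end] == '.' || buf[end] == 'e' || buf[end] == 'E') {
		return false // fractions and exponents are out of scope for this sketch
	}
	var tmp [20]byte // large enough for any int64, including the sign
	repl := strconv.AppendInt(tmp[:0], newVal, 10)
	if len(repl) > end-start {
		return false // the new value would not fit in place
	}
	n := copy(buf[start:end], repl)
	for j := start + n; j < end; j++ {
		buf[j] = ' ' // whitespace before the comma keeps the JSON valid
	}
	return true
}

func main() {
	buf := []byte(`{"id": "abc", "tmax": 120, "at": 1}`)
	ok := patchNumber(buf, "tmax", 90)
	fmt.Println(ok, string(buf))
}
```

The buffer keeps its length, so nothing is reallocated or copied. Note that this edits the raw input only and says nothing about a tape that has already been built from it.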

klauspost commented 2 years ago

@stokito It is in theory possible to modify the values, but no such functionality is currently available. Since the underlying tape format uses absolute offsets, modifications will be expensive if values change length at all.

Modifying basic values like numbers, bools and even strings (with a bit more work) can be done, but modifying objects/arrays so that they change length, and most changes of type, are infeasible.

You cannot modify the "underlying" JSON, as the parsed values on the tape are the reference, not the original input bytes.
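
To illustrate why absolute offsets make length changes expensive, here is a toy model. It is an assumption-laden sketch, not simdjson-go's actual tape encoding: the `entry` type, `sampleTape`, and `insertAt` are hypothetical, object keys are left out, and each container bracket simply stores the absolute index of its matching bracket.

```go
package main

import "fmt"

// entry is one slot of a toy tape. Simplified model for illustration only.
type entry struct {
	tag  byte   // '{', '}', '[', ']' for containers; 't', 'f', 'n' for scalars
	data uint64 // containers only: absolute index of the matching bracket
}

// sampleTape models {"a":[true,false],"b":[null]} with keys omitted.
func sampleTape() []entry {
	return []entry{
		{'{', 8}, // 0: root object, closes at index 8
		{'[', 4}, // 1: first array, closes at index 4
		{'t', 0}, // 2
		{'f', 0}, // 3
		{']', 1}, // 4: points back to its opening bracket
		{'[', 7}, // 5: second array, closes at index 7
		{'n', 0}, // 6
		{']', 5}, // 7
		{'}', 0}, // 8
	}
}

// insertAt inserts the scalar e at index pos and then repairs every absolute
// offset that pointed at or past pos. It returns the new tape and the number
// of offsets that had to be rewritten.
func insertAt(tape []entry, pos int, e entry) ([]entry, int) {
	out := make([]entry, 0, len(tape)+1)
	out = append(out, tape[:pos]...)
	out = append(out, e)
	out = append(out, tape[pos:]...)
	fixed := 0
	for i := range out {
		switch out[i].tag {
		case '{', '}', '[', ']':
			if out[i].data >= uint64(pos) {
				out[i].data++ // the target moved one slot to the right
				fixed++
			}
		}
	}
	return out, fixed
}

func main() {
	// Append one more bool to the first array (just before its ']').
	out, fixed := insertAt(sampleTape(), 4, entry{'t', 0})
	fmt.Println("entries:", len(out), "offsets rewritten:", fixed)
}
```

Even in this tiny tape, adding a single element means scanning the whole tape and rewriting the offsets of the enclosing object, the array itself, and every container that follows it. Replacing a scalar with another scalar of the same width touches only its own slot.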

klauspost commented 2 years ago

I have implemented a SetBool command that will replace a bool or nil value. I will extend that for the basic types.

stokito commented 2 years ago

Hi @klauspost, thank you. You can leave this and I'll implement it for the other types. This is a rare case and I feel like you shouldn't spend your time on it. Also, I'm going to send a PR with an example of deep parsing of a complicated JSON document with many nested fields.

klauspost commented 2 years ago

@stokito It was an interesting direction, so I decided to try it out. It could be interesting for templating.

Bool and Null values are interchangeable since they each take 1 space on the tape.

Strings and numbers (all types) can be exchanged with each other, since they take 2 spaces on the tape.

Other types (objects and arrays) are too complex to be interchangeable, so I don't think it is feasible to allow rewriting those. Basically the whole tape will have to be rewritten if any size changes, since it uses absolute offsets.