tidwall / gjson

Get JSON values quickly - JSON parser for Go
MIT License
14.31k stars 854 forks

GetMany could be more efficient? #175

Open aclowkey opened 4 years ago

aclowkey commented 4 years ago

I'll preface this by saying it's a hunch: I have no convincing argument beyond my intuition, and I'm not familiar with the internals of gjson.

For large JSON documents, GetMany appears to iterate over the entire JSON once per path, so it is no more efficient than just calling Get many times.

I think there is room for optimization: while scanning, check whether the current path matches ANY of the paths provided. For example:

msg := `{
   "message": "a hugeee lot of data",
   "a": "b",
   "c": "d",
   "e": "f"
}`
// GetMany will iterate through the GIANT message and stop at "a",
// then do the same thing for "e". Maybe this second pass could be skipped?
results := gjson.GetMany(msg, "a", "e")
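
To make the idea concrete, here is a rough sketch of a single-pass lookup using only the standard library's encoding/json token stream. It is not gjson's implementation and it only handles top-level keys, not full gjson paths; `getManyTopLevel` is a hypothetical name made up for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// getManyTopLevel is a hypothetical helper (not part of gjson). It walks the
// top-level object once, keeps the raw value of every wanted key, and stops
// scanning as soon as all wanted keys have been found.
func getManyTopLevel(doc string, keys ...string) (map[string]json.RawMessage, error) {
	wanted := make(map[string]bool, len(keys))
	for _, k := range keys {
		wanted[k] = true
	}
	found := make(map[string]json.RawMessage, len(wanted))

	dec := json.NewDecoder(strings.NewReader(doc))
	tok, err := dec.Token() // consume the opening '{'
	if err != nil {
		return nil, err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return nil, fmt.Errorf("expected a JSON object, got %v", tok)
	}

	for dec.More() && len(found) < len(wanted) {
		tok, err := dec.Token() // next key
		if err != nil {
			return nil, err
		}
		key, ok := tok.(string)
		if !ok {
			return nil, fmt.Errorf("expected an object key, got %v", tok)
		}
		var raw json.RawMessage // the value is read once, whether wanted or not
		if err := dec.Decode(&raw); err != nil {
			return nil, err
		}
		if wanted[key] { // matches ANY of the requested keys
			found[key] = raw
		}
	}
	return found, nil
}

func main() {
	msg := `{"message": "a hugeee lot of data", "a": "b", "c": "d", "e": "f"}`
	res, err := getManyTopLevel(msg, "a", "e")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(res["a"]), string(res["e"]))
}
```

In this sketch the giant "message" value is still read, but only once, and the scan stops after "e" instead of starting over for each key.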

Any thoughts?

tidwall commented 4 years ago

Yes, your hunch is right. The API is designed to support a single-pass operation on multiple paths, but at the moment the implementation executes each path serially. A while back I coded it as single-pass, but some bugs cropped up (#47 #54 #55), so I switched it to what it is now. I hope to find more time in the future to bring it to its full glory.

Yiling-J commented 3 years ago

If someone needs a fast GetMany alternative, take a look at https://github.com/ohler55/ojg. Benchmark on a 40KB JSON document with 7 searches:

cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz

BenchmarkGJsonSearch-16            23752             48151 ns/op
BenchmarkOJGSearch-16             123057              9234 ns/op

franchb commented 2 years ago

@Yiling-J do you have any code samples for your benchmark?

Yiling-J commented 2 years ago

@franchb I didn't keep the benchmark code, but it's very easy to write one:

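obj, err := oj.ParseString(yourJsonString) // parse the document once
if err != nil {
    panic(err)
}
paths := []string{path1, path2, path3...}
for _, p := range paths {
    x, err := jp.ParseString(p) // compile the JSONPath expression
    if err != nil {
        panic(err)
    }
    _ = x.Get(obj) // all values matching p
}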