tidwall / gjson

Get JSON values quickly - JSON parser for Go
MIT License

Should ParseBytes use stringBytes() ? #188

Closed rickb777 closed 3 years ago

rickb777 commented 4 years ago

ParseBytes is documented as being preferred over Parse(string(json)) but all it does is call Parse(string(json)).

Perhaps you meant to call Parse(stringBytes(json)). This would avoid copying the byte slice, so it would be quicker, especially for large JSON inputs.

tidwall commented 3 years ago

I think that Parse(string(json)) is the correct way to go because the Parse operation pretty much does two things: it trims the leading whitespace from the json, and then determines its type. The resulting json, which is exactly the same as the original input (minus any leading whitespace), is then assigned to the Raw string member of the Result type. So one way or another the input []byte must be copied to the Result.Raw string. Since both will be nearly identical, using Go's builtin []byte to string casting is optimal.
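To make the two steps above concrete, here is a rough sketch. It is not gjson's actual code: Kind, Result, parseSketch and parseBytesSketch are hypothetical, cut-down stand-ins, and the type detection only looks at the first byte.

```go
package main

import "fmt"

// Kind is a hypothetical, simplified stand-in for a JSON value type.
type Kind int

const (
	Null Kind = iota
	False
	True
	Number
	String
	JSON
)

// Result is a hypothetical, cut-down result: only the kind and the raw text.
type Result struct {
	Kind Kind
	Raw  string
}

// parseSketch trims leading whitespace, then classifies the value by its
// first byte. Raw ends up being the input minus the leading whitespace.
func parseSketch(json string) Result {
	i := 0
	for i < len(json) && json[i] <= ' ' {
		i++
	}
	json = json[i:]
	r := Result{Raw: json}
	if len(json) == 0 {
		return r
	}
	switch c := json[0]; {
	case c == '{' || c == '[':
		r.Kind = JSON
	case c == '"':
		r.Kind = String
	case c == 't':
		r.Kind = True
	case c == 'f':
		r.Kind = False
	case c == '-' || (c >= '0' && c <= '9'):
		r.Kind = Number
	default:
		r.Kind = Null
	}
	return r
}

// parseBytesSketch is only a convenience wrapper: string(b) copies the
// bytes, so Raw never shares memory with the caller's slice.
func parseBytesSketch(b []byte) Result {
	return parseSketch(string(b))
}

func main() {
	r := parseBytesSketch([]byte("  \n{\"a\":1}"))
	fmt.Printf("kind=%d raw=%q\n", r.Kind, r.Raw)
}
```

In this sketch the only copy happens in string(b); everything after that is slicing, which is why the wrapper adds nothing beyond convenience.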

This makes ParseBytes(data) a mere convenience method over Parse(string(data)). Thus, I should probably remove the comment:

// If working with bytes, this method preferred over Parse(string(data))

rickb777 commented 3 years ago

Yes you're right.

Your stringBytes function 'cheats' by using the unsafe package; I've seen this done elsewhere and it's a reasonable thing to do provided your API doesn't leak this to its callers.

But if part of the input string is assigned to Raw without there having been a copy, then the original []byte could be changed and affect the Raw values by side-effect. Not good. It needs to copy the data because otherwise the immutable semantics of string could be subverted.
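A small sketch of the side-effect I mean. aliasString here is a hypothetical zero-copy conversion written for illustration (I'm assuming it behaves like stringBytes, not quoting it); string(data) is the ordinary copying conversion.

```go
package main

import (
	"fmt"
	"unsafe"
)

// aliasString is a hypothetical zero-copy conversion: the returned string
// shares its backing memory with b instead of copying it.
func aliasString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	data := []byte(`{"a":1}`)

	aliased := aliasString(data) // shares memory with data
	copied := string(data)       // independent copy

	// The caller later reuses or edits its buffer.
	data[5] = '2'

	fmt.Println(aliased) // {"a":2} (the "immutable" string changed)
	fmt.Println(copied)  // {"a":1}
}
```

If Raw were built the first way, any caller that reuses its buffer would silently change the Result it already holds.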

So what you've got is safe because it involves a copy.

(btw Go doesn't have casting; it has type conversions)