francoispqt / gojay

high performance JSON encoder/decoder with stream API for Golang
MIT License
2.11k stars 112 forks source link

Encoding/Decoding null values #39

Closed pierrec closed 6 years ago

pierrec commented 6 years ago

Hello,

Currently, there is no way to encode or decode a NULL. Is there a reason for this? Would you mind if it was added?

Thanks.

francoispqt commented 6 years ago

Hi,

So for encoding null, we (my company) haven't found the need as there are some OmitEmpty methods. But we can find a solution.

For decoding, if we want to check null values we use a pointer, which will be nil if value is null (I believe the same solution is valid for encoding/json), example:

type TestObj struct {
    test *string
}

func (t *TestObj) UnmarshalJSONObject(dec *gojay.Decoder, k string) error {
    switch k {
    case "test":
        return dec.String(t.test)
    }
    return nil
}

func (t *TestObj) NKeys() int {
    return 1
}

json := []byte(`{
    "test": null
}`)

testObj := &TestObj{}
_ = gojay.UnmarshalJSONObject(json, testObj)
fmt.Println(testObj.test == nil) // true

I'm ok to add a solution but we should define what we are trying to achieve (encoding null for zero values, or nil pointer, having a nullable type like in SQL).

Also I'm open to PR so if you know what you want and can implement it without affecting performance then I'd be ok. I'm currently working on a code generator for maps, structs and slices which is almost done (writing tests), so would be happy to have some help.

Thanks

pierrec commented 6 years ago

Indeed I am fetching/writing data from/to an SQL database that may have nulls. I will submit a PR for encoding asap. Thanks for the constructive feedback !

francoispqt commented 6 years ago

I will close the issue, let me know if you need help or suggestion.

wvh commented 6 years ago

I've got an array with binary values – a UUID. Since it doesn't make sense to encode this array to a JSON array, I call String() on it and then encode the string representation to JSON; if it is nil, I write a literal null with AddEmbeddedJSON so it's clear that it is not set.

null := []byte("null")
enc.AddEmbeddedJSONKey("uid", (*gojay.EmbeddedJSON)(&null))

Perhaps it would make sense to add a constant for null to gojay itself to prevent having to create a raw value everywhere one is needed:

var JsonNull = EmbeddedJSON("null")

... or even add a new helper type:

func (*Encoder) AddNull() {}
func (*Encoder) AddNullKey() {}

... I suppose there's no point in adding the OmitEmpty variants. :wink:

francoispqt commented 6 years ago

Sure it does make a lot of sense. I prefer the AddNull and AddNullKey solution. You can submit a PR, or I'll add it myself, up to you :) Sorry for the delay I was on holiday in a remote area.

simpajj commented 6 years ago

Hi! Apologies for commenting on a closed issue.

I tried the example above with regards to decoding nil values:

type TestObj struct {
    test *string
}

func (t *TestObj) UnmarshalJSONObject(dec *gojay.Decoder, k string) error {
    switch k {
    case "test":
        return dec.String(t.test)
    }
    return nil
}

func (t *TestObj) NKeys() int {
    return 1
}

json := []byte(`{
    "test": null
}`)

testObj := &TestObj{}
_ = gojay.UnmarshalJSONObject(json, testObj)
fmt.Println(testObj.test == nil) // true

This panics when the value is not null. E.g.:

type TestObj struct {
    test *string
}

func (t *TestObj) UnmarshalJSONObject(dec *gojay.Decoder, k string) error {
    switch k {
    case "test":
        return dec.String(t.test)
    }
    return nil
}

func (t *TestObj) NKeys() int {
    return 1
}
func TestNotNull(t *testing.T) {
    json := []byte(`{
            "test": "someString"
    }`)

    testObj := &TestObj{}
    _ = gojay.UnmarshalJSONObject(json, testObj)
    fmt.Println(testObj.test == nil) // true
}

Is this intended behavior? I'd expect it to set the value if it exists and otherwise set it to nil.

francoispqt commented 6 years ago

Oh you're right, sorry for the misleading solution, I will reopen and come up with a solution soon. If the value needs to be set it tries to set it on a nil pointer which is why it panics.

francoispqt commented 6 years ago

Will add this during the weekend in a new release:

dec.StringNull(&testObj.test /* **string */) 
// passing a pointer to a pointer, this way, if your pointer `testObj.test` is nil, it will remain untouched when `null` is encountered. 
// If a non `null` value is encountered, a pointer to a new string will be assigned. 
// If your pointer is non nil, the value will be assigned to it directly.

Will add it for all types. For non primitive types, it will be slightly different.

simpajj commented 6 years ago

Awesome! Thanks for the quick reply and for this great library!

francoispqt commented 6 years ago

It's done for primitive types, I have merged to master.

Before creating a new release, I need to take care of non primitive types (objects and arrays). For non primitive types I have tested two solutions and I'd like your point of view.

func (o ObjNull) UnmarshalJSONObject(dec gojay.Decoder, k string) error { switch k { case "o": return dec.ObjectNull(&o.O) } return nil }

func (o *ObjNullReflect) NKeys() int { return 1 }


- 2nd solution, without reflection: 
```go
func (o *ObjNull) UnmarshalJSONObject(dec *gojay.Decoder, k string) error {
    switch k {
    case "o":
        return dec.ObjectNull(func() gojay.UnmarshalerJSONObject {
                     o.O = &ObjNull{}
                     return o.O
                })
    }
    return nil
}

Performance: 1st:

pkg: github.com/francoispqt/gojay/benchmarks/decoder
BenchmarkJSONDecodeObjNullReflection-8       2000000           924 ns/op         336 B/op          6 allocs/op

2nd:

pkg: github.com/francoispqt/gojay/benchmarks/decoder
BenchmarkJSONDecodeObjNullFactory-8      2000000           896 ns/op         336 B/op          6 allocs/op

The solution without reflection is slightly faster, but for the case of an Array, without reflection will be even faster. Also, the solution using reflection will panic if the type is not a pointer to a gojay.UnmarshalerJSONObject implementation. On the other hand, the solution with reflection is more consistent with the solution for primitive types.

Let me know which one seems the best to you. Thanks

simpajj commented 6 years ago

Thanks! The solution without reflection seems clearer to me and personally I'd prefer that one. However, it might make sense to value the consistency of the API above all other concerns right now and with that said, maybe the solution using reflection is the right way to go.

francoispqt commented 6 years ago

I'll go for the one using reflection (it's really just a tiny bit of reflection). Will publish a new release tomorrow (2018-08-28) including all the null decoding. Thanks!

francoispqt commented 6 years ago

Merged to master #70