palantir / conjure-go

Conjure generator for Go
Apache License 2.0
7 stars 15 forks source link

verification failure: Clients should error if response is missing required fields #123

Open k-simons opened 4 years ago

k-simons commented 4 years ago

What happened?

What did you want to happen?

nmiyake commented 4 years ago

Yes, this is unfortunate. The actual issue is also more general -- since Conjure treats non-optional fields as "required" generally, this behavior should also be enforced when doing a standard unmarshal from JSON into the Conjure-generated struct.

I propose the following fix:

As an example, for the type:

type Basic struct {
    Data string `json:"data"`
}

The generated unmarshal would be:

func (o *Basic) UnmarshalJSON(b []byte) error {
    basicStrict := struct {
        Data *string `json:"data"`
    }{}
    if err := json.Unmarshal(b, &basicStrict); err != nil {
        return err
    }
    if basicStrict.Data == nil {
        return fmt.Errorf(`required field "data" is missing`)
    }
    o.Data = *basicStrict.Data
    return nil
}
nmiyake commented 4 years ago

One concern for the above would be that if any current code that uses Conjure types implicitly depends on supporting missing values, this will clearly break. Since some code uses Conjure types for serialization (and not just REST APIs), this is more of a risk.

nmiyake commented 4 years ago

@bmoylan for feedback/thoughts on the above.

bmoylan commented 4 years ago

Overall +1. I'm sympathetic to the behavior break but I think we can communicate this out to warn people that required fields are really required.

Some notes on the suggested generated code above:

While we're changing this, it might be worth more correctly implementing wire spec items Forward compatible clients, Servers reject unknown fields, Round-trip of unknown variants if possible, though this probably requires more thought.

k-simons commented 4 years ago

Yup, specifically was looking into https://github.com/palantir/conjure/blob/master/docs/spec/wire.md#42-servers-reject-unknown-fields and https://github.com/palantir/conjure/blob/master/docs/spec/wire.md#41-forward-compatible-clients and the best way to handle this.

Erroring on extra fields is a bit tricky. The simplest thing is to potentially use a map[string]interface{} and then assign the values into a struct. We could also have different "JSON structs" for clients/servers in which this ^^ map idea would only be needed server side

k-simons commented 4 years ago

Ah or we could use https://golang.org/src/encoding/json/stream.go?s=1132:1175#L32

nmiyake commented 4 years ago

Yup that's a more recent addition, but believe that (or something like it) would be the most correct way to handle this.

k-simons commented 4 years ago

Played around with this a bit and came up with:

package pkg

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "testing"

    "github.com/palantir/pkg/safejson"
    "github.com/stretchr/testify/assert"
)

type OriginalConjureStruct struct {
    Foo string  `json:"foo"`
    Bar *string `json:"bar"`
    Baz int     `json:"baz"`
}

type optionalGeneratedOriginalConjureStruct struct {
    Foo *string `json:"foo"`
    Bar *string `json:"bar"`
    Baz *int    `json:"baz"`
}

var allPresent = []byte(`{
  "foo":"foo",
  "bar":"bar",
  "baz":1
}`)

var hasExtra = []byte(`{
  "foo":"foo",
  "bar":"bar",
  "c": "c",
  "baz":1
}`)

var missingRequired = []byte(`{
  "baz":1
}`)

var missingOptional = []byte(`{
  "foo":"foo",
  "baz":1
}`)

func TestServer(t *testing.T) {
    for _, tc := range []struct {
        name     string
        input    []byte
        expected *OriginalConjureStruct
    }{
        {
            name:  "allPresent",
            input: allPresent,
            expected: &OriginalConjureStruct{
                Foo: "foo",
                Bar: str("bar"),
                Baz: 1,
            },
        },
        {
            name:  "missingOptional",
            input: missingOptional,
            expected: &OriginalConjureStruct{
                Foo: "foo",
                Baz: 1,
            },
        },
        {
            name:  "hasExtra",
            input: hasExtra,
        },
        {
            name:  "missingRequired",
            input: missingRequired,
        },
    } {
        t.Run(tc.name, func(t *testing.T) {
            actual, err := functionGenerated(bytes.NewReader(tc.input), true)
            if tc.expected == nil {
                assert.Nil(t, actual)
                assert.Error(t, err)
            } else {
                assert.NoError(t, err)
                assert.Equal(t, actual, tc.expected)
            }
        })
    }
}

func TestClient(t *testing.T) {
    for _, tc := range []struct {
        name     string
        input    []byte
        expected *OriginalConjureStruct
    }{
        {
            name:  "allPresent",
            input: allPresent,
            expected: &OriginalConjureStruct{
                Foo: "foo",
                Bar: str("bar"),
                Baz: 1,
            },
        },
        {
            name:  "missingOptional",
            input: missingOptional,
            expected: &OriginalConjureStruct{
                Foo: "foo",
                Baz: 1,
            },
        },
        {
            name:  "hasExtra",
            input: hasExtra,
            expected: &OriginalConjureStruct{
                Foo: "foo",
                Bar: str("bar"),
                Baz: 1,
            },
        },
        {
            name:  "missingRequired",
            input: missingRequired,
        },
    } {
        t.Run(tc.name, func(t *testing.T) {
            actual, err := functionGenerated(bytes.NewReader(tc.input), false)
            if tc.expected == nil {
                assert.Nil(t, actual)
                assert.Error(t, err)
            } else {
                assert.NoError(t, err)
                assert.Equal(t, actual, tc.expected)
            }
        })
    }
}

// Generated: functionGenerated
func functionGenerated(r io.Reader, strict bool) (*OriginalConjureStruct, error) {
    var optionalA optionalGeneratedOriginalConjureStruct
    dec := getDecoder(r, strict)
    err := dec.Decode(&optionalA)
    if err != nil {
        return nil, err
    }
    if optionalA.Foo == nil {
        return nil, fmt.Errorf(`required field "foo" is missing`)
    }
    if optionalA.Baz == nil {
        return nil, fmt.Errorf(`required field "baz" is missing`)
    }
    return &OriginalConjureStruct{
        Foo: *optionalA.Foo,
        Bar: optionalA.Bar,
        Baz: *optionalA.Baz,
    }, nil
}

// We can either generate this or generate 2 versions of the function above
func getDecoder(r io.Reader, strict bool) *json.Decoder {
    if strict {
        return DecodeStrict(r)
    }
    return safejson.Decoder(r)
}

// This moves into safejson in some way. Similar to safejson.Decoder
func DecodeStrict(r io.Reader) *json.Decoder {
    decoder := safejson.Decoder(r)
    decoder.DisallowUnknownFields()
    return decoder
}

// Just a helper for the test
func str(s string) *string {
    return &s
}

It is a similar thought to what you proposed, however it avoids using a custom Unmarshal function because once you invoke this you will by-pass DisallowUnknownFields. Thoughts?

nmiyake commented 4 years ago

So there are 2 issues here:

  1. Conjure structs should require non-optional fields
  2. Conjure servers should reject Conjure structs with unknown fields

If we want to do (1) correctly, then I think that has to be part of the unmarshal, since we would want to catch this even when no server interactions are required (unless we think this is too risky due to backcompat and just want to enforce this for servers)

For (2), this does need to be on the server side, and agree using the decoder is the best approach.

There is an interesting question of whether if we do (1) in the unmarshal, whether using a custom decoder is honored/propagated through the custom unmarshal. This is something we'd need to investigate -- if that doesn't work, then we may have to use the approach you outlined, or actually hard-code decoder settings as part of the custom unmarshal.

It looks like your proposal above is to handle both (1) and (2) only for arguments provided to server code, and to basically generate the with-optional object in the server. That's certainly a valid approach.

However, would just want to clarify our thought process around non-optional fields for Conjure structs in the context other than Conjure REST calls -- I think it's more correct to provide enforcement that these values can't be missing in JSON, in which case it would make sense to handle (1) as part of the Unmarshal.

k-simons commented 4 years ago

Yes so I am saying that unless you define the custom decoder in the actual Unmarshal function, it will not work. So something like:

func (o *OriginalConjureStruct) UnmarshalJSON(b []byte) error {
    basicStrict := struct {
        Foo *string `json:"foo"`
        Bar *string `json:"bar"`
        Baz *int    `json:"baz"`
    }{}

    decoder := safejson.Decoder(bytes.NewReader(b))
    decoder.DisallowUnknownFields()

    if err := decoder.Decode(&basicStrict); err != nil {
        return err
    }
    if basicStrict.Foo == nil {
        return fmt.Errorf(`required field "data" is missing`)
    }
    if basicStrict.Baz == nil {
        return fmt.Errorf(`required field "data" is missing`)
    }
    o.Foo = *basicStrict.Foo
    o.Baz = *basicStrict.Baz
    o.Bar = basicStrict.Bar
    return nil
}

will work.

But:

func (o *OriginalConjureStruct) UnmarshalJSON(b []byte) error {
    basicStrict := struct {
        Foo *string `json:"foo"`
        Bar *string `json:"bar"`
        Baz *int    `json:"baz"`
    }{}
    if err := json.Unmarshal(b, &basicStrict); err != nil {
        return err
    }
    if basicStrict.Foo == nil {
        return fmt.Errorf(`required field "data" is missing`)
    }
    if basicStrict.Baz == nil {
        return fmt.Errorf(`required field "data" is missing`)
    }
    o.Foo = *basicStrict.Foo
    o.Baz = *basicStrict.Baz
    o.Bar = basicStrict.Bar
    return nil
}

// Generated: functionGenerated
func functionGenerated(r io.Reader, strict bool) (*OriginalConjureStruct, error) {
    var originalConjureStruct OriginalConjureStruct
    dec := getDecoder(r, strict)
    err := dec.Decode(&originalConjureStruct)
    if err != nil {
        return nil, err
    }
    return &originalConjureStruct, nil
}

// We can either generate this or generate 2 versions of the function above
func getDecoder(r io.Reader, strict bool) *json.Decoder {
    if strict {
        return DecodeStrict(r)
    }
    return safejson.Decoder(r)
}

^^ with a strict decoder will not work. I do not want to stick the decoder in the UnmarshalJSON so opted to not override it. We would use similar generated code for unmarshalling client side (minus the strict part)

k-simons commented 4 years ago

Additionally, I have confirmed that overriding UnmarshalJSON will break the current decoder configuration that we use:

func Decoder(r io.Reader) *json.Decoder {
    decoder := json.NewDecoder(r)
    decoder.UseNumber()
    return decoder
}

specifically the UseNumber will not be used

nmiyake commented 4 years ago

Yeah looks like the general issue is that defining a custom Unmarshal function causes all decoder configuration to be ignored, which is unfortunate.

Given that, I'm on-board with the proposal @k-simons posted above that will enforce the "require non-optional fields" behavior on the client and server, and the "reject unknown fields" behavior on server.

The once case this doesn't handle is requiring non-optional fields from an unmarshal performed manually from JSON (for example, if JSON is persisted to disk or encoded and decoded in-memory or across non-Conjure clients), but I think that's an acceptable trade-off (and actually makes the backcompat story easier too)

bmoylan commented 4 years ago

Makes sense to me! I think using safejson in the generated code is probably not worth the indirection and we could drop the dependency and just create the plain decoder with UseNumber as well.

Are we going to have to generate two different structs for the client and server sides due to the divergent UnmarshalJSON behavior, or do we implement two different unmarshal methods on the same struct and have the generated client/server call the right one?

Also it's fine to keep these examples short but want to make sure my concerns about the returned error are handled in the eventual implementation.

Thanks for taking this on!

k-simons commented 4 years ago

@bmoylan yup noted. Will make sure that

  1. We capture all missing fields
  2. Error will be a conjure error

And we should be able to use the same struct. If you look in https://github.com/palantir/conjure-go/issues/123#issuecomment-634917278 (which is the table tests for client/server) they each just call functionGenerated which handles the "missing required fields". The "no extra fields" is then determined by the decoder. You'll notice that hasExtra passes for client but fails for server

k-simons commented 4 years ago

@bmoylan @nmiyake have been playing around with this a bit, the UnmarshalJSON side looks fine but MarshalJSON is proving a bit tricky:

type TopLevel2 struct {
    StrArg                 string   `json:"strArg"`
    StrOptionalArg   *string `json:"strOptionalArg"`
}
func TestBigSad(t *testing.T) {
    myTopLevelArg := TopLevel2{}
    a, _ := json.Marshal(myTopLevelArg)
    fmt.Println(string(a))
}
{"strArg":"","strOptionalArg":null}

When we Marshal fields with empty values the keys are preserved, which when we receive these bytes, we will see these keys and assume they were set/present with empty values.

You could add the omitempty tags but then it makes it impossible to actually pass in the null values (0/“”/false).

Currently the only solution that I have came up with is to

  1. Change all non-collection fields to be *$field as opposed to $field. So everything is a pointer.
  2. Add the omit empty tags

This forces object creation to actually set the fields they need to. There becomes some ambiguity for optionals, a required string field and an optional string field would each be *string, with the server/client doing the validation that the required string is set. However, I'm not sure if this actually matters... you could make it more formal by generating Optional like wrapper structs in this case which would make it more explicit, but I'm not sure that is better.

Thoughts?