danielgtaylor / huma

Huma REST/HTTP API Framework for Golang with OpenAPI 3.1
https://huma.rocks/
MIT License
2k stars 147 forks source link

[question]: Validation for string-based types in Enum has unexpected results. #600

Open superstas opened 1 week ago

superstas commented 1 week ago

Hi there,

I have the following types

main.go

const (
    NameTypeFoo NameType = "FOO"
    NameTypeBar NameType = "BAR"
)

type (
    NameType          string
    UserCreateRequest struct {
        Name NameType `json:"name" minLength:"1" maxLength:"32"`
    }

    CreateUserInput struct {
        Body UserCreateRequest
    }

    CreateUserOutput struct {
        Body struct {
            Message string `json:"message"`
        }
    }
)

var _ huma.SchemaTransformer = NameType("")

func (nt NameType) TransformSchema(r huma.Registry, s *huma.Schema) *huma.Schema {
    s.Enum = []interface{}{NameTypeFoo, NameTypeBar}
    s.Default = NameTypeFoo
    s.PrecomputeMessages()
    return s
}

func addRoutes(api huma.API) {
    huma.Register(api, huma.Operation{
        OperationID: "CreateUser",
        Method:      http.MethodPost,
        Path:        "/user",
    }, func(ctx context.Context, input *CreateUserInput) (*CreateUserOutput, error) {
        resp := &CreateUserOutput{}
        resp.Body.Message = "CreateUser works!"
        return resp, nil
    })
}

I have two string-type-based constants: NameTypeFoo and NameTypeBar ( FOO and BAR respectively ).

I set Enum in TransformSchema by s.Enum = []interface{}{NameTypeFoo, NameTypeBar}.

After that, I get the correct OAS

spec ```yaml components: schemas: CreateUserOutputBody: additionalProperties: false properties: $schema: description: A URL to the JSON Schema for this object. examples: - https://example.com/schemas/CreateUserOutputBody.json format: uri readOnly: true type: string message: type: string required: - message type: object ErrorDetail: additionalProperties: false properties: location: description: Where the error occurred, e.g. 'body.items[3].tags' or 'path.thing-id' type: string message: description: Error message text type: string value: description: The value at the given location type: object ErrorModel: additionalProperties: false properties: $schema: description: A URL to the JSON Schema for this object. examples: - https://example.com/schemas/ErrorModel.json format: uri readOnly: true type: string detail: description: A human-readable explanation specific to this occurrence of the problem. examples: - Property foo is required but is missing. type: string errors: description: Optional list of individual error details items: $ref: "#/components/schemas/ErrorDetail" type: - array - "null" instance: description: A URI reference that identifies the specific occurrence of the problem. examples: - https://example.com/error-log/abc123 format: uri type: string status: description: HTTP status code examples: - 400 format: int64 type: integer title: description: A short, human-readable summary of the problem type. This value should not change between occurrences of the error. examples: - Bad Request type: string type: default: about:blank description: A URI reference to human-readable documentation for the error. examples: - https://example.com/errors/example format: uri type: string type: object UserCreateRequest: additionalProperties: false properties: $schema: description: A URL to the JSON Schema for this object. examples: - https://example.com/schemas/UserCreateRequest.json format: uri readOnly: true type: string name: default: FOO enum: - FOO - BAR maxLength: 32 minLength: 1 type: string required: - name type: object info: title: My API version: 1.0.0 openapi: 3.1.0 paths: /user: post: operationId: CreateUser requestBody: content: application/json: schema: $ref: "#/components/schemas/UserCreateRequest" required: true responses: "200": content: application/json: schema: $ref: "#/components/schemas/CreateUserOutputBody" description: OK default: content: application/problem+json: schema: $ref: "#/components/schemas/ErrorModel" description: Error ```

where enum is correct

name:
          default: FOO
          enum:
            - FOO
            - BAR
          maxLength: 32
          minLength: 1
          type: string

I expect the request with FOO or BAR to pass the validation.

The issue

main_test.go

func TestUserCreate(t *testing.T) {
    _, api := humatest.New(t, huma.DefaultConfig("Test API", "1.0.0"))
    addRoutes(api)

    _ = api.Post("/user", map[string]any{"name": "FOO"})
}

I get a 422 error with the correct FOO ( or BAR ) value on the request.

    main_test.go:14: Making request:
        POST /user HTTP/1.1
        Content-Type: application/json

        {
          "name": "FOO"
        }
    main_test.go:14: Got response:
        HTTP/1.1 422 Unprocessable Entity
        Connection: close
        Content-Type: application/problem+json

        {
          "$schema": "https:///schemas/ErrorModel.json",
          "title": "Unprocessable Entity",
          "status": 422,
          "detail": "validation failed",
          "errors": [
            {
              "message": "expected value to be one of \"FOO, BAR\"",
              "location": "body.name",
              "value": "FOO"
            }
          ]
        }
--- PASS: TestUserCreate (0.00s)

My question is it a bug or a wrong way to set Enum in TransformSchema? If I set it as strings s.Enum = []interface{}{"FOO", "BAR"} everything works as expected.

Thank you.

danielgtaylor commented 1 week ago

@superstas this is an interesting problem you ran into! First, you are very close to the fix. I think something like this should work:

s.Enum = []interface{}{string(NameTypeFoo), string(NameTypeBar)}

Now for the explanation and what we could do about it. When a request comes in to be validated it actually gets unmarshaled into any (typically a map[string]any but could be others too). The validation code runs on that and returns any errors, then if there are no errors it will unmarshal a second time into your Go input struct. Benchmarks show that this is actually faster than other ways of loading the data and it lets the validation code not need slow reflection calls to traverse your structs for most validation checks, making it super fast and low/zero allocation.

So on that first pass, you have a map[string]any{"name": "FOO"} and the validator runs the enum check to compare the string value FOO from your input against NameTypeFoo and NameTypeBar which are NameType. Since the types don't match, the equality check fails. Here's a simple example of that:

type NameType string

const NameTypeFoo NameType = "FOO"

func main() {
    // Note: value is `any` otherwise the compiler will see the mismatched types,
    // we need to trigger the check at runtime instead.
    var value any = "FOO"
    fmt.Println(value == NameTypeFoo) // prints false
}

https://go.dev/play/p/0pwZcS0Wy01

One possible way to fix this is to check if they are the same type, and if not, try to cast one to the other. In this case since NameTypes underlying type is string, the input can be cast for comparison. This would also work well for various numerical types as they can all be cast to each other.

The problem though is that check is a bit hard to do and slow (and may require reflection), so honestly I would rather users create enums using the expected incoming type (in this case string from the JSON) or we find a way to panic and alert the user this will probably not work as expected before startup. What do you think? Is it worth trying to dig into this further?

superstas commented 1 week ago

First of all, thank you, Daniel, for the quick reply and a very detailed explanation.

What do you think? Is it worth trying to dig into this further?

As for the digging into it. From DX's perspective, IMO, all these options are valid:

Since I expect the value of NameType in OAS to be a string type, I don't overthink whether the enum values are strings or string-based.

From DX's perspective, something like s.Enum.AddValue() would probably fix this issue by performing the casting you described above.

From the implementation point of view, TransformSchema is being executed on every call: ModelValidator.Validate -> registry.Schema -> registry.SchemaFromType -> TransformSchema ( if implemented ), which is costly.

I don't see a good and reasonable solution for now. Mentioning this possible issue in the Doc/GoDoc is probably the best option for now (until the number of similar problems/questions from users increases).