apple / swift-openapi-generator

Generate Swift client and server code from an OpenAPI document.
https://swiftpackageindex.com/apple/swift-openapi-generator/documentation
Apache License 2.0
1.23k stars 89 forks source link

Support URL form encoded request bodies #182

Closed benfrearson closed 8 months ago

benfrearson commented 9 months ago

There may be a legitimate reason why this isn't already implemented, or perhaps I'm not using the generator for its intended use case - if so, I'd appreciate some clarity.

It appears to me that URL form encoded requests have their data structure erased in the generated Input.body type, which blindly expects Data.

# openapi.yaml
paths:
  /login:
    post:
      tags:
        - Auth
      summary: Login
      description: Login
      operationId: login
      requestBody:
        required: true
        content:
          application/x-www-form-urlencoded:
            schema:
              type: object
              required:
                - device_name
                - email
                - password
              properties:
                device_name:
                  type: string
                email:
                  type: string
                password:
                  type: string
//Types.swift

//...

/// API operations, with input and output types, generated from `#/paths` in the OpenAPI document.
public enum Operations {
    /// Login
    ///
    /// Login
    ///
    /// - Remark: HTTP `POST /login`.
    /// - Remark: Generated from `#/paths//login/post(login)`.
    public enum login {
        public static let id: String = "login"
        public struct Input: Sendable, Equatable, Hashable {

           // ...

            @frozen public enum Body: Sendable, Equatable, Hashable { case binary(Foundation.Data) }
            public var body: Operations.login.Input.Body
            /// Creates a new `Input`.
            ///
            /// - Parameters:
            ///   - path:
            ///   - query:
            ///   - headers:
            ///   - cookies:
            ///   - body:
            public init(
                path: Operations.login.Input.Path = .init(),
                query: Operations.login.Input.Query = .init(),
                headers: Operations.login.Input.Headers = .init(),
                cookies: Operations.login.Input.Cookies = .init(),
                body: Operations.login.Input.Body
            ) {
                self.path = path
                self.query = query
                self.headers = headers
                self.cookies = cookies
                self.body = body
            }
        }
// ...
}

The issue here is that using the corresponding Client method now is unaware of the required url form fields. This is obviously fairly easy to work around, but it obfuscates the expected data from the users of the Client in a way that using a JSON encoded body does not.

It would be better if the generated Input type could handle encoding form bodies itself, such as by:

I would be happy to contribute to this issue, but I'd like to confirm that this is something that is needed/wanted and aligns with the project's direction.

czechboy0 commented 9 months ago

Hi @benfrearson, welcome to the project! Thanks for taking the time to file the issue with so much great detail!

You're right, Swift OpenAPI Generator doesn't yet support URL encoded form bodies as a structured type, but it's definitely something we'd accept a PR for! It just hasn't been implemented yet as we're still working through other foundational OpenAPI features.

Since adding support for a new kind of a structured body will not be a trivial amount of work, the best way might be for you to try to prototype it (even including some shortcuts) and open a draft PR that we can then discuss the approach in. I don't want you to spend time polishing a PR until we confirm that the overall approach is likely to be accepted.

If this sounds good to you, I'll be looking forward to your PR and if you have any questions, you can ask under this issue.

Thanks! 🙏

czechboy0 commented 9 months ago

Also, this piece of documentation could be helpful, you'll potentially need to expand the set of helper functions: https://swiftpackageindex.com/apple/swift-openapi-generator/0.1.8/documentation/swift-openapi-generator/converting-between-data-and-swift-types

benfrearson commented 9 months ago

Thanks! Yes, I was anticipating using the Converter to handle this. How does this sound as a general approach from the Client interface side:

Assumptions:

Considerations/scope:

I think for a first-pass at this, we should focus on conventional url form key-value pairs while maintaining the existing implementation. I assume this would cover a large percentage of endpoints that use url forms. We could extend this in the future with helpers or options for other encoding types.

czechboy0 commented 9 months ago

Thanks for writing up the assumptions and considerations, that's really the main thing we have to agree on, the implementation should hopefully most flow out of that.

The majority of URL form encoding uses a flat key-value pair structure

That might be the case, but let's first explore what it'd take to support all the payloads that RFC 3986 covers (I'm assuming URL encoded forms with the application/x-www-form-urlencoded content type are using it for the coding rules).

I'd like us to see if we can use the Codable implementations that are already generated for all schemas in the OpenAPI document, and use a custom, URL encoded form-based Encoder/Decoder (similar to how we use JSONEncoder/JSONDecoder with application/json and similar content types today). That way, the generated Codable types wouldn't be coupled to their serialization format, which is the beauty of Codable, and in our case, could scale from just JSON to JSON and URL encoded forms, and possibly in the future to XML (which is also supported by OpenAPI). So I'd prefer if the generated Codable types didn't know anything about the content types used to serialize them.

The benefit of this is that we wouldn't limit support to flag key-value pairs, but we'd support any arbitrary content that can be describes with JSON schema and encoded as application/x-www-form-urlencoded.

If we find this is too difficult to achieve, then we can scale back and talk about only supporting a subset.

Provided schemas may not cover all possible keys, and must allow for arbitrary values

Can you say more about this? Today, you can specify additionalProperties: true in a type: object schema, and we'll generate a dynamic bag of key/value pairs that weren't documented and provide it to you. If you don't specify it in the OpenAPI doc, we only generate a struct that has the documented properties, and throw away any other keys (standard JSON approach to parsing). Do you think this would be sufficient for URL encoded forms as well, or do you think we'd have to diverge from the approach we take with JSON?

Some endpoints may not follow the convention and encode data in an unexpected way. E.g.: nested data types in schema (possibly expecting stringified data)

Not sure what you mean by this, can you provide an example?

repeated key values (e.g. tag=article&tag=sports)

If we use the custom Encoder/Decoder approach, and find multiple values for a non-array type (as for arrays, this would be the standard way to encode it when explode: true), we could either use the first, the last, or throw an error. I think JSONDecoder picks the first/last (I don't remember), but doesn't throw. Whenever unsure, I'd be happy to follow JSONDecoder's approach.

Must provide an interface of known (required and optional) form fields

For sure, hopefully how we generate Codable structs today would be sufficient. It works well for JSON, I don't currently see why it wouldn't be enough for URL encoded forms as well.

Must allow for arbitrary fields not included in the schema

Can you clarify what you mean by "must allow". It could mean: A. safely ignores arbitrary fields, but throws them away during decoding B. safely ignores arbitrary fields, but keeps them around in a private storage property and would encode them again if asked to encode C. safely ignores arbitrary fields, but keeps them around in a public storage property which the adopter can freely read/write from/to

Currently with JSON, we do A when additionalProperties isn't explicitly specified, and C when additionalProperties: true/<schema>. Unless we feel strongly that URL encoded forms are used differently by users, we should just do the same here.

Must not break current implementation (keep the current implicit initializer: .binary(Data))

This isn't a requirement! We're still pre-1.0, and we already have a few breaking API changes queued up behind feature flags that we'll enable unconditionally in 0.2.0 (next breaking version). I think that with support of application/x-www-form-urlencoded bodies, the body enum case should be .urlEncodedForm(<a name of a generated Codable Swift struct here>) (this name is proposed in SOAR-0002, so still subject to change until the proposal is approved). That way, we'd pull URL encoded form bodies to an equal peer to JSON, instead of an add-on that doesn't get all the benefits of type safety etc.

So don't worry about backwards compatibility too much. If it's not-API breaking, it can land in the current minor release. If it's API breaking, we'd land it disabled behind a feature flag and enable in the next minor (API-breaking) release. Either way, the goal here is to make the support as good as possible for adopters, not to retain an API that we are continuing to evolve. 🙂

Should consider unconventional uses of form data (though arguably these are covered by the current implementation) nest data types might have an option to stringify or use dot-syntax to encode

Please elaborate on this as well, ideally with an example, I haven't worked with URL encoded forms very much yet.


To summarize: let's try for putting URL encoded form support on the same level as JSON by trying to build a custom Encoder/Decoder and use the existing generated Codable types. Backwards API compatibility is a non-goal pre-1.0, so let's design the best solution, and whether it's API breaking only affects when exactly it lands.

If all this sounds good to you, I think a good next step would be a prototype of this approach to discover all the unknown unknowns.

benfrearson commented 9 months ago

On the queries about arbitrary field names: I thought it feasible—but probably very bad practice—that you could design an endpoint which would accept any field name, regardless of whether it's included in a schema.

For example, you might have a api/customAttributes endpoint where a user could submit any field names they wanted:

customFieldName1=true&customFieldName2=false

Again, I don't have any concrete examples of this, and it would be fairly odd imo, but I think it would be possible to write an endpoint that did this (however weird)...


We can absolutely aim for parity with JSON support, and that would be preferable in the particular instance that drew me here: I agree that we want as much type safety as possible.

My only hesitation with this approach is that we're making the assumption that every form request is equally as uniform as a JSON object.

I'm happy to move ahead with this approach though and see if it causes any actual issues. The edge cases I suggested are feasible, but perhaps minimal, and it may be good to get the ball rolling with covering the best practices first.

As an example of things assuming a certain conformance in this space, Postman rigidly won't accept anything other than key-value pairs for url forms, so I would assume that means most people don't try and do weird stuff with them:

image

czechboy0 commented 9 months ago

For example, you might have a api/customAttributes endpoint where a user could submit any field names they wanted

That's actually already supported in JSON Schema:

type: object
additionalProperties:
  type: boolean

That will generate you a [String: Bool] in Swift, and will collect the unknown keys which all have Boolean values, like customFieldName1=true&customFieldName2=false.

The only thing to keep in mind is that the author of the OpenAPI document has to explicitly specify it, but I think that's reasonable and so far nobody has complained about that behavior for JSON.

benfrearson commented 9 months ago

ok great! that answers that question then! 👍🏻

czechboy0 commented 9 months ago

Thinking about this more, creating a proper implementation of a URL form encoder and decoder (for Codable types) could have pretty substantial positive consequences for this project, as (with slightly different flags enabled/disabled), it could be used both for application/x-www-form-urlencoded bodies, and for encoding/decoding complex query items and headers themselves, and it's scale to the full support of JSON Schema-representable types, not just individual types we support today (through the _StringConvertible protocol and a bunch of manually written overloads in the runtime library).

To that end, I'd suggest we focus on that problem first - can we build a URLFormEncoder and URLFormDecoder types, which have a few options that allow us to use them both in query items and in bodies (such as whether arrays use indexes or not: key=value1&key=value2 vs key[0]=value1&key[1]=value2, whether spaces are encoded as + or %20, and possibly a few more)?

Once we have a decent encoder and decoder in the runtime library, including unit tests, we can discuss how to a) add support for the new content types in the generator itself (should be straightforward at that point) and b) how to migrate the existing query item and header encoding logic over to this new infra (can be done separately, but should result in a bunch of code removed).

This might take a few PRs, so we could introduce it in the runtime library under @_spi(URLFormCoding) while we work on it, to make sure no adopters start relying on it prematurely, and build it up from there.

I took a brief look at Foundation's JSONEncoder/JSONDecoder implementation and I think they have a good structure we could get inspired by, with the difference that the URLFormEncoder/Decoder should be even simpler, as we really only have 4 fundamental types (string, array, dictionary, and null), whereas JSON has a few more.

How does this approach sound?

czechboy0 commented 9 months ago

Splitting the creation of the new type into a separate issue, to split it from actually supporting the new content type in bodies: https://github.com/apple/swift-openapi-generator/issues/192

bfrearson commented 9 months ago

I've just put up a first pass attempt at implementing a rudimentary version with hardcoded converters, but I fully agree that having a custom encoder is preferable and something that we can work on next. I'm happy to continue with that.

I have a few more thoughts, but will add these to #192

czechboy0 commented 9 months ago

An update from my side: the URI coder will be landing in https://github.com/apple/swift-openapi-runtime/pull/41. Once that lands, I'll move all of the parameters to it (already WIP), deprecate a bunch of code, and we'll release a breaking 0.2.0 hopefully some time next week. After that's released, all the pieces should be in place for you to pick this work up again, and add URL form encoded body support. I'll let you know when it's ready, I'd like to avoid creating a bunch of merge conflicts in on your branches, so it's better to wait.

bfrearson commented 9 months ago

Sweet, thanks! I've been away for the past week, but I'll take a look at this later in the week.

czechboy0 commented 8 months ago

Landed behind the feature flag urlEncodedForm, soon to be released in the next 0.2.x release.

czechboy0 commented 8 months ago

Released in 0.2.3 of the generator, behind the urlEncodedForm feature flag.