UUID type support - Githubissues

gentges commented 2 months ago

Motivation

While designing an API recently I decided to utilise this OpenAPI generator for the first time. During the process of connecting the generated code to my existing data models and server logic I stumbled across the conversion between String and UUID. Since UUID is a type I regularly use and that conforms to Codable I wondered why the generated code was not making use of the Swift type and ignored the format specifier I supplied in my openapi document.

As of my understanding the OpenAPI specification supports all types and format specifiers defined in the JSON Schema Specification. The latter includes a format for uuid. The support of native UUID types would remove the need to convert between String and UUID. Thus boilerplate code for handling the conversion and potential failures would be removed from the implementation side.

Modifications

UUID was added to the builtin types and the type matcher was updated accordingly. Also the constants for the file import where updated to import Foundation.UUID.

Result

When specifying the uuid format in an openapi doc the generated Swift code will utilise the Foundation UUID type in the generated code.

my_property:
  type: string
  format: uuid

var myProperty: Foundation.UUID

Test Plan

The type matcher test was updated to include a check for uuid. The reference files were updated to include the import statement for Foundation.UUID and use UUID for the request UUID. Therefore the PetstoreConsumerTests were updated to send and assert such request uuid. Updated the petstore definition and reference tests to use uuid for the response id as well.

simonjbeaumont commented 2 months ago

gentges commented 2 months ago

@czechboy0 Took me a bit to find some time to work on this.

I have added the feature flag as requested, please have a look at. I looked at older PRs and tried to do something similar, though UUID being a builtin type had me adjust some calls around the type matching in order to pass down the feature flag status.

czechboy0 commented 1 month ago

Hi @gentges,

thanks, this looks great. But before we land it, I think we should make it easier to pass config options (such as feature flags, needed in your case) down into TypeAssigner/TypeMatcher, which will be a little invasive, but will make similar changes much cleaner (and would allow your PR to also produce a much smaller diff).

This is something that's come up a few times recently, so I think this is the time to do it.

Let me investigate that next week and get back to you here.

czechboy0 commented 1 month ago

Ok, once #640 lands, you should be able to update your PR on top of it, and hopefully it'll result in much less change.

The idea would be that you move the enableUUIDSupport flag onto the new TranslatorContext, which is already propagated to all the right places.

czechboy0 commented 1 month ago

Ok @gentges now you can update from main and add a Bool property on the new TranslatorContext, should hopefully make things easier.

czechboy0 commented 1 month ago

@simonjbeaumont this PR is basically ready to land according to the original plan of just having a feature flag that switches between the old and new behavior.

You mentioned we should explore if there's a way to avoid the feature flag. Do you have ideas here? I think it's important to keep in mind that when using the UUID type, we want to catch any malformed strings during deserialization, rather than at access time (eg if we had an extra computed property). But maybe that can differ between old vs new behavior, and part of this might still be opt in.

gentges commented 1 month ago

Hi @czechboy0,

I fixed the UUID strings as requested. The addition of the TranslatorContext really simplified the implementation of the feature flags. 👍🏻

simonjbeaumont commented 1 month ago

You mentioned we should explore if there's a way to avoid the feature flag. Do you have ideas here? I think it's important to keep in mind that when using the UUID type, we want to catch any malformed strings during deserialization, rather than at access time (eg if we had an extra computed property). But maybe that can differ between old vs new behavior, and part of this might still be opt in.

I wasn't thinking we would offer a computed property for the UUID, but rather a computed property for the string, for backwards compatibility and the feature flag could opt folks into using the UUID only.

The idea here is that we can capture the passive adopters attention somehow, but providing them a backwards compatible change with instructions on how to adopt it and move to the typsafe-only world.

For example:

// what we have today
struct S {
  var u: String
}

// proposed: without feature flag
struct S {
  var uUUID: UUID  // need to think about the naming here

  @available(*, deprecated, ...)  // use this message to encourage people to cut-over by adding the feature flag
  var u: String { uUUID.uuidString }
}

// proposed: with feature flag
struct S {
  var u: UUID
}

czechboy0 commented 1 month ago

Oh, I like that. A few more thoughts:

This would technically change runtime behavior, as if an invalid UUID string was received, now we'd fail parsing - but I think that's good, as if the OpenAPI doc constrains a field, we should be able to enforce that constraint without a feature flag. (Do you agree? Does anyone disagree here?)
We'd need to duplicate the initializer of any struct with a UUID property, so that we have both a non-UUID initializer, and a UUID initializer. With the feature flag, we can generate only the UUID initializer.
Elephant in the room: since this backwards compatibility dance will be non-trivial, should we try to batch the UUID change with other type-safe scalars, such as URL? Any other we have in mind? It'd allow us to only do this once, "before typesafe scalars" and "after typesafe scalars", instead of doing this for each new type, which would then explode the number of initializers we need to generate for backwards compatibility.
Finally - this now sounds involved enough to warrant more scrutiny, I think we need a proposal with a prototype. @gentges, would you be interested in driving this with our guidance? We just helped two proposals SOAR-0011 and SOAR-0012 in a similar way, and this could be SOAR-0013, describing the strategy for adding typesafe scalars, and adding a few of them (UUID, URL, investigate what other ones make sense).

simonjbeaumont commented 1 month ago

This would technically change runtime behavior, as if an invalid UUID string was received, now we'd fail parsing - but I think that's good, as if the OpenAPI doc constrains a field, we should be able to enforce that constraint without a feature flag. (Do you agree? Does anyone disagree here?)

A more general question about what happens when validation of this nature fails: what happens on the client side if the server sends something that isn't correct? Will this get transformed into an undocumented response? We need for users of generated clients to be able to dig themselves out of a hole if the server (which they may not be able to influence) is misbehaving?

I'm all for the validation being there, but I'm less convinced on the type itself.

my_property:
  type: string
  format: uuid

In this above example the type is string. It says that it agrees to follow a format, which we can validate, but should we go as far as having that be the type of the stored property. Should we instead offer a computed property over the string? Your earlier message implied that a computed type-safe property was incompatible with validation, but I don't understand if that's true.

I'm aware that I'm opening up more questions here and I'm conscious of your time @gentges, thanks for your patience while we get this right :)

czechboy0 commented 1 month ago

We already have precedent here:

my_property:
  type: string
  format: binary

is generated as HTTPBody (in the places where we allow binary data, such as in multipart and top level request/response bodies).

And

my_property:
  type: string
  format: byte # OpenAPI 3.0, spelled as contentEncoding: base64 in OpenAPI 3.1

we generate as Base64EncodedData.

And

my_property:
  type: string
  format: date-time

we generate as Date.

In the latter two cases, we also could have left it as a string, and offered a Data/Date "view", but we opted for a stored property.

We also get the benefit of validating the data once, during decoding. If we move to a computed property, we re-validate on every access, as we don't have anywhere to cache the result. So that's a performance consideration.

Personally, I read hte type: string with a custom format/contentEncoding less as "this is a Swift.String" and more as "when sending this over JSON, we store it in a JSON string, but it's a more constrained format". So I think if we have a type-safe representation in the language, which already knows how to validate itself, conforms to Codable, and leads to only validating once, I'm happy to use it.

I'd feel differently if somehow the validation rules dictated by OpenAPI differed from the validation implemented in the Swift type, but I'm not aware of that being the case here.

czechboy0 commented 1 month ago

I'll mention one other related topic, but it doesn't necessarily need to be bundled into this work.

Validation patterns, such as regular expressions on strings, e.g.:

ssn:
  type: string
  pattern: '^\d{3}-\d{2}-\d{4}$'

and

age:
  type: integer
  minimum: 0
  maximum: 199

It's just another type of validation, and we should think about how we want to represent that. If that's using a special type/generic wrapper, we'll have to answer pretty much the same questions re backwards compatibility and migration.

simonjbeaumont commented 1 month ago

A more general question about what happens when validation of this nature fails: what happens on the client side if the server sends something that isn't correct? Will this get transformed into an undocumented response? We need for users of generated clients to be able to dig themselves out of a hole if the server (which they may not be able to influence) is misbehaving?

And do you have any thoughts on this one?

czechboy0 commented 1 month ago

A more general question about what happens when validation of this nature fails: what happens on the client side if the server sends something that isn't correct? Will this get transformed into an undocumented response? We need for users of generated clients to be able to dig themselves out of a hole if the server (which they may not be able to influence) is misbehaving?

And do you have any thoughts on this one?

I think our current model of "it throws an error", and the workaround of "build a middleware that fixes up the response/request" has worked well so far? Or do you think we need something more in this case?

czechboy0 commented 3 weeks ago

More info: https://spec.openapis.org/registry/format/index.html

apple / swift-openapi-generator

UUID type support #627

Motivation

Modifications

Result

Test Plan