smithy-lang / smithy

Smithy is a protocol-agnostic interface definition language and set of tools for generating clients, servers, and documentation for any programming language.
https://smithy.io
Apache License 2.0
1.79k stars 215 forks source link

How to specify a constant HTTP header value in a Smithy model? #1002

Open glb opened 2 years ago

glb commented 2 years ago

I'd like to model an API operation that requires a specific header value to be provided; I'd prefer not to pass the complexity on to the API consumer.

In my mind this should go into the @http trait applied to the operation shape, but from the docs that's not supported.

Is there an existing trait that would be an appropriate way to do this?

@http(method:"POST", uri: "/", headers: { "api-version": "v1" })
operation DescribeThing {
  ...
}
glb commented 2 years ago

Something like this, perhaps, but really hoping it's already supported somehow and doesn't need to be added:

  /// Configures the HTTP bindings of an operation.
  @trait(selector: "operation")
  @tags(["diff.error.remove"])
  structure http {
      /// The HTTP method of the operation.
      @required
      method: NonEmptyString,

      /// The URI pattern of the operation.
      ///
      /// Labels defined in the URI pattern are used to bind operation input
      /// members to the URI.
      @required
      uri: NonEmptyString,

      /// The HTTP status code of a successful response.
      ///
      /// Defaults to 200 if not provided.
      code: PrimitiveInteger,
+ 
+     /// Static HTTP headers to add to the request.
+     headers: HttpHeaders,
  }
+ 
+ map HttpHeaders {
+     key: String,
+     value: String,
+ }
mtdowling commented 2 years ago

You can't right now, but this is a reasonable feature request.

glb commented 2 years ago

@mtdowling thanks! Does this approach make sense (adding a headers map to the @http trait)? If so, I could take a look at building it after I've finished the PR I'm working on for the last feature I requested. ๐Ÿ™‚

mtdowling commented 2 years ago

That's how I'd imagine this best being described per/operation. There are some things to consider like:

glb commented 2 years ago

Great questions! Here are some proposed answers:

  1. What's the behaviour if someone tries to bind multiple members to the same header / query parameter / payload member with the existing HTTP / JSON bindings? What checks and what precedence rules exist? Are there any?
  2. Does it make sense to follow the rules from RFC 7230? Or does it make more sense to relax the rules, accept any mapping of string:string (no nulls), take the attitude that a service modeller is going to make choices that make their API work, and give them the option to write a validator that lints out things that don't fit their local practices? My personal preference would be the latter.
  3. How about mapping the headers to a map of Header objects where the key is the header name and the schema is an enum with a single string value that is the static value? Codegen from the OpenAPI definition would probably not do the right thing, but the API behaviour would be well- and clearly-defined.
mtdowling commented 2 years ago
  1. What's the behaviour if someone tries to bind multiple members to the same header / query parameter / payload member with the existing HTTP / JSON bindings? What checks and what precedence rules exist? Are there any?

We don't allow conflicts today in header bindings, so I think at least starting with that stance here would make sense.

  1. Does it make sense to follow the rules from RFC 7230? Or does it make more sense to relax the rules, accept any mapping of string:string (no nulls), take the attitude that a service modeller is going to make choices that make their API work, and give them the option to write a validator that lints out things that don't fit their local practices? My personal preference would be the latter.

Enforcing at least parts of RFC 7230 makes sense to me. For example, header field names can't be empty, and we'd want to warn if a header name uses non-ASCII characters (even though the RFC allows for obs-text). For header values, we'd ensure they aren't empty strings, they aren't null, and also warn on non-ASCII. We'd likely want to warn if the header value contains commas too since 7230 basically punts on being able to generically parse HTTP headers without knowing the ABNF of every special header that uses commas.

  1. How about mapping the headers to a map of Header objects where the key is the header name and the schema is an enum with a single string value that is the static value? Codegen from the OpenAPI definition would probably not do the right thing, but the API behaviour would be well- and clearly-defined.

I think that makes sense. There's also the "const" keyword which is basically the same thing but would be more explicit for this case.

glb commented 2 years ago

We don't allow conflicts today in header bindings, so I think at least starting with that stance here would make sense.

Sounds good. Pointers on how / where this validation should be implemented would be super-helpful as I'm not familiar with the code yet and this would presumably imply cross-checking between the headers used in the trait and the headers used in the request shape.

Enforcing at least parts of RFC 7230 makes sense to me.

Makes sense. I'm going to hope that there's already validation for the HTTP binding of headers that I could re-use for the names part. I'm curious about the warning on commas, can you say more about why you'd want to emit a warning there? Is there a way to suppress warnings if the service modeller "knows what they are doing"? Also occurs to me: does it make sense to have a block-list of headers that you can't set, perhaps the same list as for the @httpHeader trait?

There's also the "const" keyword

Sounds perfect, and seems reasonable to expect an OpenAPI codegen to do the right thing with const. Adding a link for future reference.

glb commented 2 years ago

I think I've got the parsing figured out. Next step will be to look at consuming the parsed model in the OpenAPI projection.

re: validation: looks like HttpHeaderTraitValidator has a bunch of stuff that I should re-use. Is there a preference between:

  1. relaxing the private modifier on TCHAR and BLOCKLIST and refactoring validateHeader so that the logic can be re-used between the existing class and a new HttpTraitValidator in the same package
  2. creating a new utility class that HttpHeaderTraitValidator and HttpTraitValidator would re-use
  3. duplicating TCHAR, BLOCKLIST, and the relevant logic in validateHeader

I would lean towards option 1 (calling a static HttpHeaderTraitValidator.validateHeaderName method seems to make intent clear) but looking for your preference as maintainers.

I don't like option 3 at all as it seems like a maintenance issue waiting to happen, but again would respect your preference as maintainers.

I think I understand enough from the existing validators to make an attempt at conflict detection when I get to the validator.

glb commented 2 years ago

Going back to the original problem statement, I think I missed a key requirement: documentation. Using a simple map makes it impossible to have documentation for the headers; you get something like:

Generated OpenAPI (excerpt) ``` { "name": "api-version", "in": "header", "schema": { "const": "v1" } } ```

which is notably missing a description attribute.

It would be unfortunate if the way we model this makes it impossible to document these static header values.

New trait shape suggestion:

@http(method:"POST", uri: "/", headers: [ { name: "api-version", value: "v1", description: "The API version." })

or as a diff to the existing structure:

@@ -571,6 +571,31 @@ structure http {
     ///
     /// Defaults to 200 if not provided.
     code: PrimitiveInteger,
+
+    /// The constant HTTP headers to add to the request.
+    headers: HttpHeaders,
+}
+
+list HttpHeaders {
+    member: HttpHeader,
+}
+
+structure HttpHeader {
+    /// The header name.
+    ///
+    /// The name must be valid according to the rules in RFC 7230 and must not be
+    /// in the restricted headers list.
+    @required
+    name: NonEmptyString,
+
+    /// The constant header value.
+    ///
+    /// The value must not be an empty string.
+    @required
+    value: NonEmptyString,
+
+    /// The header description.
+    description: String,
 }

Thoughts?

adamthom-amzn commented 2 years ago

Seems fine to me, I'm guessing there should probably be some minimal constraints on the description if it's present. OpenAPI's restriction, if any, on the description might be a reasonable starting point.

glb commented 2 years ago

An alternative that also solves the documentation problem would be a map where the value is a structure:

@http(method:"POST", uri: "/", headers: { "api-version": { value: "v1", description: "The API version." } })

Any preferences?

glb commented 2 years ago

Got what I was looking for:

                    {
                        "name": "api-version",
                        "in": "header",
                        "description": "The API version.",
                        "schema": {
                            "const": "v1",
                            "description": "The API version."
                        }
                    }

not sure why the double description, but that's what I'm seeing other things do so I'm going with it for now.

Initial implementation uses the map:

 map HttpHeaders {
     key: NonEmptyString,
+    value: DocumentedHttpHeader,
+}
+
+structure DocumentedHttpHeader {
+    /// The constant header value.
+    ///
+    /// The value must not be an empty string.
+    @required
     value: NonEmptyString,
+
+    /// The header description.
+    description: String,
 }

but I think I understand enough to switch it to a list if people would prefer that.

glb commented 2 years ago

I think I've got most things working for this, just need to implement the validator per @mtdowling 's comments earlier:

We don't allow conflicts today in header bindings, so I think at least starting with that stance here would make sense.

[W]e'd want to warn if a header name uses non-ASCII characters (even though the RFC allows for obs-text). For header values, we'd ensure they aren't empty strings, they aren't null, and also warn on non-ASCII. We'd likely want to warn if the header value contains commas too since 7230 basically punts on being able to generically parse HTTP headers without knowing the ABNF of every special header that uses commas.

Will open a draft PR for more detailed discussion once that's done.

mtdowling commented 2 years ago

Awesome! I think the map is fine and will be a bit more concise. Should we name this requestHeaders since we only expect these headers in requests and not responses?

Another option to throw out -- we're very likely going to expand mixins to more shapes, including operations. It might be nice to do something like this to not have to repeat the headers on every operation:

@mixin
@httpStaticRequestHeaders("api-version": { value: "v1", description: "The API version." })
operation MyServiceOperation {}

operation PutFoo with MyServiceOperation {
    // ...
}
glb commented 2 years ago

Awesome! I think the map is fine and will be a bit more concise. Should we name this requestHeaders since we only expect these headers in requests and not responses?

Makes sense, will do ๐Ÿ‘

Another option to throw out -- we're very likely going to expand mixins to more shapes, including operations. It might be nice to do something like this to not have to repeat the headers on every operation:

@mtdowling I do like the reduced duplication this offers. I am torn by the implication that this implies creating a separate trait @httpStaticRequestHeaders and plumbing it throughout, something I don't yet know how to do, rather than extending the @http trait, something I just painfully learned how to do and which seems thrillingly close to being done (I'd even updated the docs), placing me within hours of a workingโ€  client (codegen for appropriate middleware to implement static headers is my next and I think final-ish step).

โ€  "working" as in it would make correct-ish requests, modulo some annoying behaviours that I need to track down and probably raise issues to discuss once I understand them.

That said, it will be some time before my side project is anything but a technology demonstration, so taking the time to get the modelling right is better than building something quickly that's not as useful. I can still build my demo client with what I've done locally and circle back to rebuild this as a separate trait if that's the right path.

What I'm hearing is an evolution of the thought process here, that we would not want to extend the @http trait and would rather implement a @httpStaticRequestHeaders trait, perhaps something like this:

@trait(selector: "operation")
map httpStaticRequestHeaders {
    /// The header name.
    key: NonEmptyString,

    /// The static header value and its associated description.
    value: DocumentedHttpHeaderValue,
}

structure DocumentedHttpHeaderValue {
    /// The constant header value.
    ///
    /// The value must not be an empty string.
    @required
    value: NonEmptyString,

    /// The header description.
    description: String,
}

which once mixins are introduced would allow for a mass deletion of duplicated traits. All of the requirements and behaviour we'd discussed would transfer to this new trait.

It makes sense. Have I played back your thoughts accurately?

glb commented 2 years ago

Naming question: is there a preference for "static" headers vs "constant" headers? I'd lean towards "constant" these days.

mtdowling commented 2 years ago

Sorry for the delay! (holidays, got sick, better now).

Yeah, your summary is correct, and I like your proposed trait. (One minor point is that I don't think we've ever used "description", but rather always use "documentation")

What I'm hearing is an evolution of the thought process here

Exactly. The reduced duplication from mixins is going to be a huge DX improvement, and we should start thinking more about mixins as we design new traits. This means that we should prefer more granular traits like @httpStaticRequestHeaders.

Naming question: is there a preference for "static" headers vs "constant" headers? I'd lean towards "constant" these days.

I don't have a strong preference here, and I polled the Smithy team, and nobody else has a strong preferece. I'll defer to you to make this call :)

glb commented 2 years ago

Thanks @mtdowling! Happy new year, glad you're feeling better.

When I can get back to this I'll proceed with documentation instead of description and constant instead of static.

Any pointers on how to create a brand-new trait and plumb it through end-to-end (just in this repo, not all the way through codegen) would be greatly appreciated; I'll dig through old PRs but a checklist / README would make this process significantly more straightforward.

mtdowling commented 2 years ago

Awesome! First you define the trait in the prelude: https://github.com/awslabs/smithy/blob/main/smithy-model/src/main/resources/software/amazon/smithy/model/loader/prelude.smithy. Then you define the trait in code (like this one). Then you register it with Java's SPI (like this). Then write a simple test to make sure it's wired up correctly (like this). Then we usually write trait docs (like this), and test that by running make html in the docs directory. Then there's the OpenAPI converter... I think we'd need to update this class: https://github.com/awslabs/smithy/blob/main/smithy-openapi/src/main/java/software/amazon/smithy/openapi/fromsmithy/protocols/AbstractRestProtocol.java, plus add some compliance tests here: https://github.com/awslabs/smithy/blob/main/smithy-openapi/src/test/java/software/amazon/smithy/openapi/fromsmithy/protocols/AwsRestJson1ProtocolTest.java. That should be a good starting point, but let us know if you have specific questions.

glb commented 2 years ago

That should be a great start, thanks @mtdowling ! Are those steps documented outside your comment or would it make sense for me to include a Markdown file in my PR capturing the steps for the next person to come along?

kstich commented 2 years ago

Only the process of defining the trait in a model is outlined, that's done so here in the spec. We'd gladly review anything you capture during your development as we plan to build a more full fledged guide.

In addition to what @mtdowling outlined above, this warrants a validator to emit events for conflicts with @httpHeader and @httpPrefixHeaders. The validator for @httpPrefixHeaders is here and would be registered with the validation SPI (like this.)

glb commented 2 years ago

Thanks @kstich !

glb commented 2 years ago

@mtdowling earlier you mentioned that mixins for operations are likely coming.

I seem to recall something about mixins not merging traits well; I don't know if that's still the proposal but it came to mind as a potential issue.

I'm sure you're familiar with some AWS services that take an x-amz-target header that holds the operation name, SOAP-style.

Building on the example you shared earlier, I was wondering about the feasibility of this, assuming that httpStaticRequestHeaders (pre-rename) trait were extended to support merging:

@mixin
@httpStaticRequestHeaders("api-version": { value: "v1", description: "The API version." })
operation MyV1ServiceOperation {}

@httpStaticRequestHeaders("x-amz-target": { value: "PutFoo", description: "The operation being performed."})
operation PutFoo with MyV1ServiceOperation {
    // ...
}

or would you have them define a separate trait that also sets a constant header, something like:

@amzTarget(operation="PutFoo")
operation PutFoo with MyV1ServiceOperation {
  // ...
}

To me it feels like it would be good if traits could be merged, so that folks don't need to know the exact implementation details of mixins. It feels a little fragile that someone could do something like

@httpStaticRequestHeaders("x-amz-target": { value: "PutFoo", description: "The operation being performed."})
operation PutFoo with MyV1ServiceOperation {
    // ...
}

not realizing that MyV1ServiceOperation also brings in the constant header trait, and breaking the model.

This is probably not an issue specific to this trait, happy to split it off to a separate issue about mixin behaviour and the possibility of traits having merge strategies.

eduardomourar commented 2 years ago

@mtdowling , wouldn't make more sense to allow constant values altogether (as defined in JSON Schema draft 2020-12)? So they can be added as part of any shape and also for documentation purposes (e.g. in the OpenAPI definition).

glb commented 2 years ago

@eduardomourar do you have any more thoughts / details on what that would look like in this use case?

eduardomourar commented 2 years ago

@glb, when I first started using Smithy I expected to do something like this:

/// Some description to be added to the constant
@const("v1")
string ApiVersionV1

@http(method:"POST", uri: "/")
operation DescribeThing {
  @httpHeader("api-version")
  apiVersion: ApiVersionV1,
  ...
}

As you can see, if the const trait were available in Smithy IDL, we would be able to have constant values anywhere we want not only in headers.

mtdowling commented 2 years ago

This PR is about static HTTP request headers specifically, and not about more general constant values like what's in JSON Schema. The key difference is that static HTTP request headers are not part of the end-user contracts generated by Smithy (they're implementation details of the serialization format) whereas something like constants would be. I haven't heard a strong use case for more general constant values in structures, and I think they come with big tradeoffs. For example, poor integration with structurally typed languages like TypeScript (unless we added constructors to fill in these constant value, because we'd likely be forcing the constant to be reified in code to satisfy the type checker), and a backward compatibility risk for service teams that might need to later change the constant value.

As for the x-amz-target example: this header is an implementation detail of a protocol that shouldn't be modeled at all. It's not something a client would send when you build a client for awsRestJson1, and sending it when unnecessary is problematic for services that might support both protocols simultaneously.

Regarding mixin trait merging: they don't support trait merging today, no. It sounds like a lot of complexity to me if we had configurable strategies per mixin (like override vs merge), and would complicate reasoning about the model itself + implementations. If the site at which a mixin is merged into a shape controlled the override behavior, then I worry we'd need a syntactic change to the with statement. I suppose you could define in the @mixin trait a trait merging strategy but I'd be worried about people constantly forgetting to apply it when they actually intended to. Any other good options you had in mind?

glb commented 2 years ago

@eduardomourar that's kind of what I expected too ๐Ÿ™‚

@mtdowling your last comment, specifically the bit about how x-amz-target shouldn't be modelled at all, has been enlightening. I'm wondering if I have been thinking about this all wrong. I think I've been clear from the beginning that my primary motivator here has been to model an API version header that our APIs use. I wonder if I should be contemplating a parameterized protocol trait instead of this httpConstantRequestHeader trait, probably implemented as a subclass of aws.protocols#restJson1, that isn't part of the core Smithy code base and has its own codegen to implement the behaviour I need.

I'd probably model this with something like:

use example.protocols#versionedRestJson

@versionedRestJson(version: "v1")
service ExampleV1Service {
   ...
}

I've discovered that there's an example that is basically exactly this in the protocol traits documentation:

@protocolDefinition
@trait(selector: "service")
structure configurableExample {
    @required
    version: String
}

@configurableExample(version: "1.0")
service WeatherService {
    version: "2017-02-11",
}

Thoughts?

mtdowling commented 2 years ago

If you're defining a static value across the API, then it's fair to reuse the restJson1 protocol and use a header. The x-amz-target header is dynamic though and already needs a dedicated protocol implementation.

I think another way to address your concern is to allow httpStaticRequestHeaders on a service shape. Then those headers are merged with any static headers on each operation bound to the service (headers on operations would either override headers on a service or maybe conflicts wouldn't be allowed). It's a little extra bookkeeping when doing stuff like codegen, but helps ensure that any headers meant to be applied across all operations aren't omitted.

glb commented 2 years ago

@mtdowling got it, thanks. I like the idea of supporting this at both levels. I'd suggest semantics of "if you specify the same header name on a service and an operation then the operation will take precedence". I don't like the idea of it being impossible to override the default service value for an operation, and I expect that folks won't accidentally add the trait to an operation. Thoughts?

syall commented 1 year ago

Hi @glb, did you find a solution, via protocol trait or other avenue, or do you still need this feature request open?

glb commented 1 year ago

Hi @syall thanks for checking in! I didn't find a solution as I timed out looking at this activity last year. I would definitely still like to have a capability for modelling an API with a constant header as part of the core Smithy feature set, especially with the expanding set of SDK code generators, but I'm not able to contribute an implementation at this time ๐Ÿ˜ž

I would prefer not to have to build a new protocol binding for each code generator to get this capability, though if the position of the maintainers is that a protocol binding is the right way to model this I'd understand.