smithy-lang / smithy-dafny

Apache License 2.0
9 stars 8 forks source link

Better alignment of errors in Polymorph with Smithy semantics #450

Open robin-aws opened 3 months ago

robin-aws commented 3 months ago

One of the reasons errors are modeled in Smithy is so that customer code in any programming language can handle at least some errors programmatically. This includes being able to test for a specific type of error, and possibly reading structured values out of it.

Many programming languages will use polymorphism to support error handling in general. In Java, for example, you can throw and catch any subtype of Exception, and test if the exception you’re holding onto is a specific kind with something like e instanceof ThrottlingException. But Smithy does not support arbitrary polymorphism by design, and it is not idiomatic in all languages to define or use an open polymorphic error type.

Dafny is similar to Rust in this way: it has no special built in error type, and no universal top type, so it isn’t possible to hold onto an arbitrary “error value”. smithy-dafny currently generates a single top-level service-specific Error type that can represent any error from the whole service, and any dependent service in the case of @localServices. This includes an OpaqueError(error: object) data constructor, which is perfectly reasonable as any client or service or polymorph library needs to be able to represent unmodeled errors somehow. But automatically making any error from any service in the whole dependency graph part of the API doesn’t align with Smithy semantics, and therefore complicates the Polymorph implementation for each supported language and hurts reuse of existing smithy-<lang> tools. It is especially a problem for @extendable resource operations, because new implementations should not be able to add more error possibilities.

Therefore I propose the following changes:

  1. We clarify the semantics of Polymorph to align with the Smithy specification: that only the errors listed in an operation’s errors (or indirectly in the service’s common errors list) can be handled programmatically as structured result values in each target language. We call any other kind of error “unmodeled errors”, even though the underlying error value may correspond to some other Smithy @error shape.

    Unmodeled errors MAY have to be wrapped as an unhandled error of some kind, i.e. as a different type than how it would be represented as a modeled error for a given operation. ~We only make a best effort to include as much detail as possible when the error is displayed/logged/etc.~ Edit: We only ensure that the error is retained and possible to log or otherwise extract programmatically somehow, just not in a clean and consistent way across programming languages.

    1. This allows new support for languages like Go and Rust to be more idiomatic and avoid having to handle the structure of the Dafny service-wide Error type.
  2. The Dafny interface for services currently uses the single service-specific Error type everywhere, for both SDKs and local services. We will keep this type for use in implementations, but also generate per-operation error subset types and use them in the operation signatures instead, to more accurately represent Smithy semantics. That is, the signature returns (output: Result<FooOutput, Error>) will become returns (output: Result<FooOutput, FooError>) , where FooError is a subset type with Error as a base type, restricting to only the explicitly declared error shapes. This is thankfully not a breaking change since it's only making a return type more specific.

    1. The generated code connects the abstract MyService.Foo to MyServiceOperations.Foo will call a generated method to convert a Error to a FooError by converting all variants that aren’t declared in the operation’s errors to OpaqueError values instead.

      This will require an extern to wrap an arbitrary Dafny value as a Dafny object, which isn’t generally possible since not all Dafny values are reference types. If possible we should even modify OpaqueError to hold onto a completely opaque extern type declared in the smithy-dafny StandardLibrary instead of object.

    2. We will also modify the conversion logic to wrap target language interfaces back into Dafny (“Wrapped services”) to wrap up unmodeled errors as OpaqueError. This is important to ensure wrapped service tests cannot require error handling behavior that cannot/should not actually be supported in a given language, such as testing for error.DependencyA.DependencyB.SomeError? (as we do here for example).
  3. Since existing polymorph library operations have been under specifying the set of errors, we can incrementally add the key error shapes that customers will commonly want to handle programmatically to their Smithy models.

    1. Note that converting an umodeled error to a modeled error for a given operation is not considered a breaking change, since it doesn’t change when an error occurs, just adds the ability to programmatically handle it better. For many target languages there is no actual change because we were already converting unhandled errors for other Smithy shapes to the correct type of error or exception.
  4. Instead of requiring that the nested structure of MyService.Error is maintained in all languages, we must instead ensure that error information is not lost when working with unmodeled errors. We will replace tests like this with tests that unmodeled errors, especially those from target language implementations, are chained and wrapped as needed to ensure that logging them can include relevant information.

On CollectionOfErrors

It should always be possible to enable programmatic handling of the set of errors for an operation somehow in a given programming language, because the result of a Foo operation can always be represented as an equivalent Smithy union of either a successful output structure or one or more error variants:

operation Foo {
  input: FooInput
  output: FooOutput
  errors: [BadThingError, SomeOtherError]  
}

// Implicitly implies something like:
union FooResult {
  output: FooOutput
  badThingError: BadThingError
  someOtherError: SomeOtherError
}

Therefore as long as a code generator supports unions, it can support errors, even if it has to use the same kind of datatypes to handle the requirements for unions. Rust for example, generates a FooError enum that looks similar, plus an Unhandled variant.

CollectionOfErrors is absolutely useful for Dafny-implemented polymorph libraries, for the same reason that features like exception chaining are useful in other languages: it lets you hang onto the details of more than one error in a single value. However, CollectionOfErrors is particularly difficult to reconcile with the Smithy design: the Dafny version of it is essentially a list of MyService.Error values, meaning each element can be any error for MyService , and MyService.Error has no corresponding shape in the original Smithy model.

Therefore we should consider the Dafny CollectionOfErrors variant an unmodeled error, which we only ensure can be printed with all details and not necessarily programmatically handled. In cases where customers may want to programmatically handle a collection of errors, this can be modeled explicitly in the Smithy operation:

operation Foo {
  // ...
  errors: [FooErrors]
}

@error
structure FooErrors {
  errors: FooErrorList
}

list FooErrorList {
  member: FooError
}

union FooError {
  badThingError: BadThingError
  someOtherError: SomeOtherError
}

// @error not strictly necessary on these,
// but it could easily be reused directly for other operations.
// They might also be from other dependent local services.
@error
structure BadThingError {
  @required
  message: string

  @required
  thingID: string
}

@error
structure SomeOtherError {
  @required
  message: string
}

This is fairly similar to how AWS services model the results of batch operations. Note that the structure of the Dafny FooError generated for this model ends up looking very similar to the existing CollectionOfErrors variant that smithy-dafny always generates.

ajewellamz commented 3 months ago

We only make a best effort to include as much detail as possible when the error is displayed/logged/etc. If there is any situation in which the customer can no longer see the text of an error message, I would call this whole thing a non-starter.

robin-aws commented 3 months ago

Good call, "best effort" is the wrong phrase here. Replaced with:

"We only ensure that the error is retained and possible to log or otherwise extract programmatically somehow, just not in a clean and consistent way across programming languages."

Edit: My edit didn't stick for some reason, re-applied

robin-aws commented 1 month ago

Note that I'm not planning on acting on this design for a while - the models of existing smithy-dafny projects like https://github.com/aws/aws-cryptographic-material-providers-library are designed with the assumption that features like CollectionOfErrors percolate through all libraries and across languages. The design above would support a better state, but assumes such libraries add a lot more explicit error types to retain at least as good a user experience.

texastony commented 4 weeks ago

This MAY be better expressed as separate Issue, but it would be very nice if the Opaque errors had a message field in addition to on object field.

I would like to say "Unexpected error encountered while executing a DDB BatchGetItem", but right now I have to just return the Opaque.