ResponseValidation: add option to become informative

oats-ts / oats-ts

Comprehensive code generators for OpenAPI written in Typescript

https://oats-ts.github.io/docs

MIT License

22 stars 1 forks source link

ResponseValidation: add option to become informative #147

Open erictheswift opened 9 months ago

erictheswift commented 9 months ago

Hey!

When using FetchClientAdapter, currently I have 2 options:

config.skipResponseValidation=true: die on any inconsistency, no option to recover config.skipResponseValidation=false: be ignorant of all spec/real world inconsistencies

I find response validation quite nice feature, however, I'd prefer it to serve reporting-only purpose.

In many cases I'd rather log this error to error tracking solution (e.g. sentry), log it to console for development and proceed with execution on slightly faulty data.

Proposal:

-type Config = {url: string, skipResponseValidation?: boolean}`
+type Config = {baseUrl: string, onResponseValidationError?: null | (err: Error, ctx: Ctx) => void}
type Ctx = {response: OatsResponseBody} & (
  | {_:'statusCode', statusCode}
  | {_:'mimeType', mimeType}
  | {_:'body', issues: Issue[]}
)

So that in case you need to die, {onResponseValidationError: (err) => {throw err;}} (default?)

in case you need to ignore, {onResponseValidationError: null}

in case you need to report without dying, {onResponseValidationError: (err,ctx) => { console.error(err, ctx); Sentry.captureException(err, {extra: ctx}) }}

bali182 commented 9 months ago

Hello,

I'm pretty pragmatic on this in terms of what goes into the library. In case validation of anything that comes back from the server fails it means that:

A) The server is not respecting the OpenAPI document Oats was generated from B) Oats is interpreting something wrong

If it's A) I want the client to die there, I want it to annoy you and I want you to fix it either in the OpenAPI contract, or the server. If you can't trust the shape of the data you receive, you can't trust the backend you are talking to, and it should be fixed in development as soon as possible. Tbh I would have removed skipResponseValidation, but I wanted to leave in a higher performance option for production, when you are 100% sure the shape of the response is correct.

If it's B) obviously open an issue here.

So exposing a configurable handler for errors is a definite no, as I don't want this to be a convenient thing to do.

That said, you probably have a good reason to ask this, so what I can offer instead, is pulling out the throws into 1 common function, where you can decide to throw or omit the throw and just log it somewhere.

Let me put together an example, and we'll see if it would help.

erictheswift commented 9 months ago

There are definitely cases where API can contradict to the spec:

e.g. when there is no spec on the server side and ephemeral spec is reverse engineered (generated from network requests/app data) — that's where reporting can be used for crowdsourcing

or API provider exports the spec to promote usage but does not conform to it at all on the server side (not pointing to anyone here, but real product)

Also there may be small (insignificant) backward-incompatibile API changes where API provider runs newer version while there are not yet updated consumers, which of course satisfies A), however real world scenarios may render reporting warning instead of instant death as more wishful scenario.

erictheswift commented 9 months ago

Thanks. Looked at proposal:

Parametrizing the throwing method via adapter method (OO-style) also looks fine for me BUT

it lacks access to response object that was used to generate these validation issues,
I'm not sure if responseBody path is expected to be longer in deeper scenarios (is it just the simple test case?): currently I get kinda-readable violations in message, e.g. [ERROR] in "["["responseBody.groups"][4]"].createdAt": should be a string, I'd love to have these in parts after responseBody in path.
I'd rather name just body instead of responseBody because statusCode, headers are all parts of a response that is a context we're in (responseIssues)
Having Issue["path"] a (string|number)[] type instead of a string would be helpful to access invalid parts via _.get avoiding string-to-path parsing edge cases. This way [ERROR] in "["["responseBody.groups"][4]"].createdAt": should be a string transforms to an Issue {path: ['body', 'groups', 4, 'createdAt'], ...}

bali182 commented 9 months ago

Regarding the real world examples, I'm well aware of each situation, have been fighting this for years in in-house projects :D.

Anything the validator reports is not a minor issue, it's something that can and will cause real runtime errors (or at least my intention was to highlight these). If a new field is added to an object for example, it won't even report it. However if an expected schema element is of wrong type or missing (eg.: you expect an array element or object field to be a number but now it's a string) it will blow up. I wouldn't call this an insignificant difference, as it can brick your app down the line somewhere, where you least expect it, resulting in a few hours of head scratching.

For the list:

1) Can do 2) Yes, it can be like 'responseBody.foo["this has a space"].bar[1].x... as deep as the error goes. If a key is a valid variable name it's going to be .name, otherwise [123] if an array index and ["a b c"] if a string key. 3) There is requestBody and responseBody,I want to keep these separate by name. 4) I was thinking about this, but I want this library to be extensible for non json formats too. I want to keep path a string, that can be easily converted to xpath for example or something else in the future.

Keep in mind that this might change in the future, but for now If you want custom paths in reported errors, then you could override another method in FetchClientAdapter like so:

import { Validator, configure, ConfiguredValidator, DefaultConfig, Issue, stringify, } from '@oats-ts/openapi-runtime'

//... the rest of the class
protected override configureResponseBodyValidator(validator: Validator<any>): ConfiguredValidator<any> {
  return configure(
    validator,
    // The first path segment, you can change it to body 
    'body', 
    // Function defining how path segments are appended
    { append: (path, segment) => `${path};${segment}` }
  )
}

And then you can get the list of segments you wanted back pretty easily, assuming the separator character is not used in keys:

const keys = issue.path.split(';').slice(1)
const value = _.get(body, keys) // assuming numeric keys as string is not a problem for this

bali182 commented 9 months ago

If I get to eventually get the parameter-content branch in, a lot of validation will be extended and unified, I will keep in mind that the validation issues are meant to be easily configurable!

erictheswift commented 9 months ago

Anything the validator reports is not a minor issue, it's something that can and will cause real runtime errors (or at least my intention was to highlight these)

Yeah that's clear to me that by default that should be strict.

The point is to allow explicitly configuring 'wild west' scenarios (without patching) where the possibility that app will finish running with glitches like '[object Object]' in UI outweighs the downsides of missed strict validation.

How about ctx: {request: RawHttpRequest, response: RawHttpResponse}?
Great

There is requestBody and responseBody,I want to keep these separate by name.

Then, if expecting to have both request and response in scope of validation, would it be more consistent to have responseHeaders or response.headers prefix instead of headers, as well as response.statusCode instead of statusCode?
Makes a lot of sense about XPath plans.

Btw lodash getters easily work over dots and square brackets string paths (e.g. _.get(object, 'a[0].b.c') from official docs) — no need to configureResponseBodyValidator.

With string path, however, seems my example with path in ["["responseBody.groups"][4]"].createdAt from error message that I extracted from [ERROR] in "["["responseBody.groups"][4]"].createdAt": should be a string is erroneous?

Should it be responseBody.groups[4].createdAt, or responseBody.groups['id with a space'].createdAt (in case when id is non-valid variable name instead of index 4)?

With p.1, p.3, p.5 implemented/fixed, it should be possible to access every erroneous value via _.get(ctx, issue.path), right?

bali182 commented 9 months ago

Order is a bit mixed but:

3) That is a very good idea, I opened #149 as it can be done separately from the rest

5) If you get something like that in the path, that definitely looks like a bug! Could you open an issue and provide a bit of detail how to repro? A partial schema, and a response (without any sensitive data) would be very helpful on figuring out how that happened.

1) And for this let's continue the discussion. We could add a way to delegate responsibility of recovering from errors completely to the user (you). This way you are not just digesting the error in a callback, but also are responsible to come up with the appropriate parts of the response yourself in case of an error. If this was the original suggestion, and I just didn't understand it, I'm sorry :(

Considering the idea with request and response present, how about extending the config of the fetch adapter to accept something like this:

const adapter = new FetchClientAdapter({
  url: '...',
  recoverResponseBody: (request: RawHttpRequest, response: RawHttpResponse, issues: Issue[]): any => {
    SomeLogger.log(issues)
    // Even though something blew up in the chain, 
    // let's say you are sure that the response is JSON, you can do this
    return JSON.parse(response.body)
  },
  recoverResponseHeaders: (request: RawHttpRequest, response: RawHttpResponse, issues: Issue[]): any => {
    SomeLogger.log(issues)
    // Also because you have the request, you can narrow it down when you actually want to recover 
    // and when you want to keep throwing
    if(request.url.startsWith('...')) {
      return { x-header: response.headers['x-header'] ?? 'default' }
    }
    throw new Error(issues.map(stringify).join('\n'))
  }
})

If the recovery methods are not provided as config, then it will just throw as is currently. If provided however, the recovered body and/or headers will be present on the final typed response. It's really crude, but responsibilities are very clear:

1) If everything goes according to plan, the generated code gives you a properly parsed and typed response. 2) If something breaks, you can take manual control, and decide about all aspects of how to recover (if you wanna recover).

erictheswift commented 9 months ago

Hmmm I haven't thought about the idea of 'recovering'. Not sure if applicable to any of my cases, but quite fun strategies can be imagined — e.g. putting null-safe empty string or -1 somewhere where otherwise there'd be invalid response with potentially faulty undefined. Idk.

I'd say one hook like

  handleValidationIssues(issues: Issue[], ctx: {request: Request & {query: ...}, response?: Response}): void | never => {
    if (issues.every(i => i.path.startsWith('response.body'))) {
       logError(...)
       Sentry.captureException(...)
    } else {
       super.handleValidationIssues(issues, ctx) // doing `throw new Error(issues.map(stringify).join('\n'))`
    }
  },

as an adapter override or config option where

Issue['path'] is kinda
  | ['request', 'body' | 'query' | 'headers', ...keys].join('.')
  | ['response', 'statusCode' | 'body' | 'headers', ...keys].join('.')
and thus resolveable via _.get(ctx, issue.path)

would fit any of my existing demands.

bali182 commented 9 months ago

Purely handling the validation issues from the libraries point of view, was not the best idea from me because it's unpredictable what is being returned in a fail situation if no exception is thrown .

Both the response body and response header processing part takes something raw, and turn it into what you actually want (parsed data, that matches what is defined in the generated type system).

That's why I'm suggesting that if the parsing/validating fails somewhere, you'd have an option to take control over both error handling and what is being put in the final typed response.

This way what the library / generated code offers is this:

If all parties play according to the contract set by the source OpenAPI document, you'll get back a proper, typed response
If not, it tells you what the problem is, and you can substitute the unusable parts in user defined business logic, otherwise it just throws.

This way I think it would be much more future proof in other use cases as well.

erictheswift commented 9 months ago

👍 Motivation for contract that has to return recovered item or throw makes sense to me, kinda fool-proof design.

bali182 commented 9 months ago

FYI just checked, it's going to take a bit, I have to change the generators a bit. You'll also have to regenerate once this is in. Shouldn't change the pulbic facing APIs though, so code where you use the generated code should be safe.