graphiti-api / graphiti

Stylish Graph APIs
https://www.graphiti.dev/
MIT License

Propagating HTTP status code when sideloading errors #333

Open CGA1123 opened 3 years ago

CGA1123 commented 3 years ago

Currently, if a remote sideload request fails with any response body containing an errors key, the GraphitiAPI adapter will raise an Errors::Remote exception.

This exception contains a message extracted from the errors key, but loses additional information from the response such as the status code, making it more difficult to build complex error handling logic around sideloading failures. In some circumstances it might be useful to forward HTTP status codes when a sideload fails.

This can be particularly useful to differentiate "real" 5XX errors from a particular server vs errors due to bad forwarded params during a sideload, or performance issues bubbling between services (503 vs 504). Rewriting the status code can help enormously when an upstream host is under load and unable to serve requests, or otherwise erroring, and multiple other services dependent on it begin to degrade!

One way to achieve this would be to expose the status_code within Errors::Remote.
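A minimal sketch of that first option, assuming we extend Errors::Remote with `status_code` and `errors_payload` attributes (these names are the proposal here, not Graphiti's shipped API):

```ruby
# Sketch only: a version of Graphiti::Errors::Remote extended to carry
# the upstream HTTP status code and the full errors payload.
module Graphiti
  module Errors
    class Remote < StandardError
      attr_reader :status_code, :errors_payload

      def initialize(message, status_code: nil, errors_payload: nil)
        @status_code = status_code
        @errors_payload = errors_payload
        super(message)
      end

      # Surface the status code in the message, while keeping the raw
      # payload available for the developer's own exception handling.
      def message
        status_code ? "#{super} (HTTP #{status_code})" : super
      end
    end
  end
end
```

A host application could then rescue Graphiti::Errors::Remote and branch on `status_code` (e.g. retry on 503, surface a 422 back to the client).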

Another way might be to initialise the Faraday connection with the :raise_error middleware; GraphitiAPI could rescue those exceptions and wrap them to allow more granular handling within Graphiti or by the host application.


Another consideration is whether there should be any scope for "best-effort" sideloading to be enabled, where if a remote resource cannot be fetched due to returning some subset of HTTP status codes the whole request is not necessarily failed. And clients are expected/allowed to retry fetching that failed resource. This would allow for more graceful degradation of service should an upstream API be unavailable or excessively latent for whatever reasons.

I'm not sure whether that suggestion is conformant to the JSON:API spec; the section on "includes" leaves this a little bit ambiguous as far as I can tell. It might be best to avoid this additional complexity for now. šŸ˜„

richmolj commented 3 years ago

All really good stuff to talk about - thanks so much for raising the issue, we should definitely enhance how we handle remote errors.

I think the most straightforward solution is to add status_code to Errors::Remote, and include it as part of the message. It would also be good to add an additional property holding the full errors payload - we don't need it as part of the error message, but it could be referenced by the developer as part of their own exception handling.

It also makes sense to me for Graphiti to automatically handle the 4XX/5XX scenarios you list above. I think we'd want to add to this list in graphiti-rails.

Another way might be to initialise the Faraday connection with the :raise_error middleware; GraphitiAPI could rescue those exceptions and wrap them to allow more granular handling within Graphiti or by the host application.

I like this a bit less, just because of the extra Faraday coupling and hoop-jumping - unless there's a specific scenario where you thought it would be beneficial?

Another consideration is whether there should be any scope for "best-effort" sideloading to be enabled

I think this is an important but separate issue. You could make the same case for a local API - if we can fetch the Employees, but Positions fails, should the whole request fail? Currently we say yes. GraphQL would often say no, I think. Ideally we would make this configurable when defining the sideload (e.g. has_many :positions, allow_errors: true), with the current behavior being the default. But JSON:API says:

The members data and errors MUST NOT coexist in the same document.

So we'd have to return a successful response without telling the user anything failed. TBQH I think it is OK to cheat here šŸ˜ˆ But again - I would definitely like to see this, but as a separate independent issue.
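To make the trade-off concrete, here is an illustrative sketch of what `allow_errors` could mean at resolution time. `Sideload` and `resolve_sideload` are stand-ins for Graphiti internals, and `allow_errors` is the option name floated above, not a shipped feature:

```ruby
# Illustrative only: allow_errors: false (the current behavior) lets the
# failure bubble up and fail the whole request; allow_errors: true
# degrades gracefully to an empty relationship.
Sideload = Struct.new(:name, :allow_errors, keyword_init: true)

def resolve_sideload(sideload)
  { data: yield, degraded: false }
rescue StandardError => e
  raise unless sideload.allow_errors
  # Swallow the failure and note it, so it could be logged or the client
  # could retry via links.
  { data: [], degraded: true, error: e.message }
end
```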

@wadetandy any thoughts?

CGA1123 commented 3 years ago

I think the most straightforward solution is to add status_code to Errors::Remote, and include it as part of the message. It would also be good to add an additional property of the full errors payload - we don't need it as part of the error message, but could be referenced by the developer as part of their own exception handling.

It also makes sense to me for Graphiti to automatically handle the 4XX/5XX scenarios you list above. I think we'd want to add to this list in graphiti-rails.

I think this makes sense, adding a new attribute to Errors::Remote along with registering and handling it by default in graphiti-rails sounds like a good solution.

I like this a bit less just because more faraday coupling and hoop-jumping, unless there's a specific scenario you thought it would be beneficial?

Yes, I agree - this presents somewhat of a leaky abstraction. There are already some scenarios, though, which may lead to Faraday exceptions propagating even without the :raise_error middleware.

e.g. Faraday::TimeoutError, Faraday::ConnectionFailed, Faraday::SSLError

It might be worth handling and wrapping these as an Errors::Remote exception as well?
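A self-contained sketch of that wrapping. The Faraday classes are stubbed here so the example runs standalone; the real adapter would rescue the actual Faraday::Error hierarchy, and `RemoteError`/`fetch_remote` stand in for Errors::Remote and the adapter's fetch path:

```ruby
# Stubs so the sketch is self-contained; in Faraday itself,
# TimeoutError, ConnectionFailed and SSLError all inherit from
# Faraday::Error, which is what makes a single rescue possible.
module Faraday
  class Error < StandardError; end
  class TimeoutError < Error; end
  class ConnectionFailed < Error; end
  class SSLError < Error; end
end

class RemoteError < StandardError; end

def fetch_remote(url)
  yield
rescue Faraday::Error => e
  # Wrap transport-level failures (timeouts, connection failures, TLS
  # errors) so callers only rescue one exception type; the original
  # exception stays reachable via #cause.
  raise RemoteError, "Error fetching #{url}: #{e.class}: #{e.message}"
end
```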


Ideally we would make this configurable when defining the sideload (e.g. has_many :positions, allow_errors: true) with the current behavior being the default but JSON:API says:

It might be interesting to consider whether the client should be allowed to request the level of fidelity of the includes that it requires/wants? Different clients may decide to be more or less tolerant to sideloading errors and decide to show partial information or try to fetch it themselves via links.

So we'd have to return a successful response without telling the user anything failed. TBQH I think it is OK to cheat here šŸ˜ˆ - But again - I would definitely like to see this, but as a separate independent issue.

Split into #337

richmolj commented 3 years ago

e.g. Faraday::TimeoutError, Faraday::ConnectionFailed, Faraday::SSLError It might be worth handling and wrapping these as an Errors::Remote exception as well?

Good thought - looks like there is a generic Faraday::Error (or maybe ClientError) superclass we can check for.

It might be interesting to consider whether the client should be allowed to request the level of fidelity of the includes that it requires/wants? Different clients may decide to be more or less tolerant to sideloading errors and decide to show partial information or try to fetch it themselves via links.

Also smart thought. I wonder if we should do this similar to how we do link rendering, which can be configured either by url param OR a resource config option (probably sideload config option in this case).
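A hypothetical precedence sketch for that: a per-request query param wins over the sideload-level config, which wins over the global default. `include_errors` is an assumed param name for illustration, not an existing Graphiti option:

```ruby
# Resolution order: request param > sideload option > default.
def allow_sideload_errors?(params, sideload_option: nil, default: false)
  if params.key?("include_errors")
    params["include_errors"] == "true" # assumed query param name
  elsif !sideload_option.nil?
    sideload_option
  else
    default
  end
end
```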

CGA1123 commented 3 years ago

Okay, so to summarise:

richmolj commented 3 years ago

Yep, what you've outlined sounds perfect! šŸ’Æ

jasonkarns commented 2 years ago

Has any work been done on this? I'm in a similar scenario but I think I have a different approach I'd like to see.

Firstly, a registered error handler (i.e., via rescue_registry) that is generic for "most" exceptions-that-represent-http-responses really only needs three things:

  1. to expose exception#code as error_code
  2. to expose exception#status as status_code (for Faraday exceptions, this means aliasing response_status as status)
  3. to default the detail: keyword argument to :exception

With these things in place, one can register_exception Faraday::Error, handler: HttpExceptionHandler. IMO, this solves the generic scenario of handling most http library exceptions as they mostly all expose status.
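A sketch of the generic handler described above. `HttpExceptionHandler` is the name proposed in this comment; the duck-typing on `#status` / `#response_status` / `#code` is illustrative and not rescue_registry's actual handler API:

```ruby
class HttpExceptionHandler
  def initialize(exception)
    @exception = exception
  end

  # (2) expose the exception's HTTP status as status_code; Faraday-style
  # exceptions spell this response_status, so alias it here.
  def status_code
    if @exception.respond_to?(:status)
      @exception.status
    elsif @exception.respond_to?(:response_status)
      @exception.response_status
    else
      500
    end
  end

  # (1) expose exception#code as error_code, with a generic fallback.
  def error_code
    code = @exception.respond_to?(:code) ? @exception.code : nil
    code || "internal_server_error"
  end

  # (3) default detail to the exception's own message.
  def detail
    @exception.message
  end
end
```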

I think a separate problem is that sideloading exceptions are wrapped before they reach the rescue_registry. It feels a bit incongruous that exceptions in the main thread hit the registry directly, but exceptions in sideload threads are not treated the same way. If that were configurable, the registered error handlers could respond consistently to exceptions during sideload or main processing.

What I'd love to see is something in rescue_registry that allows a handler to "exit early" and let subsequent registered handlers still get a chance to run (i.e. somewhat like middleware, or essentially a "re-raise"). This would allow users to register a handler for the Graphiti SideloadQueryBuildingError exception, check the wrapped #cause, and do a number of things: handle the current exception itself; re-raise the wrapped exception (ideally to be re-matched against the registered handlers); or re-raise the sideload exception and let the other registered handlers still have a chance.
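A sketch of that middleware-ish dispatch, under loud assumptions: `HandlerRegistry` and the `:reraise_cause` sentinel are invented here to show the control flow, and none of this is rescue_registry's current API. A handler can decline by returning nil (later handlers keep matching) or return `:reraise_cause` to unwrap `#cause` and re-match it against the registry:

```ruby
class HandlerRegistry
  def initialize
    @handlers = [] # [exception_class, callable] pairs, in registration order
  end

  def register(klass, &block)
    @handlers << [klass, block]
  end

  def handle(exception)
    @handlers.each do |klass, block|
      next unless exception.is_a?(klass)
      outcome = block.call(exception)
      next if outcome.nil?                            # declined: keep matching
      return handle(exception.cause) if outcome == :reraise_cause && exception.cause
      return outcome unless outcome == :reraise_cause # handled
    end
    nil # unhandled
  end
end
```

Ruby sets `#cause` automatically when an exception is raised inside a rescue block, so a wrapping error like SideloadQueryBuildingError already carries the original exception for this kind of re-matching.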