Proposal: Improve error handling for passthrough scenarios

shafreenAnfar commented 2 years ago

Summary

Ballerina can be used to write fully fledged gateways using its network abstractions. Usually gateways passthrough the request and response. In other words, it does not read the content. When implementing such scenarios using Ballerina, there are limitations in error handling which prevents users from writing complete gateways. This proposal is to address those concerns.

Goals

Complete the error handling story of Ballerina when it comes to passthrough scenarios

Motivation

Ballerina programing language's key strengths are network, concurrency and data manipulation. Therefore, it is common for Ballerina users to write integration scenarios. One of the common integration scenarios is passthrough scenario. As the name says passthrough scenario basically forward the request and response back and forth. It doesn't read the entity body of the payload but may read the headers. This scenario is quite common in gateway implementations as well.

However, implementing passthrough scenario using Ballerina does have limitations as per the current APIs. Consider the below simple passthrough service.

service / on new http:Listener(9090) {
  resource function 'default passthrough(http:Caller caller, http:Request req) returns http:Response|error? {
    http:Response inRes = check clientEP->forward("/get", req);
    check caller->respond(inRes);
}

Here, even though both forward and respond return errors, they don't consider the scenario where the given action could fail due to the passed request or response. For instance, consider the below code line.

check caller->respond(inRes);

This code line could fail if something goes wrong when retrieving the entity body from the inRes. Because Ballerina returns the response to the user as soon as the headers are retrieved. Reason for this is there is no point in deserializing the entity body if the user code is not using it. It is sort of a lazy loading implementation. Even though it is not common to have errors when reading the payload, with the introduction of HTTP2 streams this could happen.

As streams allows a single connection to multiplex, there could be long running streams. Especially, in the case of gRPC where the underling implementation is based on HTTP2 streams. Therefore, when writing HTTP2 gateways using Ballerina to passthrough gRPC content, this type of errors could happen quite frequently.

Description

Error handling could be improved by introducing a few new error types along with new actions. For instance, consider the below code line of passthrough example.

check caller->respond(inRes);

In this case respond action needs to read the entity body of inRes in order to send it to the original caller. However, if something goes wrong while reading the entity body, it could return the below error.

type ClientError distinct error;
type InboundEntityBodyError distinct ClientError;
type ResetInboundStreamError distinct InboundEntityBodyError; // returned error

Following is how these error could be created.

ResetInboundStreamError inRstError = error ResetInboundStreamError("Stream was reset", code = http:PROTOCOL_ERROR);

Once created the error could be returned from the respond action in the case of a failure. Users can decide what action to take based on the failure.

In order to take some action, the following two methods are introduced to both http:Caller and http:Client.

# Closes the TCP connection
remote function close(); // mostly useful in HTTP 1.1 scenarios

# Sends the RST_FRAME with the relevant error code
#
# + code - Error code
remote function resetStream(int code); // useful in HTTP2 scenarios

Similar to respond action, forward action could return the below error.

type ListenerError distinct error;
type OutboundEntityBodyError distinct ListenerError;
type ResetOutboundStreamError distinct OutboundEntityBodyError; // returned error

With the introduced new API one can improve the original passthrough example as below.

service / on new http:Listener(9090) {
  resource function 'default passthrough(http:Caller caller, http:Request req) returns http:Response|error? {

    http:Response|http:ClientError response = clientEP->forward("/get", req);
    if response is ResetOutboundStreamError {
      check clientEP->resetStream(response.detail().get("code"));
    }

    http:ListenerError? result = caller->respond(response);
    if result is ResetInboundStreamError {
      check caller->resetStream(result.detail().get("code"));
    }
  }
}

In addition to, forward action http:Client has many other actions such as PUT, POST and DELETE which support sending and receiving entity bodies. Therefore, those actions also should support returning the aforementioned errors.

chamil321 commented 2 years ago

Clarification on the internal behaviour:

service / on new http:Listener(9090) {
  resource function 'default passthrough(http:Caller caller, http:Request req) returns http:Response|error? {

    http:Response|http:ClientError response = clientEP->forward("/get", req);
    http:ListenerError? result = caller->respond(response);
    if result is ResetInboundStreamError {
      check caller->resetStream(result.detail().get("code"));
    }
  }
}

When above case is implemented, suppose inbound response payload(upstream) is differed for sometime and client time out is triggered. In the previous implementation, the inbound response was completed by setting a decoder exception as the lastContent. Then the outbound response(downstream) had no any lag and the stream was ended without notifying any error to the caller.

With the proposed change, once the content has decoder exception, it is notified to the application level an error. Until the user specifically send the respond, the downstream is connected. Hence user has to specifically handle this case.

VirajSalaka commented 2 years ago

For microgateway 3.2.0, there are users who perform dynamic routing from filters or and interceptor implementation. In those occasions, the user's custom ballerina implementation embedded within microgateway project can contain

clientEP->forward("/get", req);

and

caller->respond(response)

With the proposed change https://github.com/ballerina-platform/ballerina-standard-library/issues/2456#issuecomment-987617526, the users needs to update their own implementation right? I mean otherwise, there could be so many open streams unless they do that change. (without the fix, it would be half closed streams)

chamil321 commented 2 years ago

Clarification on the internal behaviour:
service / on new http:Listener(9090) {
  resource function 'default passthrough(http:Caller caller, http:Request req) returns http:Response|error? {

    http:Response|http:ClientError response = clientEP->forward("/get", req);
    http:ListenerError? result = caller->respond(response);
    if result is ResetInboundStreamError {
      check caller->resetStream(result.detail().get("code"));
    }
  }
}
When above case is implemented, suppose inbound response payload(upstream) is differed for sometime and client time out is triggered. In the previous implementation, the inbound response was completed by setting a decoder exception as the lastContent. Then the outbound response(downstream) had no any lag and the stream was ended without notifying any error to the caller.

With the proposed change, once the content has decoder exception, it is notified to the application level an error. Until the user specifically send the respond, the downstream is connected. Hence user has to specifically handle this case.

During the meeting with Shafreen, Viraj and Praminda, It was decided to maintain the same behaviour(handle incomplete inbound response by setting a decoder exception as the lastContent) for the 1.2.x and return error from the respond and client action to the user. So based on the requirement, user can decide on sending resetFrame through the new API

chamil321 commented 2 years ago

Clarification on the internal behaviour:
service / on new http:Listener(9090) {
  resource function 'default passthrough(http:Caller caller, http:Request req) returns http:Response|error? {

    http:Response|http:ClientError response = clientEP->forward("/get", req);
    http:ListenerError? result = caller->respond(response);
    if result is ResetInboundStreamError {
      check caller->resetStream(result.detail().get("code"));
    }
  }
}
When above case is implemented, suppose inbound response payload(upstream) is differed for sometime and client time out is triggered. In the previous implementation, the inbound response was completed by setting a decoder exception as the lastContent. Then the outbound response(downstream) had no any lag and the stream was ended without notifying any error to the caller. With the proposed change, once the content has decoder exception, it is notified to the application level an error. Until the user specifically send the respond, the downstream is connected. Hence user has to specifically handle this case.
During the meeting with Shafreen, Viraj and Praminda, It was decided to maintain the same behaviour(handle incomplete inbound response by setting a decoder exception as the lastContent) for the 1.2.x and return error from the respond and client action to the user. So based on the requirement, user can decide on sending resetFrame through the new API

Since the suggestion introduces a behaviour change, it could affect the existing customers. Hence instead of doing it, RST_STREAM was sent when inbound stream is corrupted as the 1.2.x branch fix

shafreenAnfar commented 2 years ago

Ideally, what we can do is push any error to the user level without taking any action. If the user has used check to return the error, then the listener can take the default action based on the error. Resulting in the same behavior we have now.

shafreenAnfar commented 2 years ago

We decided not implement any new API as the listener itself can handle this.

ballerina-platform / ballerina-library