ballerina-platform / ballerina-library

The Ballerina Library
https://ballerina.io/learn/api-docs/ballerina/
Apache License 2.0
137 stars 64 forks source link

[Bug] Reading a byte stream from a backend hangs in the passthrough #4671

Open TharmiganK opened 1 year ago

TharmiganK commented 1 year ago

Description:

$Subject

Steps to reproduce:

  1. Add a sample pdf file(sample.pdf) to resources directory.

  2. Run the following backend service :

import ballerina/io;
import ballerina/http;

service /file on new http:Listener(8080) {

    // Using the caller to responsd
    resource function get pdf/v1(http:Caller caller) returns error? {
        check caller->respond(check io:fileReadBlocksAsStream("resources/sample.pdf"));
    }

    // Using the response object
    resource function get pdf/v2() returns http:Response|error {
        http:Response res = new;
        res.setByteStream(check io:fileReadBlocksAsStream("resources/sample.pdf"));
        return res;
    }
}
  1. Run the passthrough service :
import ballerina/io;
import ballerina/http;

http:Client clientEP = check new("http://localhost:8080");

service on new http:Listener(9090) {

    // Directly return the response from the backend
    resource function get test() returns http:Response|error? {
        return check clientEP->/file/pdf/v1;
    }

    // Return using a caller after reading the response from the backend
    resource function get test/v1(http:Caller caller) returns error? {
        http:Response backendResp = check clientEP->/file/pdf/v1;
        stream<byte[], io:Error?> streamer = check backendResp.getByteStream();
        return caller->respond(streamer);
    }

    // Return using a new response object after reading the response from the backend
    resource function get test/v2() returns http:Response|error {
        http:Response backendResp = check clientEP->/file/pdf/v2;
        stream<byte[], io:Error?> streamer = check backendResp.getByteStream();
        http:Response response = new;
        response.setByteStream(streamer, "application/test+pdf");
        return response;
    }
}
  1. Use the following client file to execute different requests :
import ballerina/log;
import ballerina/http;

public function main(string url) returns error? {
    log:printInfo("sending request to " + url);
    http:Client httpClient = check new(url);
    http:Response response = check httpClient->/;
    log:printInfo("response received", status\-code=response.statusCode, 
                  content\-type=response.getContentType());
}
command output
bal run client.bal -- http://localhost:8080/file/pdf/v1 works fine
bal run client.bal -- http://localhost:8080/file/pdf/v2 getting a timeout
bal run client.bal -- http://localhost:9090/test getting a timeout
bal run client.bal -- http://localhost:9090/test/v1 getting a timeout
bal run client.bal -- http://localhost:9090/test/v2 getting a timeout

Affected Versions:

Ballerina SwanLake 2201.3.2

chamil321 commented 1 year ago

Instead of accessing the byteStream inside the passthrough service, respond with the response.

resource function get test/v1(http:Caller caller) returns error? {
        http:Response backendResp = check clientEP->/file/pdf/v1;
        return caller->respond(backendResp);
    }
lilRaptor99 commented 1 year ago

@TharmiganK Can I Know the rough timeline for this issue? Is this issue prioritized, and When will this be fixed?

TharmiganK commented 1 year ago

@lilRaptor99 we have not prioritised this yet. If this is a blocker for you, you could try reading the file as byte[] rather than a stream.

Sample backend service :

import ballerina/io;
import ballerina/http;

service /file on new http:Listener(8080) {

    resource function get pdf/v1() returns byte[]|error {
        byte[] & readonly file = check io:fileReadBytes("resources/sample.pdf");
        return file;
    }
}

Sample passthrough service :

import ballerina/http;

final http:Client clientEP = check new("http://localhost:8080");

service on new http:Listener(9090) {

    resource function get test() returns http:Response|error {
        byte[] file = check clientEP->/file/pdf/v1;
        http:Response res = new;
        res.setBinaryPayload(file, "application/pdf");
        // Do any other processing
        return res;
    }
}
lilRaptor99 commented 1 year ago

This is not a blocker at the moment. But this will be an issue in the future. Its better if we can get this fixed within at least 2 months. I need to use a steam because the file may be large, and storing it in memory (in a variable) is not really a good solution; Piping the data stream directly to the client is a better approach.

TharmiganK commented 1 year ago

This is not a blocker at the moment. But this will be an issue in the future. Its better if we can get this fixed within at least 2 months. I need to use a steam because the file may be large, and storing it in memory (in a variable) is not really a good solution; Piping the data stream directly to the client is a better approach.

Ack, will try to prioritise this in the next release 👍

TharmiganK commented 1 year ago

Attaching a simple reproducer for this issue:

  1. backend service :
    
    import ballerina/io;
    import ballerina/http;

service /file on new http:Listener(8080) {

resource function get pdf/v1(http:Caller caller) returns error? {
    stream<io:Block, io:Error?> fileStream = check io:fileReadBlocksAsStream("resources/sample.pdf");
    check caller->respond(fileStream);
}

}


2. passthrough service :
```bal
import ballerina/io;
import ballerina/http;

final http:Client clientEP = check new("http://localhost:8080");

service on new http:Listener(9090) {

    resource function get test() returns http:Response|error {
        http:Response backendRes = check clientEP->/file/pdf/v1;
        stream<byte[], io:Error?> fileStream = check backendRes.getByteStream();
        http:Response res = new;
        // Do any other processing
        res.addHeader("File-Name", "sample.pdf");
        res.setByteStream(fileStream, "application/pdf");
        return res;
    }
}
dilanSachi commented 1 year ago

When analysing this, found out that, returning the http:Response rather than using caller->respond has some multiple back and forth calls among ballerina and the native runtime. So after this multiple calls, the runtime.invokeasync method is not handled properly by the schedular. This will be analyzed and fixed by the runtime. Ill tag the issue here as well.

dilanSachi commented 1 year ago

Transfered this to be followed up from runtime. Will create a separate issue to keep track of this from http module.

dilanSachi commented 1 year ago

Will add a little detail for this.

import ballerina/io;
import ballerina/http;

service /file on new http:Listener(8080) {

    // Using the caller to responsd
    resource function get pdf/v1(http:Caller caller) returns error? {
        check caller->respond(check io:fileReadBlocksAsStream("resources/sample.pdf"));
    }

    // Using the response object
    resource function get pdf/v2() returns http:Response|error {
        http:Response res = new;
        res.setByteStream(check io:fileReadBlocksAsStream("resources/sample.pdf"));
        return res;
    }
}

For the above service, pdf/v1 (case 1) works and pdf/v2 (case 2) hangs. In the implementation, Both of these scenarios go and invoke the mime:next function via the mime-native using runtime.invokeAsync call [1] .

But for case 1, it doesnt go back and forth among ballerina and java. But for case 2, we return from the service to the notifySuccess of the http-native and then we invoke a ballerina FunctionA via that. In the FunctionA we again call a native method MethodA. Inside the MethodA we again call a ballerina function FunctionB. Here when we call the runtime.invokeMethodAsync, we use a CountDownLatch to get the result synchronously. However this gets stuck and the ballerina function FunctionB does not get invoked until the CountDownLatch timesout.

[1] https://github.com/ballerina-platform/module-ballerina-mime/blob/master/native/src/main/java/io/ballerina/stdlib/mime/util/EntityBodyHandler.java#L370