H4ad / serverless-adapter

Run REST APIs and other web applications using your existing Node.js application framework (NestJS, Express, Koa, tRPC, Fastify and many others), on top of AWS, Azure, Huawei and many other clouds.
https://serverless-adapter.viniciusl.com.br/
MIT License

Timing/Buffer issue when using AWSStreamHandler #166

Closed gitaugakwa closed 10 months ago

gitaugakwa commented 10 months ago

Current Behavior

When using the AWS Stream Handler, writes are buffered until either another write occurs or the stream ends. When another write does occur, that second write then experiences the same behaviour.

Expected Behavior

The write should be dispatched immediately and should not depend on another write or on the stream ending.

Steps to Reproduce the Problem

import express, { Express } from "express";
// NOTE: the @h4ad/serverless-adapter import paths below are a best guess;
// adjust them to your installed version.
import { ServerlessAdapter, createDefaultLogger } from "@h4ad/serverless-adapter";
import { ApiGatewayV2Adapter } from "@h4ad/serverless-adapter/adapters/aws";
import { ExpressFramework } from "@h4ad/serverless-adapter/frameworks/express";
import { LazyFramework } from "@h4ad/serverless-adapter/frameworks/lazy";
import { AwsStreamHandler } from "@h4ad/serverless-adapter/handlers/aws";
import { DummyResolver } from "@h4ad/serverless-adapter/resolvers/dummy";

const app: Express = express();
app.use("/test", function (req, res, next) {
        //when using text/plain it did not stream
        //without charset=utf-8, it only worked in Chrome, not Firefox
        res.setHeader("Content-Type", "text/html; charset=utf-8");
        res.setHeader("Transfer-Encoding", "chunked");

        res.write("Thinking...");
        sendAndSleep(res, 1);
        // (res as unknown as Writable).on("")
        // return;
});

function sendAndSleep(response: any, counter: number) {
        if (counter > 5) {
                response.end();
        } else {
                response.write(" ;i=" + counter);
                // response.flush();
                counter++;
                setTimeout(function () {
                        sendAndSleep(response, counter);
                }, counter * 1000);
        }
}

const expressFramework = new ExpressFramework();
const framework = new LazyFramework(expressFramework, () => app);

export const handler = ServerlessAdapter.new(null)
        .setFramework(framework)
        .setHandler(new AwsStreamHandler())
        .setResolver(new DummyResolver())
        .setLogger(createDefaultLogger({ level: "info" }))
        .addAdapter(new ApiGatewayV2Adapter())
        // more options...
        //.setFramework(new ExpressFramework())
        .build();

In the example above, you can use the interval between receiving two consecutive numbers to check the timing. The gap between when 2 and 3 arrive over the network is 4 seconds, when it should be 3 seconds. This shows that 3 is in fact being written but not yet sent; only once 4 is written is 3 flushed over the network.
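The arithmetic behind those numbers can be sketched in a few lines (a hypothetical model, assuming each chunk is only flushed when the next write happens):

```javascript
// Model of sendAndSleep: chunk k (k >= 2) is written counter * 1000 ms
// after chunk k-1, i.e. k seconds later. writeTimes(n)[k-1] is the second
// at which chunk k is written, relative to chunk 1.
function writeTimes(n) {
  const t = [0];
  for (let k = 2; k <= n; k++) t.push(t[t.length - 1] + k);
  return t;
}

const written = writeTimes(5); // [0, 2, 5, 9, 14]

// If chunk k is only flushed by the write of chunk k+1, it *arrives* when
// chunk k+1 is written, so the observed gap between receiving 2 and 3 is:
const observedGap = written[3] - written[2]; // 4 seconds
// while the gap between *writing* 2 and 3 is:
const expectedGap = written[2] - written[1]; // 3 seconds
```

This reproduces exactly the 4-second vs. 3-second discrepancy described above.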

Environment

H4ad commented 10 months ago

I tested the following code:

const util = require('node:util');

exports.handler = awslambda.streamifyResponse(
    async (event, responseStream, context) => {
        function sendAndSleep(response, counter) {
            if (counter > 5) {
                response.end();
            } else {
                console.log('Writing ' + counter);
                response.write(" ;i=" + counter);
                counter++;
                setTimeout(function () {
                    sendAndSleep(response, counter);
                }, counter * 1000);
            }
        }

        const finalResponse = awslambda.HttpResponseStream.from(
          responseStream,
          { statusCode: 200, headers: { 'Content-Type': 'text/html; charset=utf-8', 'Transfer-Encoding': 'chunked' }  },
        );
        console.log(util.inspect(finalResponse, { depth: null, getters: true, showHidden: true, showProxy: true }));

        finalResponse.write("Thinking...");

        sendAndSleep(finalResponse, 1);
    }
);

I got the same behavior that my library has.

From what I saw in the logs, the timings are correct:

[screenshot: Lambda runtime logs showing the writes at the expected times]

You can enable the INFO RUNTIME STREAM logs by setting the following env variable: AWS_LAMBDA_RUNTIME_VERBOSE=3.

The delay/timing/buffer issue is probably caused by AWS still trying to open the connection while we try to write to the stream.

You can open an issue on https://github.com/aws/aws-lambda-nodejs-runtime-interface-client to ask AWS whether this behavior is expected or whether there is any hint/hack we can use to speed up the connection.

H4ad commented 10 months ago

I think this is a bug; the following code didn't have that 5-second delay:

exports.handler = awslambda.streamifyResponse(
    async (event, responseStream, context) => {      
        responseStream.setContentType('text/html; charset=utf-8');
        responseStream.write("Thinking...");

        setTimeout(function() {
            responseStream.write("After 1s");

            responseStream.end();
        }, 1000);
    }
);

But I have no idea why.

H4ad commented 10 months ago

The buffer effect is probably caused by the internal highWaterMark of the AWS runtime:

const HANDLER_HIGHWATERMARK = Symbol.for(
  'aws.lambda.runtime.handler.streaming.highWaterMark',
);

const handler = async (event, responseStream, context) => {
    responseStream.setContentType('text/html; charset=utf-8');

    responseStream.write("Thinking...");
    responseStream.write("a".repeat(64 * 1024));

    setTimeout(function () {
        responseStream.write("b".repeat(64 * 1024));
    }, 3000);

    setTimeout(function () {
        responseStream.write("After 6s");
        responseStream.end();
    }, 6000);
};

handler[HANDLER_HIGHWATERMARK] = 512;

exports.handler = awslambda.streamifyResponse(handler, {
    highWaterMark: 512,
});

The default highWaterMark is 64 KiB, but there is no way to override it.

gitaugakwa commented 10 months ago

Thanks @H4ad. I also did a bit of testing on my end and found the issue to be AWS buffering what is written to the response.

I wish there were a way to send the data out as soon as possible. What I've found to work is making the data written with res.write() total ~15,000 bytes, which is quite an overhead for small streams that are only about 300 bytes.
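That workaround can be sketched as a small helper (hypothetical; the 15,000-byte threshold is just the value observed empirically above, not a documented constant):

```javascript
// Pad a text chunk so each write crosses the observed flush threshold.
// Trailing spaces are harmless in text/html output.
const FLUSH_THRESHOLD = 15000; // bytes, observed empirically, not documented

function padToFlush(chunk) {
  const missing = FLUSH_THRESHOLD - Buffer.byteLength(chunk, 'utf8');
  return missing > 0 ? chunk + ' '.repeat(missing) : chunk;
}

// Usage inside the repro above:
//   res.write(padToFlush(' ;i=' + counter));
```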

Would be nice to have the override.

gitaugakwa commented 10 months ago

I'll close this since the new issue is more concise.

https://github.com/aws/aws-lambda-nodejs-runtime-interface-client/issues/94