Open Dingway98 opened 1 month ago
Thanks for opening your first issue here! We'll come back to you as soon as we can. In the meantime, check out the #typescript channel on our Powertools for AWS Lambda Discord: Invite link
Hi @Dingway98, thank you for opening the issue.
Please allow us some time to triage the issue and try reproducing the behavior.
I'll get back to you here as soon as I have news to share.
To help narrow things down for you I went back to version 0.2.0
(way back) and this for the same function produced a normal amount of memory usage: Max Memory Used: 125 MB
.
But this happens in both the latest v1 and latest v2, so something was added in v1/pre-v1 that is causing this.
Edit: it is somewhere between 0.7.2 (good) and 0.8.1 (bad) Edit: From my testing 0.8.1 is the culprit with PR https://github.com/aws-powertools/powertools-lambda-typescript/pull/748/files
This does not look an issue on powertools, but rather on the AWS Image for Node, if we replace the logger to:
import { Console } from 'console';
const logger = new Console({
stdout: process.stdout,
stderr: process.stderr,
})
We still have the memory issue.
A quick fix for powertools is just to remove the reference for Console
and use directly the global console
object.
Edit: I tried to test that sample of code on node locally but I didn't see any memory issue, I also tried to take a look at the nodejs runtime and they only do a simple log patch but didn't modify/use the Console object.
Edit2: On node 20.x we also have the error with memory.
I did some investigation and looks like the issue is related to the stdout
and stderr
not being synchronous and the code not giving the event loop time necessary to flush the logs.
A heap snapshot showed the WritableState
was holding all the memory:
This data is being buffered because the logs when using Console
and stdout
are not blocking, and they are being buffered until the next tick when they are flushed.
If you add something like await new Promise(resolve => setTimeout(resolve, 2));
just to give to Node.js some time to flush the logs, you will not see the memory issue.
Also, you can make the stdout
and stderr
synchronous by putting the following code:
if (process.stdout._handle) process.stdout._handle.setBlocking(true);
if (process.stderr._handle) process.stderr._handle.setBlocking(true);
Anyway, feel free to choose between one of these solutions, I can open I PR when you choose how do you want to proceed.
Hi @H4ad, thank you SO much for this investigation and for also providing recommendations!
I've been trying to understand the implications of setting the process.stdout._handle
to blocking and from the little info I could find on it in other discussions the consensus seems to be fairly split and in many cases leaning towards it having a detrimental impact on performance:
While I can verify that setting these two lines drops the memory usage to similar levels to the regular console.log
, I'd like to spend a bit more time investigating and understanding the root cause & solution a bit better.
To try isolating the issue, I created two new test cases, neither of which uses the Powertools for AWS Logger and that still reproduce the issue. The main difference in both these tests is that we are stringifying the log similar to how Logger would.
In sample 1, I am using the patched console.log
as it appears in the Node.js RIC here:
const serviceName = 'exampleService';
export const handler = (event: unknown, context: unknown) => {
const largeObject = {
prop1: 'prop',
prop2: 'prop',
prop3: 'prop',
prop4: 'prop',
prop5: 'prop',
prop6: 'prop',
prop7: 'prop',
prop8: 'prop',
prop9: 'prop',
prop10: 'prop',
prop11: 'prop',
prop12: 'prop',
prop13: 'prop',
prop14: 'prop',
prop15: 'prop',
prop16: 'prop',
prop17: 'prop',
prop18: 'prop',
prop19: 'prop',
prop20: 'prop',
prop21: 'prop',
prop22: 'prop',
prop23: 'prop',
prop24: 'prop',
prop25: 'prop',
prop26: 'prop',
prop27: 'prop',
prop28: 'prop',
};
console.log(JSON.stringify(event));
console.log(JSON.stringify(context));
console.log('hello world');
for (let i = 0; i < 100000; i++) {
const line = JSON.stringify({
timestamp: new Date().toISOString(),
level: 'INFO',
service: serviceName,
message: 'hello world',
...largeObject,
});
console.log(line);
}
return 'done';
};
In sample 2, I am instead printing directly to process.stdout
:
const serviceName = 'exampleService';
export const handler = (event: unknown, context: unknown) => {
const largeObject = {
prop1: 'prop',
prop2: 'prop',
prop3: 'prop',
prop4: 'prop',
prop5: 'prop',
prop6: 'prop',
prop7: 'prop',
prop8: 'prop',
prop9: 'prop',
prop10: 'prop',
prop11: 'prop',
prop12: 'prop',
prop13: 'prop',
prop14: 'prop',
prop15: 'prop',
prop16: 'prop',
prop17: 'prop',
prop18: 'prop',
prop19: 'prop',
prop20: 'prop',
prop21: 'prop',
prop22: 'prop',
prop23: 'prop',
prop24: 'prop',
prop25: 'prop',
prop26: 'prop',
prop27: 'prop',
prop28: 'prop',
};
for (let i = 0; i < 100000; i++) {
let line = JSON.stringify({
timestamp: new Date().toISOString(),
level: 'INFO',
service: serviceName,
message: 'hello world',
...largeObject,
});
line = line.replace(/\n/g, '\r');
process.stdout.write(`${line}\n`);
}
return 'done';
};
I then tested both:
console.log
) - Duration: 2308.49 ms Billed Duration: 2309 ms Memory Size: 10240 MB Max Memory Used: 77 MB Init Duration: 203.04 ms
process.stdout.write
) - Duration: 1585.19 ms Billed Duration: 1586 ms Memory Size: 10240 MB Max Memory Used: 374 MB Init Duration: 213.40 ms
Both these tests have shorter memory usage and duration from the ones in the OP because I have reduced the amount of logs being printed by 1/7th, however we can still see that when using process.stdout.write
the memory usage is several orders of magnitude higher than when using console.log
.
Setting the process.stdout._handle
to blocking in the second sample removes the memory usage and brings it to a similar level to the console.log
sample.
This data is being buffered because the logs when using Console and stdout are not blocking, and they are being buffered until the next tick when they are flushed.
This makes sense, but then why is this line not affected since the patched console.log
also writes to stdout
??
I'm going to reach out to the Lambda team to see if they can look into this and have an explanation on why we're seeing this behavior.
This makes sense, but then why is this line not affected since the patched console.log also writes to stdout??
Good question, I read the source code of Node and I didn't found a difference between using Console
or using process.stdout.write
or console.log
.
This difference seems to be something particular only to AWS, maybe the implementation is patched again in another part of the code.
Hi @H4ad! Thanks a lot for looking into this issue and giving us some direction to fix it. We are working with Lambda engineers to understand the root cause of this issue because as you mentioned, it seems like something very specific to the node embedded in the Lambda node image.
Você é o cara! ❤️
Hello,
We're still waiting on the internal team from AWS Lambda to provide a path forward within the runtime. They are aware of the issue and have already started looking into it.
We are in close contact with them and will share more info here as soon as we can.
Expected Behavior
logger.info
or any logger calls should not be using excessive memory usage withMax Memory Used: 3339 MB
This is opposed to straight
console.log
, where we getMax Memory Used: 83 MB
Current Behavior
See the executions logs.
When we run the simple code snippet with powertools logger
logger.info
, high memory usage is recorded.Max Memory Used: 3339 MB
Code snippet
Steps to Reproduce
Code snippet
@aws-lambda-powertools/logger
, and importPossible Solution
Could this be related to the internal buffer powertools logger is setting? There is a buffer before the logger instance is initialized, but I am unsure what is happening during initialization.
Powertools for AWS Lambda (TypeScript) version
2.7.0
AWS Lambda function runtime
18.x
Packaging format used
npm
Execution logs
with console.log: