sidorares / node-mysql2

:zap: fast mysqljs/mysql compatible mysql driver for node.js
https://sidorares.github.io/node-mysql2/
MIT License
4.06k stars 616 forks source link

Node process does not end and but CPU utilization at 100% #2090

Open nachogiljaldo opened 1 year ago

nachogiljaldo commented 1 year ago

I have been investigating a problem that is causing trouble on our test suite and I managed to trace it to mysql2. This script allows reproducing it:

Reproduction instructions ```js // create database test; // create table test(id int(10) unsigned not null auto_increment, primary key (`id`)); // insert into test(id) values(1); const mysql = require('mysql2/promise'); async function main() { console.log("===========================>", "foo"); const connection = await mysql.createConnection({ host: 'localhost', user: 'root', database: 'test', password: 'rootpass', connectionLimit: 1234, }); for (let ix = 0; ix < 157; ix++) { await basicHanging3(connection); } connection.destroy(); console.log('=========================>', 'Done'); } async function basicHanging3(connection) { let query = 'SELECT '; let limit = 1000; for (let i = 0; i < limit; i++) { query += `t.id AS t_id_${i},`; } query += `t.id AS t_id_${limit}`; query += ' FROM test as t'; const [rows] = await connection.query( query, ); return rows; } main(); ``` ```json { "name": "reproducer", "version": "1.0.0", "main": "index.js", "license": "MIT", "dependencies": { "mysql2": "^3.4.2", "yarn": "^1.22.19" } } ```

It happens with multiple node versions (v16.14.0, v18.16.0 are some of those I tested). For node 20.3.1 it happens, but you need to raise the 157 iterations.

In our tests we run thousands of DB queries which seems to be reaching this threshold and causing the process to "hung" (it eventually exits, but it takes a few minutes). As you can imagine this is a huge pain on CI because the "exit" takes as much as the test suite itself.

2 other interesting things to note:

Netstat shows the DB connection is closed while the process is waiting:

$ sudo netstat -alpW | grep -i <my_pid> 

shows no output.

nachogiljaldo commented 1 year ago

Sorry, I just noticed https://github.com/sidorares/node-mysql2/issues/2080 and I believe they could be related, my reproduction seems to work correctly with 2.3.0 (which is the one that @rogerho1224 was using and worked for them).

sidorares commented 1 year ago

thanks @nachogiljaldo I can reproduce the issue

nachogiljaldo commented 1 year ago

No worries @sidorares if it helps, it might be intimately related to #2080 since I can confirm that using 2.3.0 fixed the issue for me.

sidorares commented 1 year ago

What is strange is that process hangs with 100% cpu just before exit. I suspect some combination of GC and large number of objects with large number of fields. Profiler does not pick that and shows a very small number of ticks in the main code and in the GC, the ticks during the "frozen" part are in "unaccounted" section

The difference with 2.3.0 is highlighted in #2055, I'll try to see if changing that part only helps.

sidorares commented 1 year ago

@nachogiljaldo what is your os / architecture? Not sure if that matters, I'm seeing the issue on m1 mac

nachogiljaldo commented 1 year ago

I'm not sure if it's GC related, (not a node internals expert,but) I ran my reproduction with --trace-gc and the gc logs stop right with the "Done" message.

nachogiljaldo commented 1 year ago

Mine is Ubuntu x86_64. But a colleague of mine has the same issue with similar effects with M1.

sidorares commented 1 year ago

typically GC related performance problems are visible in the profiler log, maybe we see a bug in GC The script works fine even with large number of repeats and only shows problem just before the exit

sidorares commented 1 year ago

@nachogiljaldo I reduced script to a version with no external dependencies, could you try to run first file from https://gist.github.com/sidorares/128160e6b3dea1da3ad45cd672651d2d ? Still not sure what's going on, but switching to setting fields in the constructor hides the issue ( repro2.js )

cc @testn

sidorares commented 1 year ago

Actually it does finish, eventually. First script from gist finishes in about 5 minutes time I think we are hitting some V8 edge case where performance is not great ( o(n^2) for the number of fields? )

sidorares commented 1 year ago

I think I'll try to file this to the core node

sidorares commented 1 year ago

https://github.com/nodejs/node/issues/48581

sidorares commented 1 year ago

since there is reliable repro without dependency on mysql2 I'll try to "fix" the issue by converting parser to add fields in the constructor

Actually, I think the better solution would be to cap the number of fields in dynamically compiled parser, if the number is over the limit use "interpreting" parser instead ( e.g in each row iterate over every column and read the code based on the column type ). We used to have "interpreting" parser a while ago - https://github.com/sidorares/node-mysql2/blob/04f07f82714587049eb7cf1c92f983da34de87c5/lib/commands/query.js#L59-L69

sebdec commented 1 year ago

Not sure if this related. We reproduced the same kind of problem few time ago when we migrated from another mysql driver. We noticed an important increase of the cpu usage on the nodejs server and more slow queries. What was strange is when scaling horizontally or vertically the server, not the db, the slow queries decreased.

We thought that the problem came from our side since few issues were open on the repository and one of the purpose of this lib was a performance gain. But we finally reverted the change of the driver and the problem is gone.

Indeed maybe it's when we manage many objects with many relations.

We use RDS mysql with ECS on fargate, v16.20 node version, mysql2@2.3.0 If necessary i can give other information.

sidorares commented 1 year ago

@sebdec do you know roughly the number of columns and rows you are reading in an average query?

sidorares commented 1 year ago

@sebdec @nachogiljaldo @rogerho1224 could you try with --noopt or --max-opt=2 flags passed to node

nachogiljaldo commented 1 year ago

Just tested with --noopt and it finished successfully almost immediately.

nachogiljaldo commented 1 year ago

Also @sidorares I tested the first file as you suggested (repro1) and it reproduces.

stefkkkk commented 10 months ago

we are currently using this module and our session inits into apps spent a lot of cpu, sometimes >1 core, but after auth is everything fine.

We already done cpu profiling and seeing that most of ticks it's /usr/local/bin/node and mysql2 library

How to deal with that broken but important module? Any hotfix? Any patches? Any workarounds?

sidorares commented 10 months ago

@stefkkkk some possible options: 1) --noopt or --max-opt=2 flags 2) check if you have responses with more rows than you actually need ( SELECT * ) and reduce number of fields to what is actually used

What is your node version? The root cause is in node optimiser so hopefully its fixed eventually on a node side

stefkkkk commented 10 months ago

2023-12-15_10-53 The fix for me was to upgrade node js version from 20.10 to 21.4, and inadequate cpu usage gone Probably, it's due to updates of the V8 JavaScript engine to 11.8 in 21 version of node js @sidorares thanks for help

nachogiljaldo commented 6 months ago

Mmm... in my environment (node20, linux) this was fixed on 3.9.4 (i.e. I was able to upgrade from the previous version I was running 2.7.0).

I tried to upgrade to 3.9.7 today and the problem started to happen again, traced it to 3.9.5 so there's something in that PR that broke the issue that had been inadvertently fixed?

EDIT: tried the noopt and max-opts alternatives but that's not viable as it makes our build absurdly slow (and some tests even time out).

sidorares commented 6 months ago

@nachogiljaldo I'm pretty sure inheriting result from null fixes/hides the problem, unfortunately this comes with it's own issues ( #2585 )

I'm still keen to review the way resulting object is constructed - see https://github.com/sidorares/node-mysql2/issues/2055

sidorares commented 6 months ago

Also a possibility could be a configurable / pluggable result construction, e.i some users might prefer {} as base, some Object.create(null) and some new Map()

nachogiljaldo commented 6 months ago

Also a possibility could be a configurable / pluggable result construction, e.i some users might prefer {} as base, some Object.create(null) and some new Map()

That seems potentially as a good idea, but not sure how feasible it would be in practice for cases when one doesn't connect directly to mysql but rather use an ORM.