brettwooldridge / HikariCP

光 HikariCP・A solid, high-performance, JDBC connection pool at last.
Apache License 2.0
19.91k stars 2.92k forks source link

BoneCP to HikariCP migration slowing down the DB with "Create Sort Index" #2137

Open Kiranbsharma opened 11 months ago

Kiranbsharma commented 11 months ago

Hi @brettwooldridge

Issue We are migrating from Amazon EMR 5.x to EMR 6.8. EMR 5.x uses BONECP as connection pooling to connect to Hive metastore (RDS MySQL DB). EMR 6.8 uses HikariCP to connect to metastore mySQL DB.

The backend RDS DB became extremely slow with HikariCP. I created a simple test to fire 100 queries and tested it with BoneCP and HikariCP.

All the DB connections are consumed by resource intensive "Create Sort Index" queries when I use HikariCP. I noticed that BoneCP issues the queries little slower than HikariCP so I added 7 sec sleep between my queries and "Create Sort Index" queries did not start. I have tested it multiple times with BoneCP and HikariCP and the "Create Sort Index" gets started only when its HikariCP + DB is stressed with high number of queries. After all connections are consumed by "Create Sort Index", then we start getting connection timeout. I changed maximumPoolSize=100 and it avoided connection timeout but "Create Sort indexes" still started. We have r5.4xlarge DB (16 vCPU) --> recommended pool size is 32 so it did not seem like a right solution to increase maximumPoolSize.

Question: Is there any way to control the rate at which queries are issues to the DB?

Current settings
Below are my setting connectionTimeout = 30000 maximumPoolSize = 10. (experimented with 5,8,32,100) minimumIdle = 10

Tests I have tested with https://github.com/brettwooldridge/HikariCP/wiki/MySQL-Configuration but no luck

I also compared sessions setting between BoneCP and HikariCP and changed autoCommit=false and transactionIsolation to TRANSACTION_READ_COMMITTED but issue persists.

We have been struggling to find the fix. Please let us know if you have any recommendations.

Thank you Kiran

lfbayer commented 10 months ago

If you can identify the root cause of your problem and create a unit test reproducing the behavior, I would be happy to take a look. But to be honest, I doubt this is caused by a bug in HikariCP. It could be a coincidence. Or it could be that BoneCP just happened to do something differently in just the right way to cause the problem to not present.

Either way, a unit test that reproduces the issue would be useful.

Even better would be if you have a pull request that also includes the fix for the problem.