oracle / node-oracledb

Oracle Database driver for Node.js maintained by Oracle Corp.
http://oracle.github.io/node-oracledb/
Other
2.25k stars 1.07k forks source link

Connection problem when a lot of connection requests comes up in milliseconds. #1493

Closed alichampion closed 2 years ago

alichampion commented 2 years ago
  1. What versions are you using?
  1. Is it an error or a hang or a crash? Yes it's an error and hang as well.

  2. What error(s) or behavior you are seeing?

Hi, @dmcghan I'm working on a project where 30-40 requests come up within milliseconds.

And here's the flow at a high level...

This all works fine till a point when a lot of new connection open requests pop up connection gets timed out and the connection open takes some 15-90 seconds and sometimes a few milliseconds.

Now my belief in the above explanation is, that DB is unable to handle such an amount of connections and gets timed out but the DBA does not accept this explanation.

So what I did now, is I created a singleton class for both pools and used it, so in this case, a connection will open once and we will not close It from the code now with this implementation I know the connection will be expired or can be broken due to various reasons so singleton class should be so flexible that it can update the connection in singleton class and it's being done using oracledb.connection.ping().

But in this approach, I feel that query execution is slow for each request what do you think? And when 250+ subsequent requests lands on a single connection it gives ORA-01000 error.

Can you check how can I optimize the queries?

I have attached the OralceDB.txt file which holds JS code for my singleton OracleDB class.

OracleDb.txt

cjbj commented 2 years ago

@alichampion can you give the pool size values and also what size you have set for UV_THREADPOOL_SIZE? Also show how you set this environment variable.

What error numbers do you get?

What do your pool statistics show at the time your app has a problem?

Review all the node-oracledb documentation on pooling for some good information.

From a quick look at your code, you seem to be setting pool aliases but not using the aliases to access the pools themselves. This is probably OK (e.g. for logging) but it seems to be adding complexity. With pools you must close connections when you don't need them. Your original statement said you were doing this but your later experiment with "a connection will open once and we will not close It from the code now" isn't going to be usable or scalable.

alichampion commented 2 years ago

Hi, @cjbj As you can see it's being set in app.js line no 7

Screenshot 2022-04-19 at 12 24 15 PM

When working with a single connection upon a lot of requests in a millisecond I get an ORA-01000 error and working with multiple connections upon a lot of requests I get an ORA-03135 error and sometimes we get an ORA-2396 error.

And yes there is another file from where we create pools and open them on application startup.

I had concluded that the error is due to the connection problems with the DB as I told in my question, so I decided to open only one connection and don't close it but later I realized that queries are taking much time so IDK now what to do.

I have also thought of using Database Resident Connection Pool (DRCP) but we have some limitations from the DBA side so we can not do this.

Thanks.

cjbj commented 2 years ago

Since you're on Windows setting UV_THREADPOOL_SIZE inside the app has no effect. Even on Linux it's better to set it before the apps starts. See the doc: https://oracle.github.io/node-oracledb/doc/api.html#numberofthreads

Can you give the pool creation parameter values?

Have you reviewed how the pool alias is used in https://github.com/oracle/node-oracledb/blob/main/examples/connectionpool.js and https://github.com/oracle/node-oracledb/blob/main/examples/webapp.js ?

alichampion commented 2 years ago

Hi, @cjbj

let me try setting it up before the app start.

Here are the pool creation parameters and unfortunately, I can't share the values.

However, the poolMax value is 150.

Screenshot 2022-04-19 at 1 06 20 PM

And yes I have reviewed the documentation and I'm using pool aliases same as mentioned there.

Thanks

alichampion commented 2 years ago

Hi, @cjbj is it possible for you to have a quick call with me?

alichampion commented 2 years ago

Hey, @cjbj I have also noticed that when 250 concurrent requests are completed successfully and I send another 250 concurrent requests my docker container exits with error code 137 which means Not enough Ram is available.

I'm unsure if the docker exit problem is related to the open cursor.

Can you assist me, please?

Thanks in advance.

cjbj commented 2 years ago

Did you sort out the UV_THREADPOOL_SIZE setting? How are you now setting it?

Did you have solid architectural reasons for using dynamic pool sizes (where poolMin is not the same as poolMax) and the (small) pool increment size? See Connection Pool Sizing and try using fixed size pools.

Why do you need two pools?

Regarding cursors, are you using bind variables properly? I haven't seen anyone mention ORA-1000 for a long time so my first thoughts are that the app isn't pooling properly. For what it's worth, you can check out the old post https://blogs.oracle.com/opal/post/node-oracledb-avoiding-ora-01000-maximum-open-cursors-exceeded

Please share the pool stats I asked for before (Note that the stats show the UV_THREADPOOL_SIZE environment variable value but the Node.js thread pool may have started with a lower size - see the doc)

You might like to read https://cjones-oracle.medium.com/always-use-connection-pools-and-how-909bc2c65444

alichampion commented 2 years ago

Thanks, @cjbj I'll check your recommendations and let you know.

alichampion commented 2 years ago

Hi, @cjbj check this out. { "gatheredDate": 1650513341362, "upTime": 57495868, "upTimeSinceReset": 57482531, "connectionRequests": 750, "requestsEnqueued": 312, "requestsDequeued": 0, "failedRequests": 0, "rejectedRequests": 0, "requestTimeouts": 312, "maximumQueueLength": 63, "currentQueueLength": 0, "timeInQueue": 125141, "minimumTimeInQueue": 139, "maximumTimeInQueue": 1110, "averageTimeInQueue": 401, "connectionsInUse": 55, "connectionsOpen": 150, "connectString": "", "edition": "", "events": false, "externalAuth": false, "homogeneous": true, "poolAlias": "prime", "poolIncrement": 1, "poolMax": 150, "poolMaxPerShard": 0, "poolMin": 150, "poolPingInterval": 60, "poolTimeout": 60, "queueMax": 500, "queueTimeout": 60, "sodaMetaDataCache": false, "stmtCacheSize": 30, "user": "", "threadPoolSize": "350" }

I have followed your blog recommendations except idle connection killing, I feel I can not do this as it has to be done by DBA.

cjbj commented 2 years ago

Hard to comment without knowing the load profile.

Do some debugging and check you are closing connections. The stats show queuing but no dequeueing. So what is holding onto connections? Or is UV_THREADPOOL_SIZE not set.

Overall, reduce all the sizes (pool size, application load) to simplify the scenario and try again.

alichampion commented 2 years ago

I can not understand what you mean by load profile?

I'm releasing the connection as soon as the query is completed rather than closing it.

I have set the UV_THREADPOOL_SIZE in Dockerfile before starting the node process I believe that's the best way right?

pvenkatraman commented 2 years ago

The connectString, user fields are empty. It should show some valid values.

alichampion commented 2 years ago

I did it myself before sharing the stats.

alichampion commented 2 years ago

Hi, @cjbj I did some modifications like used OC-21, close connection after query is completed.

And here is the stats logs.

{ "gatheredDate": 1650875761037, "upTime": 109012, "upTimeSinceReset": 88087, "connectionRequests": 250, "requestsEnqueued": 100, "requestsDequeued": 0, "failedRequests": 0, "rejectedRequests": 0, "requestTimeouts": 100, "maximumQueueLength": 69, "currentQueueLength": 0, "timeInQueue": 34456, "minimumTimeInQueue": 228, "maximumTimeInQueue": 822, "averageTimeInQueue": 345, "connectionsInUse": 150, "connectionsOpen": 150, "connectString": "*", "edition": "", "events": false, "externalAuth": false, "homogeneous": true, "poolAlias": "", "poolIncrement": 1, "poolMax": 150, "poolMaxPerShard": 0, "poolMin": 150, "poolPingInterval": 60, "poolTimeout": 60, "queueMax": 500, "queueTimeout": 60, "sodaMetaDataCache": false, "stmtCacheSize": 30, "user": "****", "threadPoolSize": "350" }

I don't understand why 100 requests were requestsEnqueued and 100 requests were requestTimeouts.

To me I think Oracle should maintain the queue it self.

Let me know what should I do?

alichampion commented 2 years ago

Hi, @cjbj check this out u can see 150 is poolMin size but opened connections are 131 only and u can also see thread pool size.

{ "gatheredDate": 1650888622747, "upTime": 625843, "upTimeSinceReset": 604582, "connectionRequests": 250, "requestsEnqueued": 0, "requestsDequeued": 0, "failedRequests": 0, "rejectedRequests": 0, "requestTimeouts": 0, "maximumQueueLength": 0, "currentQueueLength": 0, "timeInQueue": 0, "minimumTimeInQueue": 0, "maximumTimeInQueue": 0, "averageTimeInQueue": 0, "connectionsInUse": 0, "connectionsOpen": 131, "connectString": "**", "edition": "", "events": false, "externalAuth": false, "homogeneous": true, "poolAlias": "**", "poolIncrement": 1, "poolMax": 150, "poolMaxPerShard": 0, "poolMin": 150, "poolPingInterval": 60, "poolTimeout": 90, "queueMax": 500, "queueTimeout": 0, "sodaMetaDataCache": false, "stmtCacheSize": 30, "user": "", "threadPoolSize": "350" }

alichampion commented 2 years ago

Hi, @cjbj

I also got this weird ORA-3156 error code.

error on 1234567_222 [Error: DPI-1080: connection was closed by ORA-3156] { errorNum: 0, offset: 0 }

alichampion commented 2 years ago

Hi, @cjbj

I also got this weird ORA-3156 error code.

error on 1234567_222 [Error: DPI-1080: connection was closed by ORA-3156] { errorNum: 0, offset: 0 }

@cjbj just a nudge on this one.

cjbj commented 2 years ago

@alichampion I've was on vacation. How are you going with this?

alichampion commented 2 years ago

Can you go through the last four questions/explanations?

stale[bot] commented 2 years ago

This issue has been automatically marked as inactive because it has not been updated recently. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 2 years ago

This issue has been automatically closed because it has not been updated for a month.

jatinseth2007 commented 1 year ago

Hello @cjbj,

I am facing the same issue on linux machine - DPI-1080: connection was closed by ORA-2396, "oracledb": "^5.4.0"

connection settings -> poolMax: 25, poolMin: 5, poolIncrement: 5, poolTimeout: 1800, poolPingInterval: 300

One thing, I wanted to understand, if we have poolMin as 5, poolTimeout as 1800, and poolPingInterval 300, in documentation, it's written as after poolTimeout seconds it will kill ideal connections and bring them back to poolMin but what about the connections it's holding as poolMin, will it kill those as well and recreates new to match poolMin?

Our Oracle has a setting to kill connections that are ideal for more than an hour, that's why we kept the poolTimeout as half of their time but still connections went to this error state.

cjbj commented 1 year ago

@jatinseth2007 I opened a new issue for you.