oracle / node-oracledb

Oracle Database driver for Node.js maintained by Oracle Corp.
http://oracle.github.io/node-oracledb/
Other
2.26k stars 1.08k forks source link

DPI-1001: out of memory #1186

Closed zymon closed 4 years ago

zymon commented 4 years ago

Selecting many columns of type VARCHAR2(4000 BYTE) causes 'out of memory' error. I have a query that selects 22 columns, 6 of them are defined as VARCHAR2(4000), 14 are of type NUMBER, and several others as VARCHAR2 (100 BYTE, 400 BYTE, 3 BYTE, 1 BYTE) Total length of all character columns in bytes is: 24 504 bytes. When removing one column from the select I'm able to get the results. But after reloading the page several times I get the 'out of memory' error again. Looks like some buffer gets overflown and it's not cleared correctly.

anthony-tuininga commented 4 years ago

Please provide the table creation SQL and the code that you are running to get the error. We will also need the Oracle client and database versions you are using, the node-oracledb version you are using and the platforms you are using. We have performed many much larger queries (with varchar2(32767) columns) and not run into any difficulties.

zymon commented 4 years ago

Hi, Attached is the script that creates the table. I've removed all constraints definitions. The problem happens also when the table is empty, so looks like it fails when tries to allocate the buffer based on the metadata column definition.

OS: Red Hat Enterprise Linux 7, version 7.6 Oracle client: Client Shared Library 64-bit - 19.3.0.0.0     System name: Linux     Release: 4.1.12-124.19.2.el7uek.x86_64     Version: #2 SMP Fri Sep 14 08:59:15 PDT 2018     Machine: x86_64

DB version: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Node-oracledb: 4.1.0 Node version: 10.16.3

Attachments:

measure.pks.txt measure.pkb.txt table_create.txt

Code that we execute: const statement = 'BEGIN dfm_measure.getMeasures(:p_country_id, :o_cur); END;' const bindings = [{"dir":oracledb.BIND_IN,"type":oracledb.STRING,"val":"AT"},{"dir":oracledb.BIND_OUT,"type": oracledb.CURSOR}]

conn.execute(statement, bindings, executeOptions)

Exception: 'Error: DPI-1001: out of memory'

anthony-tuininga commented 4 years ago

So what is the value of executeOptions?

zymon commented 4 years ago

Sorry, forgot about it:

const executeOptions: oracledb.ExecuteOptions = { outFormat: oracledb.OUT_FORMAT_OBJECT, autoCommit: true, }

Do you have any clue when DPI-1001 might happen?

cjbj commented 4 years ago

@zymon please give a runnable .js script that reproduces the problem.

cjbj commented 4 years ago

@zymon we do want to solve this, so the sooner you give us a testcase, the sooner we can review it. Opening a new issue with no extra info didn't give us the testcase we need to understand the problem. Review https://stackoverflow.com/help/minimal-reproducible-example

zymon commented 4 years ago

Hi, I've created a sample file but single file works fine. The problem starts when running it with express.js and pm2. I've played with pool size and the stmtCacheSize. Setting the stmtCacheSize to 0 caused that application worked longer but finally most of the requests started failing with the DPI-1001 error.

dbconfig.js.txt package.json.txt index.js.txt

dmcghan commented 4 years ago

Create a test case that uses Express and PM2. Then share it with us with instructions on how to reproduce the error.

zymon commented 4 years ago

I've managed to get the same error with running just node (node index.js). Attached is the working example. This program runs the same procedure in a loop (the procedure doesn't return any data) 10000 times and never reaches the end - max was 600. Sometimes fails after 200 iterations, sometimes just after 20. In our real case this procedure is not called so often but we're executing many other stored procedures and then it fails even quicker. I've tried to play with setting the statement cache size but it helps only for some time and then Node crashes anyway with DPI-1001. PM2 only makes this problem more visible as it crushes faster. And it is failing faster when more instances are running (in cluster mode).

How actually the Node shares the memory with the ODPI-C library?

out_of_mem.zip

dmcghan commented 4 years ago

While I haven't run the code yet, I'm spotting lots of issues - some small others more serious. In the async function storedProc, why are you using promise chains explicitly?

I'm not sure when I'd be able to give you a rewrite of storedProc, but if you'd like to do it yourself sooner, here are the goals:

  1. Eliminate the promise chain in the async function. Async functions are all about simplifying the code.
  2. Do all your work in a try-catch-finally statement as shown here: https://jsao.io/2017/07/how-to-get-use-and-close-a-db-connection-using-async-functions/#async-function-demo-app
  3. Eliminate use of Promise.all, that's probably the biggest problem in the code and likely responsible for the issues you're seeing. When you use Promise.all, you think you're running code in parallel, but you're not. Connections act as serialization devices, allowing only one operation at a time. What you end up doing, in reality, is throwing the responsibility of properly handling async work in your code to the JavaScript engine to do it for you (I've never seen this end well). Instead, use basic for loops with await to iterate over your result sets getting the values you need and then close them. This will not run any slower.

See if you can do the rewrite and let us know how it goes. If I can find some free time in the coming weeks (I'm supposedly on vacation soon), I'll give it a shot. But it's probably best for you to do it anyway as a learning exercise.

zymon commented 4 years ago

Ok. Thanks a lot. Will try.

zymon commented 4 years ago

Here's version without Promise.all. Still fails, but uses less resident memory.

index.zip

dmcghan commented 4 years ago

@zymon Wow, great work, this is looking much better! I can kick the tires with this...

Just so you know, the explicit promise you created on line 11 doesn't really add anything so you could simplify the code even more. Keep in mind that async functions return promises automatically. If an error is thrown the promise is rejected, and if a value is returned it is resolved.

There are times when explicit promises in async functions add value, this just isn't one of them.

dmcghan commented 4 years ago

I ran a quick test and it finished without error. I increased the main loop to 1,000 and that also finished without error. Here are my final stats:

Db result count: 0 Db procedure called 1000:: heap used: 5.92 MB, heap total: 9.73 MB, rss: 58.52 MB, external: 0.01 MB Before closing pool:: heap used: 5.93 MB, heap total: 9.73 MB, rss: 58.52 MB, external: 0.01 MB Connection closed Pool closed After closing pool:: heap used: 5.96 MB, heap total: 9.73 MB, rss: 55.18 MB, external: 0.01 MB Before GC:: heap used: 5.97 MB, heap total: 9.73 MB, rss: 55.2 MB, external: 0.01 MB After GC:: heap used: 5.97 MB, heap total: 9.73 MB, rss: 55.2 MB, external: 0.01 MB

Our versions of node-oracledb are the same, as are our versions of Node.js. However, I'm on mac with database 19c and using Instant Client version 18.1.

Let me try a loop with 10,000.

dmcghan commented 4 years ago

The 10,000 count also finished fine, but did use more memory:

Db result count: 0 Db procedure called 10000:: heap used: 10.46 MB, heap total: 15.73 MB, rss: 68.05 MB, external: 0.01 MB Before closing pool:: heap used: 10.46 MB, heap total: 15.73 MB, rss: 68.08 MB, external: 0.01 MB Connection closed Pool closed After closing pool:: heap used: 10.49 MB, heap total: 15.73 MB, rss: 64.54 MB, external: 0.01 MB Before GC:: heap used: 10.5 MB, heap total: 15.73 MB, rss: 64.55 MB, external: 0.01 MB After GC:: heap used: 10.5 MB, heap total: 15.73 MB, rss: 64.55 MB, external: 0.01 MB

dmcghan commented 4 years ago

I upgraded my Instant Client to 19.3 for Mac, pretty much the same:

Db result count: 0 Db procedure called 10000:: heap used: 10.63 MB, heap total: 15.23 MB, rss: 62.75 MB, external: 0.01 MB Before closing pool:: heap used: 10.63 MB, heap total: 15.23 MB, rss: 62.79 MB, external: 0.01 MB Connection closed Pool closed After closing pool:: heap used: 10.66 MB, heap total: 15.23 MB, rss: 59.36 MB, external: 0.01 MB Before GC:: heap used: 10.67 MB, heap total: 15.23 MB, rss: 59.39 MB, external: 0.01 MB After GC:: heap used: 10.67 MB, heap total: 15.23 MB, rss: 59.39 MB, external: 0.01 MB

I don't see any memory leaks, just fluctuations with garbage collection.

cjbj commented 4 years ago

Aside from memory use, the given code has a security hole because storedProc() doesn't sanitize the name parameter. You must make sure that name is actually a procedure name, and doesn't contain any arbitrary PL/SQL code from a malicious user.

zymon commented 4 years ago

Thank you for your checks. I've run the script on another machines and it's working fine. This seems to be related to the VMWare we're using.

zymon commented 4 years ago

I found it interesting that there's really big difference in terms of used rss memory when using Promise chain and Promise.all vs no promises.

dmcghan commented 4 years ago

Can you show us the difference? I'm just curious to see what you're seeing... :)

zymon commented 4 years ago

Sure. Attached is the excel with all data and charts. Memory_comparison.xlsx

dmcghan commented 4 years ago

Wow, that is quite a difference! Thanks for sharing.

AchrFerc commented 2 years ago

Hello guys, could you please specify what might be the reasons behind the "DPI-1001: Out of memory" appearance in general? Thx.

anthony-tuininga commented 2 years ago

Any memory that is allocated internally by ODPI-C will raise this error when insufficient memory is found (effectively, malloc() fails). You can find out which is the problem by setting the environment variable DPI_DEBUG_LEVEL to the value 8 and then run the script that results in the issue.

If you have further questions about this, please open a new issue. Thanks!