ibmdb / node-ibm_db

IBM DB2 and IBM Informix bindings for node
MIT License
192 stars 150 forks source link

When server drops connection, empty result set returned after only first SQL30081N #327

Closed smchamberlin closed 5 years ago

smchamberlin commented 7 years ago

If the connection is dropped for any reason (I'm able to simulate this with "db2 force application all" on my DB2 server), then the next subsequent attempt to query from that connection will give an appropriate SQL30081N error. However, any attempts after that will erroneously return an empty result set. I feel this is misleading, as it is impossible to distinguish at this point whether the connection is actually working, with an empty result set from the query, or if the connection is still broken and requires a reconnect.

rhtpandeyIN commented 7 years ago

Can you please share the repro steps/program for this? Also what platform and Db2 server you are using?

smchamberlin commented 7 years ago

Sure, i'm connecting to DB2 v10.5.200.109, fix pack 2, 64-bit on Windows. The recreate is to connect with db.openSync, then query with db.query. Rows from my test table are returned fine (data is in the "rows" var of the callback). Then I break the connection with "db2 force applications all" on the server. The first time I try db.query, I get the 30081N value in the err of the callback. Any subsequent calls to the db.query API will not return an err but will give empty rows in the callback.

Is that enough information? And is that working as designed? Instead I would like to have some way to know the connection is dead so that I can try to recover and reconnect. Right now the behavior is indistinguishable from a live connection that is returning an empty result set from the query.

Please let me know if you have any additional questions.

smchamberlin commented 7 years ago

Hi @rhtpandeyIN , can you please confirm if this is working as designed, or real issue ?

Thanks

rhtpandeyIN commented 7 years ago

Please send me a repro test case for this issue, I tried but not able to repro. Thanks in advance.

smchamberlin commented 7 years ago

Ok, just to be clear: when you try it, you get a SQL30081N error every time you try to access a dropped connection? When I try it, i start getting empty result set after the first SQL30081N returned.

bimalkjha commented 7 years ago

@smchamberlin Once you get SQL30081N error, means connection is broken and no longer usable. You need to reconnect, get a new connection and re-execute the query. It is working as designed.

You can change the time interval at last line of test case test-bad-pool-connection-issue-42.js and execute it to test the behaviour. Test case ibm_db/test/test-bad-pool-connection-issue-42.js is there to test the same scenario. Thanks.

NilsGregersen commented 7 years ago

@bimalkjha Hi, we are dealing with this issue through Node-RED. The problem is, that you get the SQL30081N error only one time. After that you get empty results and can't see if the connection is working or not. Why doesn't it throw the SQL30081N error until there is a new connection?

And the problem for @smchamberlin is, that he tries a reconnect. And will not know, if the reconnecting was successful or not. Because if not, he will just get an empty result, but not the SQL30081N error.

@bimalkjha What would you recommend to check, if the reconnect was successful? Because we need to check that and if the reconnecting did not work, we need to try reconnecting again and again.

Thanks in advance.

smchamberlin commented 7 years ago

@bimalkjha I understand your test case, but it is a simple case where a connection is temporarily interrupted and restarted between attempts. What if the connection is down and stays down? Do you have a test case for this? I agree that returning a SQL30081N is appropriate the first time, but why would you return an empty result set on subsequent query attempts on that same broken connection before it is repaired? With this behavior, as NilsGregersen mentioned, it is impossible to know if a recovery or reconnection was successful or not. I think that all attempts to access a broken connection should return an appropriate error code until the connection is repaired, do you agree? Thanks.

bimalkjha commented 7 years ago

@NilsGregersen To know a connection is valid or not, connection is successful or not, check the value of conn.connected. It is used in may test files to check the connection state. See ibm_db/test/test-open-close.js for example. When you use a disconnected connection, you'll get empty result as of current behaviour. When you reattempt connection after SQL30081N, check the value of conn.connected to know connection is successful or not. Thanks.

smchamberlin commented 7 years ago

Hi @bimaljha , I was not aware of this feature, but sounds useful for our needs, why is it not listed in the API documentation here ?

https://www.npmjs.com/package/ibm_db

Do we need to add that ?

smchamberlin commented 7 years ago

Also, I noticed you didn't close this issue yet, though you did say it was working as designed. While it is true that measures can be taken at the application layer to check the integrity of the connection, I think it is dangerous to return what appears to be an empty result set if the connection is not even valid. Why not return an error indicating that the connection has been broken? Is that a feature enhancement instead?

PavaniKa commented 6 years ago

Hi @bimaljha We are using pool in our nodeJs application, and having the same issues. So I'm checking the conn.connected property value and it is always true even after the SQL30081N problem. And these connections are returning empty result sets.

If this is the current behavior to return empty results, please let us know what should we expect in the next release.

bimalkjha commented 6 years ago

@PavaniKa Are you able to run ibm_db/test/test-open-close.js and see incorrect value of conn.connected. Please share a sample test program using which we can see conn.connected property is not showing correct state. I'll try to update code to return some error instead of empty result set. Will look into that so kept the issue open yet. Thanks.

smchamberlin commented 6 years ago

Hi @bimaljha , is this test passing for you ?

I tried running it today, but getting the following error:

Stevens-MacBook-Pro:test smc2$ node test-open-close.js /Users/smc2/node_modules/bindings/bindings.js:91 throw err ^

Error: Could not locate the bindings file. Tried: → /Users/smc2/node-ibm_db/build/odbc_bindings.node → /Users/smc2/node-ibm_db/build/Debug/odbc_bindings.node → /Users/smc2/node-ibm_db/build/Release/odbc_bindings.node → /Users/smc2/node-ibm_db/out/Debug/odbc_bindings.node → /Users/smc2/node-ibm_db/Debug/odbc_bindings.node → /Users/smc2/node-ibm_db/out/Release/odbc_bindings.node → /Users/smc2/node-ibm_db/Release/odbc_bindings.node → /Users/smc2/node-ibm_db/build/default/odbc_bindings.node → /Users/smc2/node-ibm_db/compiled/6.6.0/darwin/x64/odbc_bindings.node at bindings (/Users/smc2/node_modules/bindings/bindings.js:88:9) at Object. (/Users/smc2/node-ibm_db/lib/odbc.js:31:31) at Module._compile (module.js:556:32) at Object.Module._extensions..js (module.js:565:10) at Module.load (module.js:473:32) at tryModuleLoad (module.js:432:12) at Function.Module._load (module.js:424:3) at Module.require (module.js:483:17) at require (internal/module.js:20:19) at Object. (/Users/smc2/node-ibm_db/test/common.js:1:74) Stevens-MacBook-Pro:test smc2$

I think maybe something not setup right in my environment, can you please advise? Thanks.

bimalkjha commented 6 years ago

@smchamberlin Yes all test files ibm_db/test/test-*.js is working fine for me on all platforms. You need to update connection string in test/config.testConnectionStrings.json file before running test files. From the error pasted above, it seems some problem with you installation. Please run npm install -g ibm_db again and then run the test file. Thanks.

PavaniKa commented 6 years ago

@bimaljha Thanks for the response. We will wait for the next release with the fix.

The ibm_db/test/test-open-close.jspasses for us too. But this test is not using a connection pool, but we are using pool in our code. I think the conn.connected property is not getting updated correctly for pooled connections in this scenario.

I will also try to create a different test case when the time permits and submit here.

rjkuro commented 6 years ago

We too are suffering from this problem - after the Db2 scheduled maintenance this month, some connections got stale and return empty result instead of connection errors. Our micro-service has no way to detect if the connection is alive or not, and the problem is observed somewhat intermittently. Only when the micro-service grabs the stale connection, empty result is shown and otherwise everything works fine. It's great if this problem is addressed soon. I am happy if I can be of any help.

kasarol commented 6 years ago

Related issue (depending on this issue): https://github.com/smchamberlin/node-red-nodes-cf-sqldb-dashdb/issues/11

The issue is critical when using Node-RED with node-red-nodes-cf-sqldb-dashdb module as there is no way to check the conn.connected nor if a query produced an error or not. You simple lose all your data until the application is manually restarted. It could be worked-around in that module by forcing reconnection, but at the moment it has not been done.

huineng commented 6 years ago

indeed , when a backend db is being restarted or taken down even for a while, the pool still contains the close connections and we have to make inefficient code to re init the pool etc..

It would be an extremely nice feature if something could be build in that could retry a connection Especially for pools

bimalkjha commented 6 years ago

@huineng Please go through this comment to automatically retry a connection internally. Thanks.

adriantanasa commented 6 years ago

We've got the same error on Bluemix for Db2 on cloud. The connection is reporting a SQL30081N error for the first query after connection is dropped. For the following queries is retrieving successful empty responses. The conn.connected is not automatically updated after the first failure - so we cannot rely on this to reconnect. The work around for us is to manually update conn.connected status once we catch the first error, but I think this should be handled by the ibm_db2 module.

@bimalkjha Do you have instructions for applying automatic retry for Db2 on Cloud (previous DashDb)?

bimalkjha commented 6 years ago

@adriantanasa Probably you have not checked this comment that has the same solution as you are looking for. I have given this link in my last update too. Thanks.

huineng commented 6 years ago

The problem i have with this solution is that database need to be hardcoded . We have one code for our 3 environments dev test and prod and all have a different database environment details passwords and are kept in secrets like kubernetes or docker (one should never display connection details , user id and passwords and store them in git

so i still belief the ultimate solution would to add this to the code with some variables like

retry x times every y minute for z time in case of connection error

PavaniKa commented 6 years ago

@bimalkjha We are getting the correct error from Database, but after the connection is returned to pool any subsequent database calls which uses this connection will get empty result set irrespective of the query/procedure. The scenario is like this...

  1. UI makes an API call
  2. API created a new database connection (Connection Pooling enabled)
  3. API triggered a long running query/procedure from API
  4. Database interrupts the query/procedure
  5. API gets the error back and prints the log
  6. Connection is returned to the pool
  7. UI makes another API call
  8. Empty result sets are returned when API uses the same connection from pool

Please note, we also tried to replicate this behavior by not using a connection pool. And we do not see the empty result sets being returned.

bimalkjha commented 6 years ago

@PavaniKa May I know your platform on which you are working. If it is non-windows, can you install using command npm install git+https://git@github.com/ibmdb/node-ibm_db.git and see any change? Thanks.

bimalkjha commented 6 years ago

@huineng providing userid and passwd in configuration file is optional. You can have all other information in db2dsdriver.cfg file. For three different database, you can create three different dsn in the cfg file. Thanks.

asoj1987 commented 6 years ago

@bimalkjha - I too have the same issue of connection drop/SQL30081N and I have gone through the solutions you recommended to configure the db2dsdriver.cfg with connection details.

The issue I am having is, The way my infra/setup work doesn't allow me to edit the node modules code as it is downloaded in real time from my company's internal repo during deployment. So I am unable to take the changes in db2dsdriver.cfg to Prod.

Is there any way I can write this auto connect logic in my app js file where I have all the code for connection pooling.

See below.

const ibmdb = require("ibm_db"); const db2config = require("../config/db2Config.js"); const connString = db2config.db2SPConfig.dB2ConnectionString; ibmdb.debug(true);

let Pool = require("ibm_db").Pool, pool = new Pool(), cString = connString;

let ret = pool.init(20, cString); if (ret != true) { return false; } pool.setMaxPoolSize(50);

exports.callDB2 = async (query, SPRequest) => { try { pool.open(cString, function (err, conn) { if (err) { return err; } else { conn.query(query, SPRequest, function (err, data) { if (err) { console.log("Error while calling the Stored Proc"); conn.close(function () { console.log("conn closed"); }) return err; } else { console.log("DB2 Stored Proc call successful"); conn.close(function () { console.log("conn returned to pool"); }); return data; } });

        }
    });
} catch (error) {
    throw error;
}

};

bimalkjha commented 5 years ago

@asoj1987 This issue is fixed in ibm_db@2.5.2. Hence closing the issue now. Thanks.

jajuanMitchell commented 5 years ago

The fix works for all but one case. The case where you get the connection from and invoke setIsolationLevel . It gives the following error give CLI0108E Communication link failure. SQLSTATE=40003 . The code fix in ibm_db does not handle the case where you are setting the isolation level.

jajuanMitchell commented 5 years ago

Does this fix work for anyne else on ibm_db@2.5.3 ?

bimalkjha commented 5 years ago

@jajuanMitchell Yes, one of the IBM API Connect customer and IBM Strongloop team was facing similar issue and they confirmed that problem is not reproducible on ibm_db@2.5.3. Regarding your comments on other issues and isolation level scenario, we are looking into all comments. Thanks.

walfreyn commented 5 years ago

@bimalkjha regarding the issue mentioned by @jajuanMitchell , we opened issue 569 (URL below). Do you mind kindly help us to look into it? Thanks a lot. Really appreciate that.
https://github.com/ibmdb/node-ibm_db/issues/569