tediousjs / tedious

Node TDS module for connecting to SQL Server databases.
http://tediousjs.github.io/tedious/
MIT License
1.57k stars 436 forks source link

Intermittent lost connection to Azure SQL #300

Open tysjiang opened 9 years ago

tysjiang commented 9 years ago

I've been getting this error intermittently with tedious:

Fri Aug 14 2015 05:29:16 GMT+0000 (Coordinated Universal Time): Unaught exception: ConnectionError: Connection lost - read ECONNRESET at Connection.socketError (D:\home\site\wwwroot\node_modules\tedious\lib\connection.js:797:26) at Socket. (D:\home\site\wwwroot\node_modules\tedious\lib\connection.js:33:15) at Socket.emit (events.js:117:20) at net.js:441:14 at process._tickCallback (node.js:442:13)

The error stalls the entire node server and renders it unresponsive for around 10-15 minutes. Afterwards, the server recovers by itself automatically.

I'm on the latest tedious 1.12.2 and using it with the tedious-connection-pool module 0.3.8 (although the issue is present prior to tedious 1.12.2 and has been happening in all the versions for the past few months)

What is interesting is that it seems like the issue happens much more frequently these days than before.

I'm not sure if the above error message is helpful or not but if it's not, please inform me what other debug or log information I can get you good folks. I've been testing out tedious and besides this particular showstopper issue for me, everything looks great.

As for servers, I'm using node 32bit v0.10.36 (although I've tested this with the latest node 12 and the issue is present there too) connecting to Azure SQL (v12 DB)

Reubend commented 8 years ago

I will do a test using the SELECT 1 query on a timer. If it ends up working, I'll post back here. I still haven't found any procedure that forces this bug to appear, so it's hard for me to test properly.

GISDev01 commented 8 years ago

We are also experiencing this issue as well. This app is a typical NodeJS/Express API that queries data out of a large AzureSQL Database using this Tedious driver. Our workaround fix for now is catching the error as mentioned above, but it would be great if this was built in to the library. I'll see if I can find time to dig into this repo and put a patch in.

Reubend commented 8 years ago

Hey everybody, apologies for the huge delay since my last post. I tested the SELECT 1 query pinger as a way of avoiding this issue, and I can confirm that it does not work for me. I've tried catching the error before without much success, but based on the post by @GISDev01 maybe I will try again with some different code.

Reubend commented 8 years ago

My error catching code didn't seem to work either. Am I catching them incorrectly? Here's my code, based on the last few suggestions:

//Ping the database constantly
setInterval(function () {
    var updateRequest = database.request();
    var statement = "SELECT 1";
    updateRequest.query(statement, function (err, recordset) {
        if (err) {
            console.log("Database failure.");
        }
    });
}, 9000);

var initTime = process.hrtime()[0];

//Catch connection reset errors and reconnect to the database
sql.on('error', function (err) {
    if (err.message == "Connection lost - read ECONNRESET") {
        console.log("Connection reset bug appeared.");
        console.log("It took " + process.hrtime()[0] - initTime);
    }
    database.close();
    database = new sql.Connection(databaseConfig, null);
});
francolaiuppa commented 8 years ago

@Reubend you can try to console.log(err) and then see the exact string you get for err.message.

Also note that you don't need to manually reconnect (tedious should do it for you) only catching the error and doing nothing (i.e console logging) prevents tedious from breaking your app. Again... it is not a solution but a hack...

Something that I haven't tested is whether the SQL query gets retried when the app reconnects :fearful:

Reubend commented 8 years ago

Thanks @francolaiuppa for the tip! I'm going to remove the reconnection code and see if that helps.

trevorfrese commented 7 years ago

@Reubend did you end up solving this?

halfred888 commented 7 years ago

Anyone has a working solution for this ?

stern-shawn commented 6 years ago

Wondering if there is any solution to this besides the previously mentioned logging hacks :)

Suraiya-Hameed commented 6 years ago

Fix for transient failure (#574) submitted by @tvrprasad is already in master. Changes should be out in next npm release.

Chris-CBD commented 6 years ago

When is this new npm release coming out? I am struggling with this issue and it looks like it's been a problem for years.

jpalo commented 6 years ago

When using "SELECT 1" ping workaround, I get intermittent "ConnectionError: Failed to connect to xyz.database.windows.net:1433 in 10000ms" errors in the ping. ECONNRESETs seem to have stopped, though.

Suraiya-Hameed commented 6 years ago

@jpalo Can you test with tedious@2.1.1? It has the fix for intermittent failure :)

Suraiya-Hameed commented 6 years ago

Fixed by #574. Closing this issue. Feel free to open it if you face this issue with version >= tedious@2.1.1.

Kannaj commented 6 years ago

we're still seeing random connection resets on our end unfortunately.

the tedious version we're using is v2.2.1 . so i dont believe the update has resolved the issue (at least for us)

We're using stored procedures to handle a majority of the SQL queries. some of these queries and INSERT statements and therefore dont return anything. in cases such as these , do we have to manually close the connection between node and sql? something like connection.close() ?

Is there any other suggestion that you can provide?

Suraiya-Hameed commented 6 years ago

@Kannaj Even if your TSQL doesn't return anything, Request will emit either done, doneInProc or doneProc event, based on SQL query. You can use those events to close the connection. And, yes, your application has to decide when to close the connection and call connection.close() manually.

Regarding the connection reset issue, can you share more details? How is connection reset related to closing the connection? Just to confirm, does this involve Azure SQL Server?

arthurschreiber commented 6 years ago

@Kannaj Hey there! πŸ‘‹

I'm really sad to hear you're facing trouble with tedious. As @v-suhame mentioned, your issue description is unfortunately not enough for us to debug this. It'd be great if you could help us help you! πŸ˜„

If anyone else in this thread can answer these questions, feel free to do so as well! πŸ‘

jtmilne commented 6 years ago

I used to get this error using Azure SQL Server, but solved it by implementing automatic retries for any requests that failed with a transient error (I keep retrying until my TTL of 1 min expires using the async library). Note: each time you call query or execute on a request a new connection is acquired from the pool so this should solve any problems with individual connections that are acting up. Here is the code I use to test for transient errors.

function isErrorTransient(err) {
    var arrayTransientErrors = ['ENOCONN', 'ENOTOPEN', 'ESOCKET', 'ENOTFOUND', 'ENETDOWN', 'EADDRNOTAVAIL', 'ETIMEOUT', 'ETIMEDOUT'];
    return (err && 'code' in err && arrayTransientErrors.indexOf(err.code) > -1);
}

It would be very nice if the library provided an option on Requests to automatically retry with a new connection when a transient error occurred (with a TTL parameter).

Suraiya-Hameed commented 6 years ago

@jtmilne Thanks for sharing the workaround πŸ‘

@jtmilne @Kannaj Can you verify if the error occur only during idle connection (in the newer version of driver)? That would be of great help, we can rule out or consider idle connection resiliency based on it.

an idle connection is the one that is active (opened, but not in the pool) but it’s not executing a command or waiting for data.

From MSDN doc for other Driver,

Connection resiliency is the principle that a broken idle connection can be reestablished, within certain constraints. If a connection to Microsoft SQL Server fails, connection resiliency allows the client to automatically attempt to reestablish the connection.

jtmilne commented 6 years ago

@v-suhame Sorry but I wrote this fix 2 years ago so I don't remember the details of the errors.

Kannaj commented 6 years ago

Thanks for your prompt response guys

@v-suhame - my thinking is that the connection is being reset by Sql because the connection still persists after the procedure is executed. heres' the structure of my SQL queries

exports.do = function(type, params, callback){
    var conn = new Connection(connectionConf);
    conn.on('connect', function(err) {
        // if a connection error occured, then fail immediately
        if (err) {
            logger("warn", "db", "cannot connect to sql db", null, {"err":err});
            callback(err);
        }
        else
        {
            var request = new Request(type, function(err, rowCount){
                if (err) {
                    callback(err);
                }
                else if (rowCount == 0) {
                    callback();
                }
            });

            // set the parameters based on the request that was supposed to be executed
            switch(type) {
                case "procName":
                    request.addParameter('email', TYPES.VarChar, params[0]);
                    conn.callProcedure(request);                                    
                    break;            
                }

            request.on('row', function(columns) {
                var returnObject = {};

                // iterate and map the object to each column - the column name is in the metadata object
                columns.forEach(function(column){
                    returnObject[column.metadata.colName] = column.value;
                });

                // return with an error if the error occured and was caught by the procedure 
                if (columns[0].metadata.colName == 'error') {
                    callback(returnObject);
                }

                // return with the result set given by the procedure
                else {
                    callback(null,returnObject);
                }
            })
        }
    });

    conn.on('error', function(err){
        logger("warn", "db", "exception in tedious", null, {"err":err});
    })
};

Am i missing something out?

@arthurschreiber

Connection resets are quite random and hard to measure unfortunately. but we see one happen every few hours. regarding number of queries sent - we have around 10-12 requests happen per session

We have only one process connected to the SQL server

Kannaj commented 6 years ago

@v-suhame , @arthurschreiber - any idea about this? We currently out of solutions. :(

Any help or advice would be great

Suraiya-Hameed commented 6 years ago

@Kannaj Sorry for the delayed response. Is there any reason for not closing the connection after Request completes? If you're not reusing the connection, it is best to close it after use to release resources both in Server and Client side. If you want the connection to stay open, maybe you can try https://github.com/tediousjs/tedious/issues/300#issuecomment-355672780 as temporary solution, till we get to the root of it.

Is that an on-premise SQL Server or Azure SQL Server? Do share the SQL Server logs when you get them.

mattglover11 commented 5 years ago

I am getting this issue also using the latest tedious, v2 of the function app runtime on Azure and MS SQL server. I'm thinking the only solution might be to write a sync loop to try up to 5 times to connect before crashing. Anyone else found a solution?

marcindabrowski commented 5 years ago

I also got this error just after starting bulk load. And the cause was that I tried to insert text which had more characters than column allowed. When I altered column it started to work.

murffious commented 5 years ago

I go this error when using the AzureDB example code with Tedious and Node to make requests etc

wy193777 commented 5 years ago

Got same error when using newest tedious version with typeorm. It works half years ago, not sure why. I didn't do any bulk load, only some simple query that test if a table already existed.

wy193777 commented 5 years ago

Actually it's not intermittent. A very stable behavior.

MichaelSun90 commented 4 years ago

Hi @wy193777, Can you provide a reproduce script with the typeorm? I want to try it on my side, see if I got the same behavior. Thanks!

dfenerski commented 4 years ago

So... how do you close a connection? :D and why do connections 'crash' and go to sleep? no clear fix found.

cjorge commented 3 years ago

There can be two reasons for this issue to happen, both most likely linked to cloud based SQL.

The most likely reason is to implement code for a persistent connection in your database connection code. Probably something similar to:

if (connPoolPromise) return connPoolPromise;

You persist the promise but the database, if the connection (by any reason) has been rejected, cannot connect it will proceed to try to make the request using a non existent connection and throw an error Uncaught exception: ConnectionError: Connection lost - read ECONNRESET. This error would be thrown on your REQUEST and not on your CONNECTION. I'm assuming most of the cases will be due to this.

You could have transient errors on connection but this would be very unlikely... Still, a retry approach would be wise regardless. You can try this locally by trying to connect to the database by turning your wifi off and on... Ideally when you can't connect you should retry a couple of times with a few seconds interval. Something similar to this should so the trick:

async function sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
}

let connPooPromise = null;
let retries = 0;
while(retries<10){
        connPoolPromise = await CreatePool();
        if (connPoolPromise!==null) break;
        retries++;
        await sleep(2000);
    }
return connPoolPromise;