mariadb-corporation / mariadb-connector-nodejs

MariaDB Connector/Node.js is used to connect applications developed on Node.js to MariaDB and MySQL databases. MariaDB Connector/Node.js is LGPL licensed.
GNU Lesser General Public License v2.1
368 stars 91 forks source link

Is createPool actually async? #180

Open whitebyte opened 2 years ago

whitebyte commented 2 years ago

Consider the following code:

        try {
            this.pool = mariadb.createPool(<any>{
                host: this.dbConfig.dbHost,
                port: this.dbConfig.dbPort,
                user: this.dbConfig.dbUser,
                password: this.dbConfig.dbPassword,
                database: this.dbConfig.dbName,
            });

            await this.pool.query('SELECT 1 FROM DUAL', []);
        } catch (e) {
            logger.error('DB init error');
        }

If the DB is unavailable (e.g. because of a network failure) I expect to see DB init error in console. In practice, I see it AND something like this:

(node:215728) UnhandledPromiseRejectionWarning: Error: retrieve connection from pool timeout after 10001ms
    at Object.module.exports.createError (/home/a/projects/wah-api/node_modules/mariadb/lib/misc/errors.js:61:10)
    at timeoutTask (/home/a/projects/wah-api/node_modules/mariadb/lib/pool-base.js:319:16)
    at Timeout.rejectAndResetTimeout [as _onTimeout] (/home/a/projects/wah-api/node_modules/mariadb/lib/pool-base.js:342:5)
    at listOnTimeout (internal/timers.js:556:17)
    at processTimers (internal/timers.js:497:7)
(Use `node --trace-warnings ...` to show where the warning was created)
(node:215728) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting
 a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/a
pi/cli.html#cli_unhandled_rejections_mode). (rejection id: 28)
(node:215728) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with
a non-zero exit code.
Error: connect ECONNREFUSED 127.0.0.1:3335
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1146:16)
 From event:
    at _registerHandshakeCmd (/home/a/projects/wah-api/node_modules/mariadb/lib/connection.js:745:11)
    at /home/a/projects/wah-api/node_modules/mariadb/lib/connection.js:57:11
    at new Promise (<anonymous>)
    at Connection.connect (/home/a/projects/wah-api/node_modules/mariadb/lib/connection.js:56:16)
    at createConnectionPoolPromise (/home/a/projects/wah-api/node_modules/mariadb/lib/pool-promise.js:31:8)
    at creationTryout (/home/a/projects/wah-api/node_modules/mariadb/lib/pool-base.js:373:9)
    at listOnTimeout (internal/timers.js:554:17)
    at processTimers (internal/timers.js:497:7) {
  errno: -111,
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 3335,
  fatal: true
}

I took a brief look at the connector code and it seems createPool uses https://github.com/mariadb-corporation/mariadb-connector-nodejs/blob/bc9f6c73bb9f1ffd3306f31bb061965b67b7f447/lib/pool-base.js#L138, which doesn't look asynchronous for me

rusher commented 2 years ago

this is the second time there is an issue with typescript (https://jira.mariadb.org/browse/CONJS-171) that cannot normally occur. I cannot explain this for now, i need to dig into this. Could you just indicate TypeScript version and target version ?

whitebyte commented 2 years ago

"typescript": "^4.1.3" "target": "es2019"

The issue is reproducible with these versions, but I don't believe it's related to a specific TS version

whitebyte commented 2 years ago

I actually dug into this a bit. This is what mariadb.createPool callstack looks like:

mariadb.createPool
    new PoolPromise <- not a promise, despite the name. Nothing async happens there
        .initialize -< inherited from PoolBase

Everything is called synchronously. Now, this is what we have in PoolBase.initialize:

  this.initialize = function () {
   // ...snip
    connectionCreationLoop(self, 0, timeoutEnd)
      .then((conn) => {
        ...

So connectionCreationLoop returns Promise, which in turn calls createConnectionPool that also returns Promise, but we never await for them. So mariadb.createPool doesn't return a pool, it returns Promise<pool>, but since it doesn't annotated as Promise, we can't await for it. I'm not sure this is the reason for this specific bug, but it doesn't look good anyway

TomMiller-mas commented 2 years ago

Newbie here, assuming you do get this fixed, can you also do a TS example?
I have tried to convert a hundred of these JS examples to proper TS using mariadb with no luck. I continually get an error on let pool; as an and getPool() not being a function. I am now wondering if this bug is contributing to these bugs I am getting (or I just suck at TS).

let pool: maraidb.Pool;

This doesn't error out, but it doesn't seem to solve the problem either. Desperate for some help.

Thanks!

// db.js
var mysql = require('mysql');
var pool;
module.exports = {
    getPool: function () {
      if (pool) return pool;
      pool = mysql.createPool(dbConfig);
      return pool;
    }
};

// app.js
var pool = require('pool').getPool();
pool.query('SELECT * FROM users');
whitebyte commented 2 years ago

var pool = require('pool').getPool(); pool is actually not exported, so require('pool') doesn't make a lot of sense to me. Anyway, it's a separate issue that has nothing to do with createPool being async or not.

TomMiller-mas commented 2 years ago

This was just an example. It is JavaScript and MySQL2 driver. I am just trying to duplicate the intent of the code in TypeScript with MariaDB. I will open a new ticket. Like I said, I am new and having a hard time with getting the syntax right.

ekr3peeK commented 2 years ago

@whitebyte managed to resolve this? trying the same thing, and it is really annoying, that the createPool function, despite it being not asynchronous, calls an asynchronous function which can fail - now, the error could be possibly catched, but the success response is much harder this way.

whitebyte commented 2 years ago

managed to resolve this?

Unfortunately no

RevealedFrom commented 2 years ago

Would updating the Type definition for createPool to Promise<pool> solve the problem?

A script with just these two lines will not terminate:

import mariadb from "mariadb";
const pool = mariadb.createPool({ host: "localhost", database: "x", user: "x", password: "xxx"});
rusher commented 2 years ago

To be complete: the pool is not asynchronous => the pool object is immediately returned, but therefore the loading of connections into the pool is asynchronous.

It was originally done this way for compatibility with mysql2 /mysql (with promise-mysql). now mysql2 still uses the same format when promise-mysql now returns a promise since 5.0.

There are 2 possible format for mariadb.createPool :

At first I thought it was a bad design and would prefer the Promise solution, but I changed my mind: what if there was a network problem just when the server started? then promise to return an error and you have to restart the app for the pool to work?! Since the latest version, the connector will let you know if it fails to create a connection, but the pool will work immediately after fixing this problem without any application restart.

Don't hesitate to express your opinion! Another solution could also be to offer both solutions:

rusher commented 2 years ago

for @RevealedFrom

import mariadb from "mariadb";
const pool = mariadb.createPool({ host: "localhost", database: "x", user: "x", password: "xxx"});

this is expected: pool is active. I thing node.js doesn't end when there is an active timer, and pool set an interval for validating existing connection. You usually use pool for server, so you usually don't want to close pool, but if that fit your purpose pool.end() will close pool, and your script will end.

whitebyte commented 2 years ago

return Pool object, and if the connection fails, the connection returns an appropriate connection error to help identify the problem

Why would one want a pool that can't provide a connection? IMO a pool without at least one successfully established connection has no reason to exist. Also it makes a false impression that the DB is connected and operational, which is not true in this case

gnat commented 2 years ago

@rusher Please open the issue tracker over at https://github.com/mariadb-corporation/mariadb-connector-python ; Hosting issues on Jira is a high friction way to ruin adoption; you're just going to get one-off accounts asking one-off questions, with no real collaboration or discussion.

Tangentially related, coming from the Python library, also concerned about this as there's no usage of asyncio anywhere in the connector. Really unsure if I can use the official libraries because I fear it's gonna block the event loop. :eyes:

Doesn't help the Django adoption story because Django is moving to full async for 5.0.. and all the other popular ASGI frameworks (FastAPI, Starlette, Sanic)

rusher commented 2 years ago

@gnat Question are usually ask and answer on stackoverflow, so i don't think that's a big deal. I've not the priviledge to do that.

What you indicate concerns Python driver and is clearly some new feature, and i won't be the right guy for that. So please describe you problem here : https://jira.mariadb.org/projects/CONPY/issues/CONPY-197?filter=allopenissues . Python database spec are synchronous. I'm not completly sure but i think green thread are implemented in python. If that's the case, using synchronous spec + green thread would probably be easier and better than an async implementation. anyway please create your idea and problem.

gnat commented 2 years ago

As much as I also find it interesting to think what the python ecosystem would look like to have settled on green threads, the built-in asyncio is the standard now, and is what is being used by all of the ASGI frameworks, and upcoming versions of Django.

MariaDB support in Django, etc. is just going to get stuck behind an async MySQL driver if the team is going to keep this kind of friction, because no new project is going to commit to using a blocking connector when the whole rest of the stack is async.

TomMiller-mas commented 2 years ago

I don't see any problem with the creation of the pool being synchronise. But everything about the connections has to be asynchronise. The pool should get created and destroyed once. No big deal. But the connection will be gotten and released thousands of times.

jamiehodge commented 1 year ago

It appears as though it is necessary to add an on error listener to pools and decide whether or not to terminate the process based on the fatal property?! Like https://github.com/mariadb-corporation/mariadb-connector-nodejs/issues/200, I struggled to understand why uncaughtException errors were being triggered in apparently otherwise healthy processes (with multiple pools). Adding pool.on("error", (error) => { ... }) gave some semblance of control. It would be great to have something approaching a best practice document (or paragraph) around this. I understand that a pool creates and discards many connections during its lifecycle, but to what extent is it self-healing and when is it time to exit the process?

These errors turned out to be the result of host misconfiguration.

whitebyte commented 1 year ago

Is there any update on this? I wouldn't mind working on it by myself and opening a PR, but what's the consensus? @rusher

TomMiller-mas commented 1 year ago

There are other problems in pool bigger than this and I don't see the benefit of converting the create process of the pool that happens one time at startup to async. There is no way to force a reset of the connection as you are returning it to the pool and the prepare object can linger. I already have a ticket in for the prepare object, but would love to have a .release(true) to force a hard reset of the connection as it is being returned to the pool.

PeppeL-G commented 1 year ago

Why would one want a pool that can't provide a connection? IMO a pool without at least one successfully established connection has no resaon to exist.

@whitebyte Because the database might not be ready to accept connections when the app starts, but it will be ready to accept connections later (within a few seconds). This is a common problem when using Docker: the web app starts immediately, but the database needs longer time to start, so the web app can't establish an connection to it immediately.

Also, that you manage to connect to the database when the pool is created is no guarantee that you will be able to connect to it later (maybe the DB crashes after 1 minute), so you always need to deal with DB connection errors every time you use it anyway, so your reasoning a pool without at least one successfully established connection has no resaon to exist I simply disagree with; I don't want my web app to crash just because the database has crashed; my web app should continue to function on its own (i.e. show internal server error message), and when the database has been fixed and restarted, the pool should automatically established a new connection to it the next time the web app requests a connection from it.

So, in my opinion, a pool with no functional connections is useful. It's also much easier to implement and learn how to use the package if the pool is just a pool, and doesn't try use connections on its own. And if you need to test if the connection to the DB works upon pool creation, you can still very easily do that yourself by obtaining a connection from the pool after pool creation.

Also it makes a false impression that the DB is connected and operational, which is not true in this case

That is only true if you have the belief that the pool is responsible for connecting to the database upon creation. Where have you got that belief from? And, you can also look at this the other way: by giving an error when the pool can't connect upon creation, that can give the false impression that at least one connection in the pool will always work if you don't get the error (and that is wrong, since the database can crash on its own after a while).

whitebyte commented 1 year ago

@PeppeL-G I actually don't have a strong opinion on this one. My point is that the current behavior is misleading. Either sync or async, but pick one. Currently createPool is annotated as sync but is async under the hood. This is the reason I opened this issue in the first place

PeppeL-G commented 1 year ago

Boy, am I stupid or what. I came here looking for clues to a weird behavior I'm experiencing, and I got stuck on your question Why would one want a pool that can't provide a connection?, but now that I read your first post, I realize you're post is precisely about the same weird behavior I'm experiencing xD

In short, the problem can be summarized as this (see the comments, the code below is all code in the program):

const mariadb = require('mariadb')

// 1. I have no Maria DB database up and running, but the
// pool is created without problem, as expected.
const pool = mariadb.createPool({
    host: "localhost",
    port: 3306,
    user: "root",
    password: "abc123",
    database: "abc",
})

// 2. The program continues to run.
  1. After some seconds, the program crashes with the following error:
node:internal/process/promises:279
            triggerUncaughtException(err, true /* fromPromise */);
            ^

Error: connect ECONNREFUSED 127.0.0.1:3306
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1187:16)
 From event:
    at /Users/peter/mariadb-crashes-with-no-connection/node_modules/mariadb/lib/connection.js:133:13
    at new Promise (<anonymous>)
    at Connection.connect (/Users/peter/mariadb-crashes-with-no-connection/node_modules/mariadb/lib/connection.js:121:12)
    at Pool._createConnection (/Users/peter/mariadb-crashes-with-no-connection/node_modules/mariadb/lib/pool.js:402:16)
    at Pool._doCreateConnection (/Users/peter/mariadb-crashes-with-no-connection/node_modules/mariadb/lib/pool.js:40:10)
    at listOnTimeout (node:internal/timers:559:17)
    at processTimers (node:internal/timers:502:7)
Emitted 'error' event on PoolPromise instance at:
    at Pool.emit (node:events:527:28)
    at /Users/peter/mariadb-crashes-with-no-connection/node_modules/mariadb/lib/pool.js:258:22
    at processTicksAndRejections (node:internal/process/task_queues:96:5) {
  errno: -61,
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 3306,
  fatal: true,
  sqlState: 'HY000'
}

I'm guessing the Promise version of the code was written before Node.js version 15, when unhandled promise rejections were simply silently ignored, but from version 15 and on they instead stop the process, so we can become aware of that something is wrong.

So, to summarize (IMO):

  1. createPool() should remain being synchronous (i.e. immediately return back the pool), as it is now
  2. createPool() may do some asynchronous work if it wants (for example to establish connections upon creation, so it has one available as soon as someone requests one from it)
  3. But if that asynchronous work fails, it must handle its own errors, instead of crashing the app through an uncaught exception

So a fix to (3) in the list above would make me happy :)

rusher commented 1 year ago

ouch, that wasn't clear for me either. this concerns 3.x only, and due to :

If an EventEmitter does not have at least one listener registered for the 'error' event, and an 'error' event is emitted, the error is thrown, a stack trace is printed, and the Node.js process exits.

As a workaround, you can just add :

pool.on('error', () => {});

this will be corrected in 3.1.1

deguich commented 1 year ago

Hi, an update on index.d.ts to add

 on(ev: 'error', callback: (err: Error) => void): Pool;

Could be good to let the typescript developer catch error too. Thx

akshaysasidrn commented 6 months ago

Hello folks, is this issue already resolved in the latest (3.2.3)?

Earlier any error from the pool was thrown async and not handled within the client code. After the upgrade, it no longer throws an error and seems to be captured and logged to stdout. Which I assume is handled here? If yes, just curious to know why there can be multiple errors when trying to establish connection such that they are to be captured and handled only internally within the client?

For example, if I give an incorrect port to establish a connection I first get an error which I can handle:

(conn=-1, no: 45028, SQLState: HY000) retrieve connection from pool timeout after 10001ms\n    (pool connections: active=0 idle=0 limit=5)

And then after some time, I get this logged separately, which I'm not handling (in the earlier version this used to crash the node server):

SqlError: (conn=-1, no: 45012, SQLState: 08S01) Connection timeout: failed to create socket after 1002ms
    at Object.module.exports.createFatalError (/Users/akshay/code/ToolJet/plugins/node_modules/mariadb/lib/misc/errors.js:88:10)
    at Connection.connectTimeoutReached (/Users/akshay/code/ToolJet/plugins/node_modules/mariadb/lib/connection.js:1033:24)
    at listOnTimeout (node:internal/timers:571:11)
    at processTimers (node:internal/timers:512:7)
 From event:
    at /Users/akshay/code/ToolJet/plugins/node_modules/mariadb/lib/connection.js:141:13
    at new Promise (<anonymous>)
    at Connection.connect (/Users/akshay/code/ToolJet/plugins/node_modules/mariadb/lib/connection.js:130:12)
    at Pool._createConnection (/Users/akshay/code/ToolJet/plugins/node_modules/mariadb/lib/pool.js:406:16)
    at Pool._doCreateConnection (/Users/akshay/code/ToolJet/plugins/node_modules/mariadb/lib/pool.js:42:10)
    at listOnTimeout (node:internal/timers:569:17)
    at processTimers (node:internal/timers:512:7) {
  sqlMessage: 'Connection timeout: failed to create socket after 1002ms',
  sql: null,
  fatal: true,
  errno: 45012,
  sqlState: '08S01',
  code: 'ER_CONNECTION_TIMEOUT'
}