MasoniteFramework / orm

Masonite ORM is a beautiful Python ORM. It's also a nearly drop in replacement of the Orator ORM
https://orm.masoniteproject.com
MIT License
163 stars 48 forks source link

Add connection pooling #483

Open josephmancuso opened 3 years ago

josephmancuso commented 3 years ago

Right now, Masonite ORM makes a connection and drops a connection on every query call. This is fine in most instances but if you have several application making the same connections to the same database, you'll want to set a connection_pool_limit. Like 100 connections. Then we can pull from the connection pool rather than making a new connection every time

josephmancuso commented 3 years ago

we're also going to need some kind of timer delay for waiting for a connection to be open

federicoparroni commented 2 years ago

Hello @josephmancuso! Another enhancement I can suggest is to make Masonite thread-safe. If I open a transaction in a thread and another thread is doing queries, I get InterfaceError from pymysql. Do you have a workaround at the moment? I can see that ConnectionResolver is storing the open connections as a class field indexed by connection name: https://github.com/MasoniteFramework/orm/blob/8d907a7cc534537019016ba6426ce3adf9543e63/src/masoniteorm/connections/ConnectionResolver.py#L4-L8 Is there any way to segregate the open connections in order that each thread can access only its own connections?

josephmancuso commented 2 years ago

If you can give some code examples I can test it out and figure out the issue. The connection resolver just holds the available connection classes. I don't think that should be a thread issue but I could be wrong

federicoparroni commented 2 years ago

I will try to provide a minimal example as soon as possible since I have a quite complex application. Looking at the code, I think that the issue is the shared _connections field. When a transaction is opened, the connection is stored in that dictionary (line 60): https://github.com/MasoniteFramework/orm/blob/8d907a7cc534537019016ba6426ce3adf9543e63/src/masoniteorm/connections/ConnectionResolver.py#L47-L62 Then, the connection is reused when a connection is required (line 68-69): https://github.com/MasoniteFramework/orm/blob/8d907a7cc534537019016ba6426ce3adf9543e63/src/masoniteorm/connections/MySQLConnection.py#L45-L83 In summary, I suppose that connections are shared between different threads...

josephmancuso commented 2 years ago

Hmm if you are using transactions than that's possible it's trying to access a connection in another thread

circulon commented 2 years ago

Hmm if you are using transactions than that's possible it's trying to access a connection in another thread

Would this effect be due to every call to the db when creating a model or querybuilder instance or just getting DB (connectionresolver with config) for Transaction use generates a new instance of the ConnectionResolver?

Shouldn't the ConnectionResolver be cached in some way to utilise the same connection during multiple consecutive queries?

Or am I just looking at this wrong?

josephmancuso commented 2 years ago

So theres a few definitions that need to be defined. Theres a few different "connection" references here.

  1. Connection Resolver. This is kind of a god object in the library responsible as a central store that parts of the ORM can use to store and retrieve things.

  2. Connection Classes - These are the classes with the logic responsible for calling the specific database. These are standalone classes that get registered to the connection resolver. These are instantiated on every new connection to the database.

  3. Connections - These would be the actual connections that exist once a connection is made to the database. For example, you go to make a query to a MYSQL database and it opens a connection for the duration of the query and then closes it after.

With queries we open and close a connection on every single query. I would assume this would be thread safe since the thread doing the query also should be the one creating and closing the connection class (3.).

With transactions the current implementation is to store the "connections" (the 3. here) and then save the connection globally so we can keep the transaction open for the duration of the transaction and then once the transaction is committed or rolled back we close this open transaction from the connection.

I think what @federicoparroni is suggesting is that doing that can result in 2 threads sharing the same "connection" class (3.) and then one thread closing the connection while another thread still is using it

federicoparroni commented 2 years ago

Yes @josephmancuso, it's exactly that situation. My current workaround for this issue is to create ad-hoc connections with unique names for each thread, so that when a transaction begins, it is only used by that single thread. I'm using the on method to force all the queries in the transaction context to use that unique connection, but I am also experiencing some problems with this approach (see https://github.com/MasoniteFramework/orm/issues/726)

josephmancuso commented 2 years ago

Yes @josephmancuso, it's exactly that situation. My current workaround for this issue is to create ad-hoc connections with unique names for each thread, so that when a transaction begins, it is only used by that single thread. I'm using the on method to force all the queries in the transaction context to use that unique connection, but I am also experiencing some problems with this approach (see #726)

I think the solution is to probably use SAVEPOINTS instead of using the cursor on the connection class but i'll have to research that a bit.

stoutput commented 1 year ago

sqlalchemy uses something called a "session factory" which creates a new sqla session if one does not exist, or returns the current session if it does. Session instantiation involves obtaining a new connection. This session object provides thread-safe property access to things like the current transaction and its db connection. Something akin to that could be of use here – but I think thread safety should probably be another issue altogether though it's important to think about in this context.