vapor / async-kit

Sugary extensions for the SwiftNIO library
MIT License
71 stars 25 forks source link

thread safe connection pool #40

Closed tanner0101 closed 5 years ago

tanner0101 commented 5 years ago

@MrMage I've moved to coarser grain locking and performance seems to be unaffected 👍

Screen Shot 2019-10-17 at 4 01 50 PM

The latest push also includes the "event loop preference" mechanism I mentioned.

tanner0101 commented 5 years ago

@MrMage thanks again for the great reviews

FYI, using popLast here will cause the first element in the event loop to hardly be used when the pool is sufficiently large.

Good point. I've changed the available connections storage to be a CircularBuffer. That means we can popFirst() and append at O(1).

More generally: What if a connection gets closed (e.g. by the server) while it is not in use (i.e. while it is inside available)?

requestConnection will always attempt to replace closed available connections once before throwing.

Your idea might actually be a better algorithm though. With the current algorithm, activeConnections will never decrease. So that means if you get a lot of load once, your pool will stay at maxConnections forever after.

I think this would make sense as a separate issue though: https://github.com/vapor/async-kit/issues/41

It sounds like this could still return connections from a different event loop if one is available. I guess that can hardly be avoided, but I would recommend to try having a separate connection pool per event loop container when possible.

For DB connections, I agree. Ensuring there are connections available for each event loop and only using those makes sense since they are in such high demand. However, for something like an HTTPClient, it might be faster to use a connection on a different event loop than wait for a new connection to be made.

I want this connection pool to be generally applicable to both of those use cases and to let the end user choose their behavior. For example, imagine a user has a very high logical core count 16-core processor w/ hyperthreading = 32 event loops. Now imagine for some reason they are connecting to a DB that allows 8 connections open at once. W/ the current connection pool design (on master) where each event loop must have its own pool, this use case is blocked.

That's why I think the event loop preference is a good compromise:

public enum ConnectionPoolEventLoopPreference {
    // any event loop is fine for connection and return
    case indifferent
    // the future return must be on the supplied event loop (and preferably the connection, too)
    case delegate(on: EventLoop)
    // the connection and future return must be on the supplied event loop
    case connection(on: EventLoop)
}

This API is more expressive of what the user wants / tolerates which gives the connection pool more flexibility to grow and optimize its performance going forward.

For instance, you could add a configuration option in the future to ensure that DB connections are split equally across available event loops. Something like:

enum ConnectionPoolEventLoopDistributionStrategy {
    // any distribution of connections across event loops is fine
    case indifferent
    // connections must be split evenly across event loops
    // max connections must be a multiple of the total event loop count
    case evenly
}
tanner0101 commented 5 years ago

Thinking about this more, I think that for pools that require a connection operating on the same event loop, it would be easier to just keep the pattern of providing one pool dedicated to each event loop. That way, case connection(on: EventLoop) (which would complicate things quite a bit) could be dropped altogether.

That makes sense to me. I'm also not sure we really need connection(on: EventLoop). I'm just happy it's a possibility in case we need it in the future.