mperham / connection_pool

Generic connection pooling for Ruby
MIT License
1.63k stars 143 forks source link

Support different types of connection in a pool #53

Closed drbrain closed 10 years ago

drbrain commented 10 years ago

This allows connection_pool to manage heterogenous connections in a pool. Connections may be of different classes. Connections are distinguished by hash equality on a description object that is used as parameters for creation of a connection. Old connections waiting in the pool are reclaimed if connections of a different type are needed.

The LRU for connection reclamation is ruby 1.9+ only. I can't tell if connection_pool is supposed to support ruby 1.8 or not. While the specification doesn't indicate, the tests use require_relative which is 1.9+.

I have not updated the documentation as I wish to await approval of my API changes.

If connection_pool supports only 1.9+ it can use keyword arguments to retrieve the timeout option (instead of the options hash). This would remove the clumsy argument swapping.

mperham commented 10 years ago

It's 1.9+ only

On Feb 21, 2014, at 16:18, Eric Hodel notifications@github.com wrote:

This allows connection_pool to manage heterogenous connections in a pool. Connections may be of different classes. Connections are distinguished by hash equality on a description object that is used as parameters for creation of a connection. Old connections waiting in the pool are reclaimed if connections of a different type are needed.

The LRU for connection reclamation is ruby 1.9+ only. I can't tell if connection_pool is supposed to support ruby 1.8 or not. While the specification doesn't indicate, the tests use require_relative which is 1.9+.

I have not updated the documentation as I wish to await approval of my API changes.

If connection_pool supports only 1.9+ it can use keyword arguments to retrieve the timeout option (instead of the options hash). This would remove the clumsy argument swapping.

You can merge this Pull Request by running

git pull https://github.com/drbrain/connection_pool different_types Or view, comment on, or merge it at:

https://github.com/mperham/connection_pool/pull/53

Commit Summary

ConnectionPool::TimedStack is now lazy Allow multiple connection types in TimedStack Reclaim old connections in TimedStack Add connection_args support to ConnectionPool Support connection_args in Wrapper#with File Changes

M lib/connection_pool.rb (24) M lib/connection_pool/timed_stack.rb (55) M test/test_connection_pool.rb (62) M test/test_connection_pool_timed_stack.rb (86) Patch Links:

https://github.com/mperham/connection_pool/pull/53.patch https://github.com/mperham/connection_pool/pull/53.diff — Reply to this email directly or view it on GitHub.

mperham commented 10 years ago

Why store different types within the same pool? A pool is a limited collection of identical things to be shared by a larger number of users. I'm not convinced this is a good direction.

Could you instead compose a higher-level pool based on connection_pool primitives or use a subclass?

drbrain commented 10 years ago

For background, here is a simplification of my intended use in net-http-persistent (nhp hereafter):

class Net::HTTP::Persistent
  def initialize
      @pool = ConnectionPool.new do |host, port|
        Net::HTTP.new host, port
      end
    end
  end

  def get uri
    @pool.with uri.hostname, uri.port do |http|
      http.get uri
    end
  end
end 

There are two general ways to use the above code:

For the first I could use connection_pool without changes, but there is a problem. If nhp is used inside an API wrapper library which is then used by someone who wants to interact with several web APIs they may need to think about how many file descriptors they're using (at least, without #52, lazy connections).

For the second it becomes much more difficult for me without some patch to connection_pool (this occurs especially inside mechanize). With connection_pool as-is it's not possible for me to make a reasonable guess at how many connections a user wants to a particular HTTP server. #52 will help with this, but I still need a way to reclaim a connection across all my pools when I meet my file descriptor limit.

For my purposes, either #52 plus #53 or #52 plus a separate pull request to allow connection reclamation would work. I think allowing users of connection_pool to kill connections would be equally invasive but less functional overall. I think it makes more sense to allow connection_pool to manage large groups of connections that are related, but that don't necessarily connect to the same place.

If combined with a future pull request that would reclaim long-unused connections (high/low water marks, I think) connection_pool could be used in a load-balancing manner. Instead of connecting to a single server a user would be able to choose a set of DB read-only slaves, memcache servers, etc.

mperham commented 10 years ago

I'm happy to pull in #52. Several people have asked and I've had to work around the eager loading on occasion.

What do you think about composing a higher-level pool based on connection_pool primitives or using a subclass? Maybe a VariableConnectionPool which allows pooling heterogeneous objects which vary by one or more dimensions, with configurable limits per variant and per total count. I'm trying to avoid injecting too much complexity into one class, when this is not a common requirement. You are the first person to ask for this type of configuration so I'm leery to just pull it in as is. I'd rather see nhp subclass connection_pool with its own more complex impl.

drbrain commented 10 years ago

A separate class is fine with me.

Would you prefer me committing the TimedStack changes along with a new class or just the TimedStack changes for use elsewhere?

How about ConnectionPool::Multi? This will keep everything in the same namespace.

djanowski commented 10 years ago

Can this live in a separate gem?

It looks like you'd be making NHP 'simpler' by making connection_pool more complex without much gain for most of its users.

mperham commented 10 years ago

@drbrain I don't think we should pull in the TimedStack changes.

drbrain commented 10 years ago

That's fair, but NHP will need to grow complexity too either way. I figured I would try contributing to connection_pool first as it already has almost all the parts I need.

Right now it is just a single-connection pool. Making it a multi-connection pool is a matter of 50 lines or so of core functionality (TimedStack) combined with the API to wrap it.

drbrain commented 10 years ago

My previous comment ↑ was in response to @djanowski. Let me poke around with my changes some more to see if there's a way to add the same feature via subclass. If they're still unacceptable then perhaps a separate gem (maybe connection_pool-multi) will be a way forward.

drbrain commented 10 years ago

How about #55 instead of this pull request?

With #55 I can implement my own multi-connection pool. (See #55 for an example)