vibur / vibur-object-pool

Vibur Object Pool - general-purpose concurrent Java object pool
http://www.vibur.org/vibur-object-pool/
Apache License 2.0
71 stars 16 forks source link

About objects housekeeping/maintenance (keep alive) #5

Closed agseco closed 6 years ago

agseco commented 6 years ago

Hi @simeonmalchev,

Firstly, congratulations for your objects pool implementation and thanks for publishing it; it's by far the cleanest & simplest I have found (including Apache Commons').

Here at Caravelo we are trying to use it to implement a connections pool to an Airlines Reservation System.

One of our requirements is that the idle connections must be kept alive before they expire (since they are expensive to create). Pardon me if I'm too naive, but do so I'm thinking of some sort of housekeeping scheduled task which does the following:

  1. drain all the idle connections
  2. send a keep alive request to the reservation system for each connection
  3. return the connections to the pool

I'm aware that Vibur Object Pool supports items validation through the PoolObjectFactory, however I'm afraid this would not be enough to meet the requirement I have just described. Can you provide some advice in this regard?

If it's not possible to do with Vibur Object Pool as of today, I'm thinking of forking the repository, adding housekeeping support (ideally following your indications/specs) and then send you a pull request.

I'll be awaiting your feedback.

Cheers.

simeonmalchev commented 6 years ago

@agseco, thanks for contacting me with regards to vibur-object-pool. Before we continue on your question, can I just make sure that you have seen https://github.com/vibur/vibur-dbcp which is the connection pool that is built on top of vibur-object-pool? The docs are here: http://www.vibur.org .

Is it possible that vibur-dbcp will provide to you all the functionality that you need? Vibur-dbcp has scheduling functionality for draining (or reducing) the idle connections, the connections can be validated before use if needed (depending on how long time they have stayed in the pool), keep-alive can be easily implemented, and it obviously it has all the borrow/return functionality of connections to the pool.

Or do you have some other special requirements that are preventing you from using vibur-dbcp?

Cheers, Simeon

agseco commented 6 years ago

Hi again @simeonmalchev,

Thanks for your prompt reply :)

Indeed I'm aware of the existence of Vibur DBCP, however I cannot use it since, as far as I have read it's limited to JDBC connections. I think I should have stated it before, but the objects pool I'm trying to implement is not a pool of database connections, but sessions in a SOAP web service.

Since Vibur DBCP uses Vibur Object Pool as main building block, I'll deep dive in the code of Vibur DBCP to find out how it achieves keep alive. Also I'll explore if I can benefit of Vibur Object Pool's SamplingPoolReducerin some way.

Your feedback will be much appreciated.

Cheers.

simeonmalchev commented 6 years ago

@agseco, yes, I got it now that you're trying to implement a SOAP web service sessions pool.

You can probably base your pool on vibur-object-pool similarly to how vibur-dbcp is based on vibur-object-pool. Your pool will obviously need to return a stateful-proxy objects, where the state will consists as a minimum of the open/closed property of the proxy object. Then when the close() method on the proxy object is called this will return the underlying proxied object back to the object-pool; this is how vibur-dbcp works, I'm sure you've seen this part already.

As to the draining/reducing of the idle connections in the underlying object-pool as well as to implementing keep-alive (or may be just validating a taken from the pool connection before use, assuming it has stayed in the pool for longer than some threshold time), all this functionality is best to be implemented in the session-pool that you're implementing, not in the object-pool itself. For example, see in the vibur-dbcp sources the PoolReducer class that extends SamplingPoolReducer. SamplingPoolReducer is just an utility class that starts internally one service thread that periodically (between sleeps) checks various stats of the object-pool and then periodically shrinks the pool size. Keep-alive can be implemented in a similar manner. Vibur-dbcp doesn't implemented keep-alive, it does however validate the taken from the pool connections if it has been configured to do so. Generally speaking, keep in mind that each keep-alive/validation of a connection will cause a round-trip to the remote service, so you probably wish to not do this very often.

Feel free to reuse any parts of vibur-dbcp sources that you may find useful for your pool. The overall structure of such pools will be similar but the implementation details will obviously be different. If you're finding the current source code base of vibur-dbcp too large, you can checkout some old tag, say 1.0.3, this will give you a shorter source code base to deal with - this is, of course, up to you.

Simeon.

agseco commented 6 years ago

Hi @simeonmalchev,

Many thanks for your reply.

My aim is to avoid modifications to the pool, if possible. In this line I think I'll do a scheduled task, similar to SamplingPoolReducer which:

  1. Tries to take all the objects available and save it to a temporary collection - a drawback is that it may force the early creation of all the objects until the max is reached; this is not ideal but acceptable
  2. Validates all the taken objects and return them to the pool, one by one - special care here with error handling

I have implemented a quick proof of concept, here a fragment corresponding to the logic described above (perhaps not the most elegant but OK for PoC/illustration purposes):

    private void doHouseKeeping()
    {
        log.info("Doing housekeeping");
        Collection<T> toValidate = new ArrayList<>();

        while (true) {
            T object = pool.tryTake();
            if (object == null) {
                log.info("Could not get more (got [{}]), breaking", toValidate.size());
                break;
            }

            toValidate.add(object);
        }

        if (toValidate.isEmpty()) {
            log.info("Nothing to validate, returning");
            return;
        }

        log.info("Beginning validation of [{}]", toValidate.size());
        for (T obj : toValidate) {
            validate(obj);
            pool.restore(obj);
        }
    }

The biggest flaw I see to this approach is that it may unnecessarily validate objects that have just been returned to the pool, but even though it is also acceptable in my case, it could be easily mitigated by conditionally validating the object only if it was used x time ago.

I have tested this proof of concept in a concurrent scenario with threads taking/restoring from/to the pool while interfering with the housekeeping task and it works fine so far, though I still will do more tests to make sure it performs and behaves correctly.

Any advice would be appreciated.

Cheers.

simeonmalchev commented 6 years ago

Sorry for the delay @agseco, it took me some time to come to this.

I'll offer you below a couple of options for how you can possibly re-write the doHouseKeeping() function in order to avoid the side-effects you mentioned, but I'll still suggest you to consider validating your pooled objects just before use depending on how long time they have stayed in the pool, in order to completely avoid the taking/returning of objects to the pool for purely validation purposes.

However, if you still need to have the explicit validation you can try something like:

    private void doHouseKeeping()
    {
        log.info("Doing housekeeping");
        Collection<T> toValidate = new ArrayList<>();

        int createdObjects = pool.remainingCreated();
       // this loop will iterate only over the currently created in the pool objects
        for (int i =0; i < createdObjects; i++)  { 
            T object = pool.tryTake();
            if (object == null) {
                break;
            }

            toValidate.add(object);
        }

        // The problem here is that at this moment we have taken all currently created objects from the     
        // pool, which number could be equal to the pool capacity, in which case we have exhausted 
        // the pool and until we validate and return an object to the pool all taker threads (if any) will 
        // be blocked waiting. 

        if (toValidate.isEmpty()) {
            log.info("Nothing to validate, returning");
            return;
        }

        log.info("Beginning validation of [{}]", toValidate.size());
        for (T obj : toValidate) {
            // the validate() method should employ proper socket timeouts to avoid long waits
            boolean valid = validate(obj); // I'm assuming that validate() does return a boolean result
            pool.restore(obj, valid);
        }
    }

Alternatively, if want to avoid the possible problem with exhausting the pool which I mentioned in the middle of the above function, you can try something like:

    private void doHouseKeeping()
    {
        log.info("Doing housekeeping");

        int createdObjects = pool.remainingCreated();
       // this loop will iterate only over the currently created in the pool objects
        for (int i =0; i < createdObjects; i++)  { 
            T object = pool.tryTake();
            if (object == null) {
                break;
            }

            // See the comments after the function source...

            // the validate() method should employ proper socket timeouts to avoid long waits
            boolean valid = validate(obj);  // I'm assuming that validate() does return a boolean result
            pool.restore(obj, valid);
        }
    }

In the above case is very important that the pool has been created with ConcurrentLinkedQueueCollection not with ConcurrentLinkedDequeCollection. I.e., the pool is created with something like:

pool = new ConcurrentPool<>(new ConcurrentLinkedQueueCollection<>(), new SomeObjectFactory(), initialSize, maxSize, true);

When the pool is created with ConcurrentLinkedQueueCollection the internal structure used by the pool to store objects will be a queue (LIFO), i.e., the last taken object (for validation or use) will be returned at the end of the queue. Alternatively, if the pool was created with ConcurrentLinkedDequeCollection the internal structure used by the pool to store objects will be a stack (FIFO), i.e., the last taken object (for validation or use) will be returned at the front of the stack and will be the first used/taken again via the next call to take().

Hope this helps a bit.

agseco commented 6 years ago

Hi @simeonmalchev,

Many thanks for your detailed feedback/response, I really appreciate it 😍

Just to clarify, I'm now realising that I poorly named the validate(obj) method; my intent was to use a generic name, but here is were I would do the keep alive WS call to prevent the pooled sessions from expiring. This is a hard requirement and it's imposed by the Web Service provider (Sabre).

Finally, I just got a bit confused by the last part; I guess that you wanted to mean that ConcurrentLinkedQueueCollection is FIFO and ConcurrentLinkedDequeCollection is LIFO (stack), is this correct?

Cheers.

simeonmalchev commented 6 years ago

Yes :) ConcurrentLinkedQueueCollection is FIFO and ConcurrentLinkedDequeCollection is LIFO (stack), I mixed this up.

agseco commented 6 years ago

Thanks for your support & advice @simeonmalchev, I really appreciate it. And congrats again for your work ;)

Best.