arthurnn / memcached

A Ruby interface to the libmemcached C client
Academic Free License v3.0
432 stars 127 forks source link

Add support for multi-entry CAS #152

Closed fbogsany closed 10 years ago

fbogsany commented 10 years ago

Memcached.cas only supports CAS on a single entry, requiring many roundtrips for bulk operations.

This adds support for multi-entry CAS using memcached_mget and memcached_fetch_rvalue. The API:

@arthurnn, @camilo review pls?

camilo commented 10 years ago

cc/ @tobi

tobi commented 10 years ago

:+1:

camilo commented 10 years ago

@csfrancis halp with this since @arthurnn is not around

csfrancis commented 10 years ago

This looks good to me. :sailboat:

camilo commented 10 years ago

:sheep: when travis is happy

arthurnn commented 10 years ago

I am not 100% happy with the code TBH, but I cannot think in a better way to do this =/ .. @evan any thoughts on this?

evan commented 10 years ago

Some comments:

There's a lot of redundant code between get()/cas() and singlegetcas()/multigetcas()/multicas(). I would modify the existing get() implementation to always track cas values if cas is enabled on the client, and conform to the current API by wrapping it with an outer method. Then I would modify cas() to dispatch to the inner get() method in both cases, which is similar to what you have.

I think the API for multicas is wrong. Similar to multiget, I would suggest having it return a hash of keys and values that were successfully updated, and elide any that returned DataExists. This way the hash can be used to update a process cache, or inspected to see which keys were missing.

Performance will be poor until the cas operation itself is pipelined. That will require changes in the C extension.

We will need to do a performance pass but we need to get the interface and organization correct first.

It would be nice to implement multiset/multiadd/multiincr and friends as well according to a similar strategy, so that the API remains regular.

fbogsany commented 10 years ago

Updated with suggested refactoring.

The 1.8.7 build failure in https://travis-ci.org/evan/memcached/builds/24119997 is unrelated to these changes. I'm not able to restart the build, though.

evan commented 10 years ago

This looks better to me. I would move the retry logic into the outer get() and cas() methods for simplicity. Also, I wouldn't delegate to single_cas() within multi_cas(). The exception throw and rescue is unnecessary work.

fbogsany commented 10 years ago

Moving the retry logic to the outer get() is fine, but the single_cas() calls should be retried separately (from each other and from the get() calls). Each call to Lib.memcached_cas may fail, and the whole *_get()/cas() sequence shouldn't be retried for a single failure.

If multi_cas doesn't delegate to single_cas, the retry logic could get complicated. A single failed Lib.memcached_cas call should be retried independently. Should only options[:exception_retry_limit] tries be attempted per multi_cas, or per Lib.memcached_cas call?

evan commented 10 years ago

[Edited]

I think you are right because a successful individual cas operation can't be re-run. Restarting the operation and succeeding the second time would return a different result set. However, the total retried errors should be per-function, not per-cas, otherwise worst-case latency is unpredictable.

What will happen if some keys passed to the initial cas multiget are not found?

fbogsany commented 10 years ago

Limited total retried errors per-function.

If keys passed to the multiget aren't found, they're not yielded from cas. If a key then expires before the memcached_cas call, it won't be updated and won't be included in the results returned from cas.

arthurnn commented 10 years ago

cc @shuhaowu