etolabo / kumofs

kumofs is a scalable and highly available distributed key-value store.
http://kumofs.sourceforge.net/
Other
301 stars 16 forks source link

Concurrency Control #1

Closed astro closed 14 years ago

astro commented 14 years ago

I know this is a very difficult topic, but I take from the source that concurrency control is already implemented on the server-side using timestamps.

Providing these mechanisms to the client-side with memcached's ADD/REPLACE and CAS values would broaden the spectrum of applications that can be made with KumoFS.

Thanks for that great software!

frsyuki commented 14 years ago

Thank you for your message. It is certain that the topic is very difficult topic.

It is true that a consistency control algorithm is implemented. It uses a combination of UNIX time (long-term but low precision) and Lamport's logical clock (high precision but short-term). The combined timestamp is used on replacing.

Kumofs moves stored data from a server to another server when the number of servers is changed (you attach new servers or detach crashed servers). I call the operation "replace".

And the server accepts Set operations from the clients during replacing. This means that the server has to decide which data (received from client and received from another server) is newer. The timestamp is used for the decision and it works fine.

The problem is that atomic operations (ADD, REPLACE, APPEND, CAS, etc.) require old values to set new value. For example, CAS operation requires old value to compare. ADD requires all old values to know there are no values associated to the key. Timestamp solve the version comparison of the stored values but doesn't solve this problem. Vector Clock also doesn't solve the problem.

One method to support these operations is that a server waits (stops) to finish the replacing and accepts Set operations after receiving all old values. But this method makes the availability worse.

Another method is that you accept some inconsistencies. For example, Two ADD operations succeeded concurrently (one of them should be failed) during replacing and one data is deleted. Hmm... I think it is not always acceptable.

astro commented 14 years ago

A case where it isn't acceptable: assume you use one key as a serialized list, there is no way to do an atomic push() or pop().

Can I assume that the is always one determinable master for a given key in the keyspace? Replace only happens on topology change?

If so, it would be possible to use at least CAS in some limited fashion. Unfortunately KumoFS' vector clocks are 64 bit while the CAS field in the memcached binary protocol is 32 bit.

frsyuki commented 14 years ago

Yes. Replace only happens on topology change. And finally, I've implemented CAS operation and released kumofs-0.4.0 :-)

The semantics of kumofs' CAS operation is that "the swapping always fails if the comparison fails". This means that the swapping may not succeed if the comparison succeeds. This restriction is caused by replacing. You are required to retry the operation if the swapping is failed. In addition, CAS on binary protocol is not supported.

I think these restrictions are reasonable for many cases, how do you think?

astro commented 14 years ago

That's plain awesome. Thank you!

I misread the MemcacheBinaryProtocol spec, CAS values are actually 64 bit indeed. That means the feature could be supported by memcache_binary.cc too.

I think memcached ignores CAS values of 0 and therefore overwrites a key no matter of its presence.

frsyuki commented 14 years ago

I've released kumofs-0.4.1. It supports CAS operation on binary protocol :-)

astro commented 14 years ago

Thank you very much!

http://blog.superfeedr.com/Database/NoSQL/OSS/open-source/kumofs-a-database-success-story/