goncalotomas / FMKe

🛠️ Realistic benchmark for key value stores

Possible incorrect state with concurrent operations goes undetected #172

Closed goncalotomas closed 6 years ago

goncalotomas commented 6 years ago

The first approach to implementing a working driver for FMKe used nested CRDTs, such as counters and registers inside maps, which were themselves nested inside top-level CRDT maps, in order to emulate entity objects.

Some time later, because nested state-based CRDTs performed poorly, a normalized data model was devised that splits state across several keys to avoid deep nesting.

In either version, concurrent writes to the application may leave it in an incorrect state, as in the following trace:

Patient at pharmacy gets drugs from prescription - happens at DC1
Doctor adds drugs to the prescription - happens at DC2

The final state will include information about the processing of the prescription, which should in effect not allow drugs to be added. Not only that, but the end result will make it seem like the patient was given the drug that was added concurrently. There are other similar issues that only occur in the normalized data model, where multiple entities keep a record of the prescriptions associated with them, separated by their processed and open status.
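The trace above can be reproduced with a minimal sketch. It assumes a simplified prescription whose drug list merges as a grow-only set and whose processed flag merges with a logical OR; the class and method names are hypothetical and are not taken from the FMKe drivers, whose actual CRDT types may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prescription:
    """Hypothetical prescription replica: a drug set plus a processed flag."""
    drugs: frozenset = frozenset()
    is_processed: bool = False

    def add_drug(self, drug):
        # Doctor adds a drug to the prescription.
        return Prescription(self.drugs | {drug}, self.is_processed)

    def process(self):
        # Pharmacy hands over the drugs and marks the prescription processed.
        return Prescription(self.drugs, True)

    def merge(self, other):
        # Assumed state-based merge: set union for drugs, OR for the flag.
        return Prescription(
            self.drugs | other.drugs,
            self.is_processed or other.is_processed,
        )


initial = Prescription(frozenset({"aspirin"}))
dc1 = initial.process()              # happens at DC1
dc2 = initial.add_drug("ibuprofen")  # happens concurrently at DC2
merged = dc1.merge(dc2)

# The merged state is processed, yet it lists a drug that was never dispensed.
print(merged.is_processed, sorted(merged.drugs))
```

Neither replica sees a conflict, and the merge succeeds in both directions, which is why the incorrect state goes undetected under these assumptions.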

Currently we use a flag inside the prescription object to mark it as processed (closed). We could instead treat this flag as derived state by modeling a prescription as two lists: one of dispensed medication and another of drugs that have not yet been obtained. The prescription would then be considered closed when the list of undispensed drugs is empty. Without the flag, however, operations such as get_processed_prescriptions would have to traverse the entire key space when no indices are available, which most key-value stores handle poorly.
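A rough sketch of the derived-state idea follows. It assumes the prescribed and dispensed drugs are each kept as a grow-only set merged by union, with the undispensed list computed as their difference; `DerivedPrescription` and the dict-backed `get_processed_prescriptions` are hypothetical stand-ins, not the FMKe API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DerivedPrescription:
    """Hypothetical model: closed status is derived, not stored."""
    prescribed: frozenset = frozenset()
    dispensed: frozenset = frozenset()

    @property
    def undispensed(self):
        return self.prescribed - self.dispensed

    @property
    def is_closed(self):
        # Closed exactly when nothing is left to dispense.
        return not self.undispensed

    def add_drug(self, drug):
        return DerivedPrescription(self.prescribed | {drug}, self.dispensed)

    def dispense_all(self):
        # Dispense only what this replica currently knows about.
        return DerivedPrescription(self.prescribed, self.dispensed | self.prescribed)

    def merge(self, other):
        # Assumed merge: union of both grow-only sets.
        return DerivedPrescription(
            self.prescribed | other.prescribed,
            self.dispensed | other.dispensed,
        )


def get_processed_prescriptions(store):
    # Without a flag or an index, every key has to be inspected.
    return [key for key, p in store.items() if p.is_closed]


initial = DerivedPrescription(frozenset({"aspirin"}))
dc1 = initial.dispense_all()         # happens at DC1
dc2 = initial.add_drug("ibuprofen")  # happens concurrently at DC2
merged = dc1.merge(dc2)

print(merged.is_closed, sorted(merged.undispensed))
print(get_processed_prescriptions({"p1": dc1, "p2": merged}))
```

Under these assumptions the concurrently added drug keeps the merged prescription open instead of appearing as dispensed, at the cost of a full scan in `get_processed_prescriptions`.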

goncalotomas commented 6 years ago

This depends on the configuration (i.e. the driver and database used), as different storage systems provide different guarantees. In the benchmark documentation we must mention any observed anomalies, but it is not our concern to classify a database as "not consistent enough". It will be up to the developers testing these systems to decide whether those anomalies are acceptable or not.