Open leezu opened 6 years ago
Hi Leezu, thanks for the interest in my library and in my research. Currently I am on vacation, but I will get back to you once I have returned (the week after next).
Approximate Counters seem like a very valid storage option, and I think there could be other cases where updates might have to be cancelled, e.g., when computing the min of different values. I will try to make the feature available in some way.
The case with complex keys is somewhat more difficult. Currently there is no short term plan, to make this feature available. The main problem is the interface -- how to reduce copies/how to handle pointer ownership. If I have a good idea, this could happen sooner rather than later, but there are no immediate plans.
Cheers, Tobias'
Hey Tobias, first thanks a lot for your work on this library!
I noted you made extensive experiments in your paper regarding the performance under contention for a subset of keys (ie. Zipf distributed keys). In the particular case where one wants to count the occurrence of the keys in some dataset, contention can be reduced by using statistical counters (e.g. Dice et. al. "Scalable statistics counters."). Such counters provide a tradeoff between statistical guarantees about the accurateness of a count versus the number of writes necessary to the counter. In the concurrent case the number of writes is the number of CAS operations. For more contended counters, the probability of requiring a write during counter increment decreases.
The increment opreation of such counter is quite simple to implement:
Based on a random number, either an update of the stored value is performed or the stored value is left as it is. In case of CAS failure the operation is retried.
It would be very useful, to store such statistical counters as values in a concurrent hash map. However, in the current implementation, whenever
update
orinsertOrUpdate
is called, CAS will be executed. The UpdateFunction has no option to cancel the update.I believe only few changes would be required to make this work though. 1) UpdateFunction must return bool to indicate if the update shall be cancelled or not 2) In
TAtomic::execute
(for SimpleElement) andatomicUpdate
(for MarkableElement) check the return value of UpdateFunction and skip the CIS iffalse
is returned (still indicating that the update was successful).Would you suggest any other / better solution? Would you welcome a contribution to the library to implement this feature?
Unrelated: Your paper describes a strategy to support complex key and value types. Do you have any plans to support it for growt? I have a use-case for complex keys (strings) and if it is feasible without too many changes, would be happy to use / adapt growt for that.