Open · andrii0lomakin opened 7 years ago
It's not clear to me what you mean by "cluster records". Do you mean cluster in the OrientDB sense, so that the entire content of hot clusters goes on the same pages?
Or do you rather mean a cluster of records, i.e. a group of records that are correlated and frequently accessed together?
Hi @lvca, thank you for reading. By cluster records, I mean a group of records. I will remove "cluster" from the title; I suppose it will make the title more comprehensible ))
@smolinari :-)
Reference:
https://github.com/orientechnologies/orientdb-labs/blob/master/OEP_15.md
Summary:
Reallocate hot records to the same pages with the help of the Stream-Summary algorithm
Goals:
Minimize read and write amplification by keeping all hot records together inside disk pages
Non-Goals:
None
Success metrics:
Improved speed of benchmarks that are based on skewed data distributions, such as the Zipfian distribution.
Motivation:
During profiling of YCSB tests, we found that:
Data cluster pages are poorly cached because cold and hot records are spread randomly across pages; as a result, disk cache efficiency suffers.
We may also look at this from another point of view. When hot records are mixed with cold records on the same page, we read and write much more data in the same time interval, but the speed of the disk is limited.
Description:
We may use the Stream-Summary algorithm to calculate the top S clusters used in storage and, for those clusters, to keep the top K records accessed by users during read/update operations.
Stream-Summary has an interesting property: this algorithm not only calculates the top-K items but also provides a bound on the error of those calculations.
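For illustration, here is a minimal Java sketch of the Space-Saving variant of Stream-Summary. The class name and the linear-scan eviction are simplifications for brevity; the real algorithm keeps counters in a bucket list so the minimum is found in O(1):

```java
import java.util.HashMap;
import java.util.Map;

/** Space-Saving sketch: tracks at most `capacity` items with per-item error bounds. */
public final class StreamSummary<T> {
  private static final class Counter {
    long count;
    long error; // upper bound on how much `count` may overestimate the real frequency
  }

  private final int capacity;
  private final Map<T, Counter> counters = new HashMap<>();

  public StreamSummary(int capacity) {
    this.capacity = capacity;
  }

  public void record(T item) {
    Counter c = counters.get(item);
    if (c != null) {
      c.count++;
      return;
    }
    if (counters.size() < capacity) {
      c = new Counter();
      c.count = 1;
      counters.put(item, c); // error stays 0: this count is exact
      return;
    }
    // Capacity reached: evict the minimal counter and reuse its count
    // as the error bound of the newcomer (linear scan kept for brevity).
    T minKey = null;
    long minCount = Long.MAX_VALUE;
    for (Map.Entry<T, Counter> e : counters.entrySet()) {
      if (e.getValue().count < minCount) {
        minCount = e.getValue().count;
        minKey = e.getKey();
      }
    }
    counters.remove(minKey);
    c = new Counter();
    c.count = minCount + 1; // newcomer may have occurred up to minCount times before
    c.error = minCount;
    counters.put(item, c);
  }

  /** Guaranteed lower bound of the real frequency of `item`. */
  public long guaranteedCount(T item) {
    Counter c = counters.get(item);
    return c == null ? 0 : c.count - c.error;
  }
}
```

The per-item error bound is exactly the property mentioned above: `count - error` gives a guaranteed minimum frequency, so we can tell which items are certainly in the top K and which are uncertain.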
Once we calculate the top S clusters with an acceptable error, we start to calculate the top records accessed in those clusters. The item associated with each record will contain the following information:
Also, we are going to maintain information about the set of pages which contain the top-K records.
Besides page indexes, the following information is supposed to be held in each item of that set:
The data structures mentioned above will be updated during read/create/update/delete operations.
We will update the following information:
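The exact fields are specified in the OEP linked above. As a purely illustrative sketch of how such two-level tracking could be wired into the access paths (all names here, `HotRecordTracker`, `onRecordAccess`, and so on, are hypothetical, and the sketch reuses the `StreamSummary` class from the previous snippet):

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-level tracker: top-S clusters, then top-K records per cluster. */
public final class HotRecordTracker {
  private final StreamSummary<Integer> hotClusters;            // top-S clusters
  private final Map<Integer, StreamSummary<Long>> hotRecords = // per-cluster top-K positions
      new HashMap<>();
  private final int recordsPerCluster;

  public HotRecordTracker(int topClusters, int recordsPerCluster) {
    this.hotClusters = new StreamSummary<>(topClusters);
    this.recordsPerCluster = recordsPerCluster;
  }

  /** Hook to be called from the read/create/update/delete paths. */
  public void onRecordAccess(int clusterId, long clusterPosition) {
    hotClusters.record(clusterId);
    // For brevity this tracks records for every accessed cluster; the proposal
    // tracks them only for the clusters currently in the top S.
    hotRecords
        .computeIfAbsent(clusterId, id -> new StreamSummary<>(recordsPerCluster))
        .record(clusterPosition);
  }
}
```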
Once we identify the top-K records with an acceptable error, we may relocate them so that they are all placed together on disk pages.
The following approaches are proposed for record reallocation.
When a record is updated/read:
Because the distribution of hot records may change dynamically, there is a risk that in such a case we only introduce write overhead. To mitigate this risk, the following limits will be applied:
On the one hand, all workloads with a stable or slowly changing data distribution will get a noticeable improvement in latency and throughput numbers. On the other hand, workloads which change the distribution of their data very quickly will not suffer from frequent record reallocations.
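The concrete limits are listed in the OEP. As an assumed example of the kind of cap that would bound write overhead when the hot set drifts quickly (the window length and quota below are invented for illustration, not taken from the proposal):

```java
/** Illustrative rate limiter for hot-record relocations. */
final class RelocationPolicy {
  private final long maxRelocationsPerWindow; // assumed limit, not from the OEP
  private long windowStartNanos = System.nanoTime();
  private long relocationsInWindow;

  RelocationPolicy(long maxRelocationsPerWindow) {
    this.maxRelocationsPerWindow = maxRelocationsPerWindow;
  }

  /** Returns true if a hot record may be moved to a hot page right now. */
  synchronized boolean tryRelocate() {
    long now = System.nanoTime();
    if (now - windowStartNanos > 1_000_000_000L) { // 1-second window, assumed
      windowStartNanos = now;
      relocationsInWindow = 0;
    }
    if (relocationsInWindow >= maxRelocationsPerWindow) {
      return false; // hot set is shifting too fast; skip the move this time
    }
    relocationsInWindow++;
    return true;
  }
}
```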
As for the actual values of top-K records and top-S clusters: I propose to track the top 100 clusters and the top 10,000 records.
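For scale, a rough back-of-the-envelope estimate (the ~48 bytes per item is an assumption covering a record identifier, a counter, an error bound, and JVM object overhead): 10,000 record items × 48 B ≈ 480 KB, plus 100 cluster items × 48 B ≈ 5 KB, so the tracking structures should stay well under a megabyte of heap.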
What does this give us? Obviously, all benchmarks (they use a static distribution law) will benefit from such a strategy. But marketing reasons aside, I suppose there are many applications whose data distribution law is quite stable, and they may also benefit from the proposed approach.
Alternatives:
None
Risks and assumptions:
Impact matrix