byzhang / leveldb

Automatically exported from code.google.com/p/leveldb
BSD 3-Clause "New" or "Revised" License

Tunable allowed_seeks - feature request #223

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The 'allowed_seeks' value is assigned file_size / 16KB, which is only appropriate 
in some specific circumstances.

Here are the assumptions in the code:
"
// We arrange to automatically compact this file after
// a certain number of seeks. Let's assume:
// (1) One seek costs 10ms
// (2) Writing or reading 1MB costs 10ms (100MB/s)
// (3) A compaction of 1MB does 25MB of IO:
...

"
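To make the arithmetic behind these three assumptions explicit, here is a minimal sketch. The struct and function names are hypothetical and are not taken from the leveldb source; only the three numbers come from the quoted comment.

```cpp
// Cost model from the quoted comment. SeekCostModel and
// BreakEvenBytesPerSeek are hypothetical names used for illustration only.
struct SeekCostModel {
  double seek_ms = 10.0;               // (1) one seek costs 10ms
  double io_ms_per_mb = 10.0;          // (2) reading or writing 1MB costs 10ms
  double compaction_io_factor = 25.0;  // (3) compacting 1MB does 25MB of IO
};

// Number of bytes of compaction whose IO cost equals the cost of one seek.
inline double BreakEvenBytesPerSeek(const SeekCostModel& m) {
  // Compacting 1MB costs io_ms_per_mb * compaction_io_factor milliseconds.
  const double compaction_ms_per_mb = m.io_ms_per_mb * m.compaction_io_factor;
  // One seek "buys" this fraction of a megabyte of compaction.
  return (m.seek_ms / compaction_ms_per_mb) * 1024.0 * 1024.0;
}
```

With the default numbers, compacting 1MB costs 250ms, i.e. as much as 25 seeks, so one seek is worth roughly 40KB of compaction. The 16KB divisor mentioned above is more conservative than that break-even point, and the whole result scales linearly with the assumed seek cost.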

About assumption (1) :

 A Get operation that consults several files does not necessarily mean that the seek on the first file actually hits the disk. Very likely the bloom filter tells us that the data we want is not in that file, and the filter data itself is likely to be in RAM as long as the table cache is big enough. On the other hand, even if the bloom filter returns a false positive, the file data is likely to be in the leveldb block cache or the system page cache. So a seek does not necessarily cost 10ms.

In some read-heavy workloads, people simply disable compaction triggered by 
reads, although this may degrade read performance.

My suggestion: how about a tunable allowed_seeks value that can be set 
according to the specific circumstances, based on measurement?
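One possible shape for such a knob, as a sketch only. The option struct, its field, and the "0 disables" convention are all assumptions made for illustration and are not part of leveldb::Options.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical option; leveldb itself does not expose this.
struct SeekCompactionOptions {
  // Bytes of file size per allowed seek. 16KB matches the behaviour
  // described in this issue. 0 is taken here (by assumption) to mean
  // "never trigger compaction because of seeks".
  uint64_t bytes_per_allowed_seek = 16 * 1024;
};

// Seek budget for a file of the given size under the given options.
inline int64_t ComputeAllowedSeeks(uint64_t file_size,
                                   const SeekCompactionOptions& opts) {
  if (opts.bytes_per_allowed_seek == 0) {
    // Effectively unlimited budget; also avoids dividing by zero.
    return std::numeric_limits<int64_t>::max();
  }
  return static_cast<int64_t>(file_size / opts.bytes_per_allowed_seek);
}
```

With the default, a 2MB file gets a budget of 128 seeks. A user who has measured cheap seeks could lower bytes_per_allowed_seek to raise the budget, which makes seek-triggered compaction less frequent.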

Thanks in advance.

Original issue reported on code.google.com by alghak@gmail.com on 14 Jan 2014 at 10:00