isaac-io closed 2 years ago
@isaac-io & I have discussed and agreed on the following: We will add a new configuration parameter to db_bench. That parameter will allow the user to set the range of random keys to be used in benchmarks such as fillrandom and readrandom. The default of the new parameter will have the range equal to the number of keys, which is the current behaviour => no change of behaviour by default.
Note that currently db_bench simply divides the key space between threads evenly, so care should be taken to divide the range between the threads, rather than the number of keys for the benchmark as is done today.
EDIT: I seem to have confused db_bench and db_stress. db_bench doesn't need to track expected values, so it doesn't divide the key space between the threads as db_stress does.
Following a discussion with @isaac-io, it seems db_bench already has 2 existing parameters that users may use to achieve the same purpose: 'reads' / 'writes'. These parameters, when specified, control the number of keys (when not specified, the number of keys is set by the 'num' parameter). So, a user may specify both 'num' and 'reads' / 'writes'. 'num' will then control the range of keys, and 'reads' / 'writes' will control how many keys are read or written.
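To make the interaction concrete, here is a simplified Python model of the behaviour described above. It is only a sketch, not db_bench's actual code: the function name plan_readrandom and its structure are hypothetical, and only the parameter names 'num' and 'reads' come from this discussion.

```python
import random
from typing import List, Optional


def plan_readrandom(num: int, reads: Optional[int] = None, seed: int = 0) -> List[int]:
    """Return the keys a readrandom-style workload would look up (simplified model)."""
    # When 'reads' is not specified, the operation count falls back to 'num'.
    op_count = num if reads is None else reads
    rng = random.Random(seed)
    # 'num' bounds the key range: every key is drawn from [0, num).
    return [rng.randrange(num) for _ in range(op_count)]
```

For example, plan_readrandom(num=1000, reads=50) yields 50 lookups that all fall within [0, 1000), while plan_readrandom(num=1000) yields 1000 lookups over the same range.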
Can we close this issue then? Should we run the paired bloom filter benchmark with these settings in order to ensure that it works before we close?
@erez-speedb - Could you please try to use these parameters and see if indeed these parameters enable us to get what we want?
With num=$(($rows * 10000))
readrandom : 2.350 micros/op 1702456 ops/sec; 0.0 MB/s (1259 of 19126999 found)
With reads=$(($rows * 10000))
readrandom : 8.687 micros/op 460468 ops/sec; 75.5 MB/s (3279451 of 5186999 found)
@udi-speedb using the "-reads" flag is good enough, and the test was updated accordingly. Please consider reverting the db_bench change.
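As a rough sanity check on the two found counts above, the expected hit fraction can be estimated. The sketch below rests on assumptions that are not stated in this thread: that the database was filled by writing uniformly random keys, with replacement, from [0, fill_range), and that reads are uniform over [0, read_range). The function name is hypothetical.

```python
def expected_hit_fraction(written: int, fill_range: int, read_range: int) -> float:
    """Estimate the fraction of uniform random reads that find a key (assumed model)."""
    # Chance that any given key in the fill range was written at least once.
    distinct_fraction = 1.0 - (1.0 - 1.0 / fill_range) ** written
    # Reads that land outside the fill range can never be found.
    covered = min(fill_range, read_range) / read_range
    return distinct_fraction * covered
```

Under these assumptions, reading over the same range that was filled gives about 1 - 1/e, roughly 0.632, which is in line with 3279451 of 5186999 found. Widening the read range by a factor of 10000 divides that by 10000, which is the same order of magnitude as 1259 of 19126999 found (about 6.6e-5).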
Verified as working with the existing parameters.
Currently db_bench doesn't allow controlling the range of the keys that are read during a read workload, so for the new paired bloom filter (#29) the workload bypasses the filter completely when the keys aren't in the range of the data in the database.

Add an option to restrict key generation so that all of the keys are generated within that range during a read workload. That way the filter paths will be hit and we will be able to measure the impact of the changes in a real-world scenario.