xxks-kkk closed this issue 6 years ago
Hi Zack,
PebblesDB does report total IO through its stats. You can add "stats" to the list of benchmarks (https://github.com/utsaslab/pebblesdb/blob/09c706d7aa2977c1316b2d64b81f1f1b6508002d/db/db_bench.cc#L66) to print the DB stats, or call db->GetProperty("leveldb.stats") if you are not running db_bench. The stats track the amount of data read and written at each level of the LSM, but I believe they are not persisted between runs: if you open an existing data store, the stats cover only the period since the DB was opened, not its entire history.
However, all the IO numbers reported in the paper were obtained with the iotop tool, which provides more accurate IO numbers that can easily be aggregated across runs and across the levels of the LSM.
Note that both methods give you only the total amount of read/write IO. You will have to calculate the write amplification manually, since it depends on the amount of user data inserted.
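To make the manual step concrete, here is a minimal Python sketch of that calculation: total bytes written to storage divided by bytes of user data inserted. The helper names are hypothetical, and the 16-byte key size is an assumption about the db_bench default, so check it against your build.

```python
def user_bytes(num_entries: int, value_size: int, key_size: int = 16) -> int:
    """Approximate user data inserted by a fill benchmark.

    key_size=16 is an assumed db_bench default, not a confirmed value.
    """
    return num_entries * (key_size + value_size)


def write_amplification(total_written_bytes: float, user_data_bytes: float) -> float:
    """Total bytes written to storage per byte of user data inserted."""
    if user_data_bytes <= 0:
        raise ValueError("user_data_bytes must be positive")
    return total_written_bytes / user_data_bytes
```

For example, with --num=1000000 and --value_size=1024, user_bytes(1000000, 1024) gives 1040000000 bytes under the key-size assumption above; the total written bytes would come from iotop or from the stats output.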
Thanks!
Hello Pandian,
Thanks for the info! I'll try it out and reopen the issue if I need further help! Thanks much!
Sure, anytime!
Hello Pandian,
I just want to confirm: there is no out-of-the-box script for calculating write amplification from iotop output? Also, I ran the command
./db_bench --benchmarks=fillrandom,readrandom,stats --num=1000000 --value_size=1024 --reads=500000 --db=/tmp/pebblesdbtest-1000
and the per-level read/write output of the LSM is as follows:
Level  Files  Size(MB)  Time(sec)  Read(MB)  Write(MB)
------------------------------------------------------
    0      0         0         13         0       1007
    1      0         0          6      1007        994
    2      1        64          5       994        976
    3      3       131          4       912        877
    4      8       419          3       747        722
    5      4       280          2       473        451
In this case, how do I calculate the write/read amplification? For level 3, is the write amplification 877/131 = 6.7?
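The per-level ratio asked about here (Write(MB) divided by Size(MB)) can be computed for every level at once. The following Python sketch parses the stats output shown above; parse_stats and per_level_ratio are hypothetical helper names, and the parser assumes exactly the six whitespace-separated columns in this output.

```python
STATS = """\
Level Files Size(MB) Time(sec) Read(MB) Write(MB)
--------------------------------------------------
0 0 0 13 0 1007
1 0 0 6 1007 994
2 1 64 5 994 976
3 3 131 4 912 877
4 8 419 3 747 722
5 4 280 2 473 451
"""


def parse_stats(text):
    """Return one dict per level; header and divider lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6 or not parts[0].isdigit():
            continue
        level, files, size_mb, time_s, read_mb, write_mb = parts
        rows.append({
            "level": int(level),
            "files": int(files),
            "size_mb": float(size_mb),
            "time_s": float(time_s),
            "read_mb": float(read_mb),
            "write_mb": float(write_mb),
        })
    return rows


def per_level_ratio(rows):
    """Write(MB) / Size(MB) per level; None where the level is empty."""
    return {
        r["level"]: (r["write_mb"] / r["size_mb"] if r["size_mb"] > 0 else None)
        for r in rows
    }


ratios = per_level_ratio(parse_stats(STATS))
print(round(ratios[3], 1))  # 877 / 131, rounded to one decimal
```

Levels 0 and 1 hold no data at the end of this run, so the ratio is undefined there and the sketch returns None for them.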
Thanks!
Hi Zack,
There is no out-of-the-box script for calculating write amplification with iotop, but it should be trivial to do. Start iotop in the background before the benchmark starts, redirect its output to a file, and kill it when the benchmark ends. From the log file, you can add up the total write IO and read IO to get the final metric. iotop has many options, for example to print only the processes that are actually doing IO, or to accumulate IO instead of showing bandwidth. We used iotop -btoqa | grep db_bench and calculated the total IO from the corresponding output file.
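As a rough illustration of the "add up the log" step, here is a Python sketch that totals the IO from such a log file. It is not the script used for the paper. It assumes each line has the layout "TIME TID PRIO USER <read> <unit> <write> <unit> ... COMMAND" with units B/K/M/G/T in powers of 1024; the exact layout can differ between iotop versions, so verify it against your own log. Because -a makes iotop print accumulated values, the sketch keeps only the last line seen for each thread ID and sums those.

```python
UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def total_io(log_lines):
    """Return (read_bytes, write_bytes) summed over all thread IDs.

    Assumed line layout (an assumption, check your iotop version):
    TIME TID PRIO USER READ UNIT WRITE UNIT SWAPIN % IO % COMMAND
    """
    last = {}  # tid -> (read_bytes, write_bytes) from the most recent line
    for line in log_lines:
        tok = line.split()
        if len(tok) < 8 or not tok[1].isdigit():
            continue  # not a per-thread line
        try:
            read = float(tok[4]) * UNITS[tok[5]]
            write = float(tok[6]) * UNITS[tok[7]]
        except (ValueError, KeyError):
            continue  # unexpected layout; skip rather than guess
        last[tok[1]] = (read, write)
    total_read = sum(r for r, _ in last.values())
    total_write = sum(w for _, w in last.values())
    return total_read, total_write
```

Usage would look like total_io(open("iotop.log")), where iotop.log is a hypothetical name for the redirected output. Dividing the write total by the amount of user data inserted then gives the write amplification.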
And yes, from your sample output, you are right about the write amplification. However, it might make more sense to compute the cumulative write amplification of the entire LSM rather than one value per level.
Thanks, Pandian
Hi Pandian,
So the cumulative write amplification in this case would be (1007 + 994 + 976 + 877 + 722 + 451) / (64 + 131 + 419 + 280)?
Thanks!
Yes, you are right.
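For reference, the arithmetic in that expression works out as follows. This short Python snippet uses only the numbers from the stats output earlier in the thread.

```python
# Write(MB) for levels 0-5 and Size(MB) for the non-empty levels 2-5,
# copied from the stats output above.
write_mb = [1007, 994, 976, 877, 722, 451]
size_mb = [64, 131, 419, 280]

total_write = sum(write_mb)  # 5027
total_size = sum(size_mb)    # 894
cumulative = total_write / total_size

print(f"{total_write} / {total_size} = {cumulative:.2f}")
```

So the cumulative figure for this run is 5027 / 894, or about 5.62.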
Thanks much!
I'm playing around with the source code, and I'm curious whether it supports write amplification calculation out of the box. If so, can you point me to the part that does the calculation?
Thanks!