DKU-StarLab / leveldb-study

LevelDB Analysis, Backgrounds, Practice and Tuning
https://sslab.dankook.ac.kr/leveldb-wiki/

Week3 Homework #6

Closed by min-guk 2 years ago

min-guk commented 2 years ago

Please submit through the Google Form by next Monday, 7/18, 12 PM.

1. Why do the LSM-tree and LevelDB use a leveled structure?

Hint 1 - Stackoverflow

Hint 2 - Memory hierarchy

Hint 3 - Patrick O'Neil, The Log-Structured Merge-Tree (LSM-Tree), 1996

2. In LevelDB, the max size of level i is 10^i MB, but the max size of level 0 is 8 MB. Why?

Hint 1 - leveldb source code

Hint 2 - leveldb-handbook, Compaction (Use google chrome translator)
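To make the 10^i MB rule from question 2 concrete, here is a minimal Python sketch that tabulates the limit for levels 1 and up. It only restates the rule quoted in the question and is not LevelDB's own code. The helper name `max_level_bytes` is hypothetical, and treating "MB" as 2^20 bytes is an assumption. Level 0 is deliberately left out, since explaining it is the point of the question.

```python
MIB = 1024 * 1024  # assumption: "MB" in the question means 2**20 bytes


def max_level_bytes(level: int) -> int:
    """Size limit of level i >= 1 under the 10^i MB rule quoted in question 2.

    Hypothetical helper for illustration; it is not part of LevelDB.
    """
    if level < 1:
        raise ValueError("level 0 does not follow the 10^i MB rule (see question 2)")
    return (10 ** level) * MIB


if __name__ == "__main__":
    # Each level is allowed to hold ten times as much data as the previous one.
    for level in range(1, 7):
        print(f"L{level}: {max_level_bytes(level) // MIB:>7} MB")
```

Comparing the printed table against the limits you find in the leveldb source code (Hint 1) is a quick way to check your reading of the code.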

3. Practice 1

[A] $ ./db_bench --benchmarks="fillseq" 
[B] $ ./db_bench --benchmarks="fillrandom"

Q1. Compare the throughput, latency, and stats of the two benchmarks and explain the differences. Hint - Seek Time, Key Range, Compaction

Q2. In benchmark A, SSTs are not written in L0. Why? Hint - Flush, Compaction Trigger

Q3. Calculate SAF (Space Amplification Factor) for each benchmark. Hint - db_bench meta operation
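For Q3, the arithmetic can be sketched in Python as follows. This assumes the common definition of space amplification (bytes occupied on disk divided by bytes of logical user data). The function names are hypothetical. The key size of 16 B and the value size and entry count of 100 B and 1,000,000 are taken from the Practice 3 commands and notes and are assumed to match your run. The on-disk figure below is a placeholder, not a measured result; replace it with the total you obtain from the db_bench stats output.

```python
def logical_bytes(num_entries: int, key_size: int = 16, value_size: int = 100) -> int:
    """Bytes of user data written: one key plus one value per entry."""
    return num_entries * (key_size + value_size)


def space_amplification(bytes_on_disk: int, user_bytes: int) -> float:
    """SAF = space used on disk / size of the logical user data."""
    if user_bytes <= 0:
        raise ValueError("user_bytes must be positive")
    return bytes_on_disk / user_bytes


if __name__ == "__main__":
    user = logical_bytes(1_000_000)       # 116,000,000 bytes of keys and values
    on_disk = 150_000_000                 # PLACEHOLDER: use your measured SST total
    print(f"SAF = {space_amplification(on_disk, user):.2f}")
```

Run the calculation once per benchmark, since fillseq and fillrandom leave different amounts of data on disk.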

4. Practice 2

[Load] $ ./db_bench --benchmarks="fillrandom" --use_existing_db=0

[A] $ ./db_bench --benchmarks="readseq" --use_existing_db=1
[B] $ ./db_bench --benchmarks="readrandom" --use_existing_db=1
[C] $ ./db_bench --benchmarks="seekrandom" --use_existing_db=1

Note - Before running A, B, and C, run the [Load] benchmark.

Q1. Which user key-value interface does each benchmark use? (Put, Get, Iterator, ...)

Hint 1 - leveldb/doc/index.md

Hint 2 - leveldb/benchmarks/db_bench.cc

Q2. Compare throughput and latency of each benchmark and explain why. Hint - Seek Time

5. Practice 3

[A] $ ./db_bench --benchmarks="fillrandom" --value_size=100 --num=1000000 --compression_ratio=1
[B] $ ./db_bench --benchmarks="fillrandom" --value_size=1000 --num=114173 --compression_ratio=1

Note 1. key_size = 16B

Note 2. Both benchmarks write the same total size of kv pairs.

Note 3. Number of entries in B = 114173 ≈ (16+100)/(16+1000) * 1000000
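The entry count in Note 3 can be checked with a few lines of Python. This is only a verification of the arithmetic in the notes; the variable names are illustrative.

```python
KEY_SIZE = 16  # bytes, from Note 1


def entry_bytes(value_size: int) -> int:
    """Size of one kv pair: key plus value."""
    return KEY_SIZE + value_size


# Benchmark A: 1,000,000 entries with 100-byte values.
total_a = 1_000_000 * entry_bytes(100)   # 116,000,000 bytes

# Benchmark B: as many 1000-byte-value entries as fit in the same total.
num_b = total_a // entry_bytes(1000)     # 114,173 entries
total_b = num_b * entry_bytes(1000)

if __name__ == "__main__":
    print(f"A writes {total_a} bytes, B writes {total_b} bytes with {num_b} entries")
    print(f"difference: {total_a - total_b} bytes")
```

The two totals differ by only a few hundred bytes because the entry count is rounded down to a whole number.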

Q. The total size of the input kv pairs is the same, but one benchmark is better in throughput and the other is better in latency. Explain why. Hint - Batch Processing
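When comparing the two runs, it helps to keep in mind how per-operation latency and data throughput relate. The Python sketch below converts a latency in micros/op into MB/s for a given entry size. It assumes a single thread, assumes each operation transfers exactly one key plus one value, and assumes MB means 2^20 bytes; the function name is hypothetical. The latency used in the demo is a placeholder, not a measurement, so substitute the values your own runs report.

```python
MIB = 1024 * 1024  # assumption: MB is counted as 2**20 bytes


def throughput_mb_per_s(micros_per_op: float, bytes_per_op: int) -> float:
    """Convert a per-operation latency into data throughput.

    Assumes one thread and that every operation moves bytes_per_op bytes.
    """
    if micros_per_op <= 0:
        raise ValueError("micros_per_op must be positive")
    ops_per_second = 1_000_000 / micros_per_op
    return ops_per_second * bytes_per_op / MIB


if __name__ == "__main__":
    placeholder_latency = 10.0  # micros/op, PLACEHOLDER value for both cases
    for name, value_size in (("A", 100), ("B", 1000)):
        mbps = throughput_mb_per_s(placeholder_latency, 16 + value_size)
        print(f"{name}: {placeholder_latency} micros/op -> {mbps:.1f} MB/s")
```

Because throughput depends on both the latency and the bytes moved per operation, the two metrics can rank the benchmarks differently.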

min-guk commented 2 years ago

Of course, it is okay to answer in Korean.

min-guk commented 2 years ago

When the leveldb build is done, please check the installation with $ ./db_bench, not $ db_bench.

Do not install the RocksDB db_bench with sudo apt install rocksdb-tools.

min-guk commented 2 years ago

More hints for the homework have been added.

min-guk commented 2 years ago

When studying leveldb code such as leveldb/benchmarks/db_bench.cc, please use the VS Code "Go to Definition" (F12) and "Go to References" (Shift+F12) features.

min-guk commented 2 years ago

The hints for question 2 have been updated.

min-guk commented 2 years ago

There was a mistake in question 5, and it has now been corrected. Please check again.

min-guk commented 2 years ago

[Deadline Extension]

The deadline has been extended until 7/18, 12 PM. It's okay if you can't answer all the questions; please submit within the extended deadline.

min-guk commented 2 years ago

Great work everyone!

Homework solutions have been uploaded. Individual solutions will be presented today so that everyone can have a reference on how the others addressed the homework.

You can also check how other students answered the questions.

Feedback for your submitted solution will be given starting tomorrow. See you later!