agnesnatasya opened this issue 2 years ago
Adding Jongyul (@yulistic), who should be able to tell you about our Ceph/NFS configuration.
For Ceph,
For NFS,
For both,
echo 3 > /proc/sys/vm/drop_caches
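For reference, a minimal sketch of what the client side of such a setup could look like; the hostnames, export path, and mount options below are placeholders rather than our actual configuration, and the two file systems would be mounted one at a time:
sudo mount -t nfs nfs-server:/export /mnt/local_dir # NFS client mount (placeholder host/export)
sudo mount -t ceph ceph-mon:6789:/ /mnt/local_dir -o name=admin,secretfile=/etc/ceph/admin.secret # CephFS kernel client (placeholder monitor/credentials)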
Hi Jongyul,
Thank you very much for the details of the benchmark. Do you mind if I ask some clarifying questions about the experimental setup for measuring read latency and LevelDB performance specifically?
LevelDB Application Benchmark
I am running Assise using 3 different hot replicas. When I set it up with 3 replicas, Assise seems to get stuck at some point during replication. I tested this with the normal iotest and also with LevelDB. I run the LevelDB benchmark under the bench/leveldb/mlfs directory using the command ./run_bench.sh fillseq, and the program seems to hang. For your information, I am able to run it to completion if Assise uses 2 different hot replicas.
Read Microbenchmark
NFS and CephFS
HIT
fio --name=test --bs=<bs> --readwrite=write --size=1G --filename=/mnt/local_dir/bench_1.txt # write
fio --name=test --bs=<bs> --readwrite=read --size=1G --filename=/mnt/local_dir/bench_1.txt # immediately read
MISS
fio --name=test --bs=<bs> --readwrite=write --size=1G --filename=/mnt/local_dir/bench_1.txt # write
echo 3 > /proc/sys/vm/drop_caches
fio --name=test --bs=<bs> --readwrite=read --size=1G --filename=/mnt/local_dir/bench_1.txt # read after dropping caches
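One caveat on the MISS sequence: drop_caches can only evict clean pages, so dirty data presumably needs to be flushed first. A minimal sketch:
sync # write back dirty pages so they can be evicted
echo 3 > /proc/sys/vm/drop_caches # then drop the page cache before the MISS read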
Assise: Run on 2 hot replicas
HIT
Node 1
MLFS_DIGEST_TH=100 ./run.sh iobench_lat wr 1000M <BS> 1
MISS
Node 1
./run.sh iobench_lat sw 1000M <BS> 1 # write so that the file exists
Node 2
./run.sh iobench_lat sr 1000M <BS> 1 # read from another node
Is the above configuration correct? The results I got from these tests differ from the ones presented in the paper, and in most cases they do not reflect the expected hit/miss difference. The results that I have are as follows:
May I ask for help regarding these discrepancies in the benchmark results? Thank you!
Please compare your results with the raw latency of DRAM and of your IB or RoCE network. The miss latency should include the network-crossing overhead, which is much higher than a local DRAM access (the hit latency). Then you will be able to figure out which configuration (miss or hit) is incorrect.
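For example, the raw RDMA read latency could be measured with the standard perftest tools; a sketch, assuming perftest is installed and using a placeholder hostname with a 4 KB message size:
ib_read_lat -s 4096 # on the server node
ib_read_lat -s 4096 <server-hostname> # on the client node; reports the RDMA read latency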
The numbers were measured with the microbenchmarks in bench/micro. It would be better to use the same benchmarks to reproduce them.
Mentioning @wreda for Assise results.
I am running Assise using 3 different hot replicas. When I set it up with 3 replicas, Assise seems to get stuck at some point during replication. I tested this with the normal iotest and also with LevelDB. I run the LevelDB benchmark under the bench/leveldb/mlfs directory using the command ./run_bench.sh fillseq, and the program seems to hang. For your information, I am able to run it to completion if Assise uses 2 different hot replicas.
This could be a bug in the 3-replica configuration. Feel free to open another issue for this with error logs/stack trace, and I'll be happy to take a look.
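If it hangs again, attaching thread backtraces from the stuck process would help; for example (the PID placeholder is whichever benchmark process is hung):
gdb -p <pid_of_hung_process> -batch -ex "thread apply all bt" # dump backtraces of all threads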
Read Microbenchmark
Assise: Run on 2 hot replicas
HIT
Node 1
MLFS_DIGEST_TH=100 ./run.sh iobench_lat wr 1000M <BS> 1
MISS
Node 1
./run.sh iobench_lat sw 1000M <BS> 1 # write so that the file exists
Node 2
./run.sh iobench_lat sr 1000M <BS> 1 # read from another node.
Is the above configuration correct? The results I got from these tests differ from the ones presented in the paper, and in most cases they do not reflect the expected hit/miss difference. The results that I have are as follows:
- Assise: The result does not seem correct, as the HIT case appears to have a higher latency than the MISS case.
May I ask for help regarding these discrepancies in the benchmark results? Thank you!
The configuration for Assise looks fine. Note, however, that for larger IO sizes (> 4 KB) you might see worse performance in the HIT case, since it needs to do multiple hash table lookups. In any case, you can try running LibFS with profiling enabled for both HIT and MISS: MLFS_PROFILE=1 ./run.sh iobench_lat. This will provide a more fine-grained performance breakdown, which might help us pinpoint the cause of the discrepancy.
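For instance, for the MISS read on the second node this would look like (reusing the arguments from your commands above):
MLFS_PROFILE=1 ./run.sh iobench_lat sr 1000M <BS> 1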
EDIT: Upon further thought, the HIT performance is likely worse here because your file is not small enough to fit inside the log (assuming you're using the default log size of 1 GB). This causes the file to spill over into the other caches. To avoid this, either reduce your file size or increase the log size. For example, at 4 KB IO and a log size of 1 GB, your file should be ≤ 256 MB (to account for any metadata overheads).
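For instance, assuming the run script accepts sizes in this form, the HIT test at 4 KB IO could be rerun with a file that fits in the default log:
MLFS_DIGEST_TH=100 ./run.sh iobench_lat wr 256M 4K 1 # 256 MB file stays within the 1 GB log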
Hi,
I am interested in replicating the benchmark setup detailed in the Assise paper, and I would like to ask about some details of the NFS and CephFS configuration.
In the experimental configuration section, it is stated that
For Ceph,
For NFS,
For both,
Thank you very much for the kind help!