brianfrankcooper / YCSB

Yahoo! Cloud Serving Benchmark
Apache License 2.0

understanding hdrhistogram results #1059

Open karthikrish9 opened 7 years ago

karthikrish9 commented 7 years ago

Hi all,

I want to measure the latency of Memcached with a working set larger than the amount of RAM I have (forcing the system to use swap).

I would like the results in HdrHistogram format. Here is what I have done:

I configured the Memcached server to use 1 GB of RAM (the total amount of RAM I have):

memcached -d -m 1024 -l 127.0.0.1 -p 11211

Then I ran the YCSB client to load data (1 GB of data):

./bin/ycsb load memcached -s -P workloads/workloada -p memcached.hosts=127.0.0.1 -p basicdb.verbose=false -p basicdb.simulatedelay=4 -p measurement.interval=both -p measurementtype=hdrhistogram -p hdrhistogram.fileoutput=true -p maxexecutiontime=600 -p recordcount=400000

400000 inserts fill up the 1 GB of memory, forcing the system to use swap. Is this the right way to specify the working set?
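As a rough sanity check on the working-set size, the raw field payload can be estimated from the record count. This is only a sketch: it assumes workloada keeps the CoreWorkload defaults of fieldcount=10 and fieldlength=100 (neither is overridden in the load command above), and the function name is made up for illustration. It ignores keys, serialization, and Memcached's per-item overhead, so the real footprint will be larger.

```python
# Assumed CoreWorkload defaults; they are not stated in this thread.
FIELD_COUNT = 10
FIELD_LENGTH_BYTES = 100


def raw_payload_bytes(record_count, field_count=FIELD_COUNT,
                      field_length=FIELD_LENGTH_BYTES):
    """Bytes of raw field data only (no keys, no serialization overhead)."""
    return record_count * field_count * field_length


payload = raw_payload_bytes(400_000)
print(f"{payload / 2**20:.0f} MiB of raw field data")
```

If the estimate comes out well below the memory limit, raising recordcount (or setting fieldlength explicitly) is one way to grow the working set.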

Then I ran the following to measure latency and throughput in this memory-constrained environment:

./bin/ycsb run memcached -s -P workloads/workloada -p memcached.hosts=127.0.0.1 -p basicdb.verbose=false -p basicdb.simulatedelay=4 -p measurement.interval=both -p measurementtype=hdrhistogram -p hdrhistogram.fileoutput=true -p maxexecutiontime=900 -p operationcount=800000

It generated the following six files:

  - Intended-READ.hdr     
  - READ.hdr
  - INSERT.hdr   
  - Intended-INSERT.hdr  
  - Intended-UPDATE.hdr  
  - UPDATE.hdr

I don't understand the difference between Intended-READ.hdr and READ.hdr. Both give me the same graph.

This is how I generated the graph of read latency:

./HistogramLogProcessor -i ../READ.hdr -o read
./HistogramLogProcessor -i ../Intended-READ.hdr -o Iread

Then I used http://hdrhistogram.github.io/HdrHistogram/plotFiles.html to plot both iread.hdr and read.hdr, and both show the same graph below. Is there anything wrong with what I am doing?

[image: histogram]

c15yi commented 6 years ago

I think the intended measurements should also include the time spent in the workload preparing the DB operation.

See Measurements::setIntendedStartTimeNs and its usage. Also see Measurements::getIntendedtartTimeNs and its usage in the DBWrapper class.

Since the set method is not used anywhere other than in Client.ClientThread::run, it does not measure anything.

Of course, I could also be understanding that code completely wrong, and it does measure something.
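To make the distinction concrete, here is a minimal simulation of the two measurements. It is not YCSB code: the `simulate` function and its virtual clock are made up for illustration, and it assumes the intended start time only differs from the actual start time when the client is throttled to a fixed rate. Measured latency runs from the actual start to the end of an operation; intended latency runs from the scheduled start to the end.

```python
def simulate(service_times_ms, interval_ms=None):
    """Return (measured, intended) latencies in ms for one client thread.

    interval_ms=None models an unthrottled client: each operation is
    "intended" to start when the previous one finishes, so the two
    latencies are identical. Otherwise operation i is scheduled to
    start at i * interval_ms on a virtual clock.
    """
    clock = 0.0
    measured, intended = [], []
    for i, service in enumerate(service_times_ms):
        if interval_ms is None:
            intended_start = clock
        else:
            intended_start = i * interval_ms
        # An operation cannot start before the previous one has finished.
        actual_start = max(clock, intended_start)
        end = actual_start + service
        measured.append(end - actual_start)
        intended.append(end - intended_start)
        clock = end
    return measured, intended


# One slow operation (50 ms) among fast ones, scheduled every 10 ms.
print(simulate([1, 50, 1, 1], interval_ms=10))
# The same operations with no schedule at all.
print(simulate([1, 50, 1, 1]))
```

In the throttled run, the operations queued behind the slow one show a higher intended latency than measured latency. In the unthrottled run the two lists are equal, which would be consistent with Intended-READ.hdr and READ.hdr producing the same graph when no target rate is set.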

busbey commented 6 years ago

got time to help @nitsanw?