zk1931 / jzab

ZooKeeper Atomic Broadcast in Java
http://zk1931.github.io/jzab/master/
Apache License 2.0
54 stars 23 forks source link

Add RollingLog implementation. #133

Closed EasonLiao closed 10 years ago

EasonLiao commented 10 years ago

@m1ch1 Addressed all the problems except the trim implementation(But I added the interface). It's not hard but I'll add it in separate commit since I think how we implement it depends how we handle snapshot.

Also I added the RollingLog to SyncProposalProcessorTest and test it on my vintage laptop :

For Simple log :

22 ops/sec (txnSize : 1024, txnCount : 1000, batchsize : 1)
19769 ops/sec (txnSize : 128, txnCount : 100000, batchsize : 1000)
10341 ops/sec (txnSize : 1024, txnCount : 100000, batchsize : 1000)

For RollingLog with rolling size 128MB (Actually it's just one log file, so it's similar with simple log):

27 ops/sec (txnSize : 1024, txnCount : 1000, batchsize : 1)
19027 ops/sec (txnSize : 128, txnCount : 100000, batchsize : 1000)
10343 ops/sec (txnSize : 1024, txnCount : 100000, batchsize : 1000)

For RollingLog with rolling size 10MB (After test, there're 12 log files for RollingLog)

27 ops/sec (txnSize : 1024, txnCount : 1000, batchsize : 1)
18318 ops/sec (txnSize : 128, txnCount : 100000, batchsize : 1000)
8412 ops/sec (txnSize : 1024, txnCount : 100000, batchsize : 1000)

The performance degrades a little bit.

ghost commented 10 years ago

thank you for running the experiment. the overhead is higher than expected at 10mb rolling size. i'm curious where the latency is coming from.

EasonLiao commented 10 years ago

Yesterday's result might not be correct since we use same log directory for each benchmark (txnSize, txnCount, batchsize). Here is the new result :

2014-08-18 12:07:12,808 INFO  main org.apache.zab.TestBase:52 [||] STARTING: testPerformance[0](org.apache.zab.SyncProposalProcessorTest)
2014-08-18 12:07:12,815 INFO  main ROOT:86 [||] Starting the test for org.apache.zab.SimpleLog
2014-08-18 12:07:12,815 INFO  main ROOT:87 [||] epoch       = 0
2014-08-18 12:07:12,816 INFO  main ROOT:88 [||] txn size    = 1024
2014-08-18 12:07:12,816 INFO  main ROOT:89 [||] num_ops     = 1000
2014-08-18 12:07:12,816 INFO  main ROOT:90 [||] batch_size  = 1
2014-08-18 12:07:13,028 INFO  main o.a.zab.transport.NettyTransport:157 [||] Server started: messi-ThinkPad-X220:60000
2014-08-18 12:07:49,509 INFO  main ROOT:144 [||] runtime     = 36.311368169 seconds
2014-08-18 12:07:49,509 INFO  main ROOT:145 [||] throughput  = 27 ops/sec
2014-08-18 12:07:49,510 INFO  main ROOT:146 [||] throughput  = 0.026894125703413124 MB/sec
2014-08-18 12:07:49,510 INFO  main ROOT:148 [||] ack count   = 1000
2014-08-18 12:07:49,516 INFO  main ROOT:86 [||] Starting the test for org.apache.zab.SimpleLog
2014-08-18 12:07:49,517 INFO  main ROOT:87 [||] epoch       = 1
2014-08-18 12:07:49,517 INFO  main ROOT:88 [||] txn size    = 128
2014-08-18 12:07:49,518 INFO  main ROOT:89 [||] num_ops     = 100000
2014-08-18 12:07:49,518 INFO  main ROOT:90 [||] batch_size  = 1000
2014-08-18 12:07:49,524 INFO  main o.a.zab.transport.NettyTransport:157 [||] Server started: messi-ThinkPad-X220:60001
2014-08-18 12:07:54,693 INFO  main ROOT:144 [||] runtime     = 5.116522049 seconds
2014-08-18 12:07:54,694 INFO  main ROOT:145 [||] throughput  = 19544 ops/sec
2014-08-18 12:07:54,695 INFO  main ROOT:146 [||] throughput  = 2.385806439041107 MB/sec
2014-08-18 12:07:54,696 INFO  main ROOT:148 [||] ack count   = 101
2014-08-18 12:07:54,702 INFO  main ROOT:86 [||] Starting the test for org.apache.zab.SimpleLog
2014-08-18 12:07:54,703 INFO  main ROOT:87 [||] epoch       = 2
2014-08-18 12:07:54,703 INFO  main ROOT:88 [||] txn size    = 1024
2014-08-18 12:07:54,704 INFO  main ROOT:89 [||] num_ops     = 100000
2014-08-18 12:07:54,704 INFO  main ROOT:90 [||] batch_size  = 1000
2014-08-18 12:07:54,712 INFO  main o.a.zab.transport.NettyTransport:157 [||] Server started: messi-ThinkPad-X220:60002
2014-08-18 12:08:04,658 INFO  main ROOT:144 [||] runtime     = 9.886404693 seconds
2014-08-18 12:08:04,659 INFO  main ROOT:145 [||] throughput  = 10114 ops/sec
2014-08-18 12:08:04,660 INFO  main ROOT:146 [||] throughput  = 9.877832542010427 MB/sec
2014-08-18 12:08:04,660 INFO  main ROOT:148 [||] ack count   = 100
2014-08-18 12:08:04,672 INFO  main org.apache.zab.TestBase:68 [||] SUCCEEDED: testPerformance[0](org.apache.zab.SyncProposalProcessorTest)

2014-08-18 12:08:04,677 INFO  main org.apache.zab.TestBase:52 [||] STARTING: testPerformance[1](org.apache.zab.SyncProposalProcessorTest)
2014-08-18 12:08:04,680 INFO  main ROOT:86 [||] Starting the test for org.apache.zab.RollingLog
2014-08-18 12:08:04,680 INFO  main ROOT:87 [||] epoch       = 0
2014-08-18 12:08:04,681 INFO  main ROOT:88 [||] txn size    = 1024
2014-08-18 12:08:04,681 INFO  main ROOT:89 [||] num_ops     = 1000
2014-08-18 12:08:04,681 INFO  main ROOT:90 [||] batch_size  = 1
2014-08-18 12:08:04,688 INFO  main o.a.zab.transport.NettyTransport:157 [||] Server started: messi-ThinkPad-X220:60003
2014-08-18 12:08:41,819 INFO  main ROOT:144 [||] runtime     = 37.08155812 seconds
2014-08-18 12:08:41,819 INFO  main ROOT:145 [||] throughput  = 26 ops/sec
2014-08-18 12:08:41,819 INFO  main ROOT:146 [||] throughput  = 0.026335530369024314 MB/sec
2014-08-18 12:08:41,819 INFO  main ROOT:148 [||] ack count   = 1000
2014-08-18 12:08:41,837 INFO  main ROOT:86 [||] Starting the test for org.apache.zab.RollingLog
2014-08-18 12:08:41,838 INFO  main ROOT:87 [||] epoch       = 1
2014-08-18 12:08:41,838 INFO  main ROOT:88 [||] txn size    = 128
2014-08-18 12:08:41,838 INFO  main ROOT:89 [||] num_ops     = 100000
2014-08-18 12:08:41,839 INFO  main ROOT:90 [||] batch_size  = 1000
2014-08-18 12:08:41,844 INFO  main o.a.zab.transport.NettyTransport:157 [||] Server started: messi-ThinkPad-X220:60004
2014-08-18 12:08:47,324 INFO  main ROOT:144 [||] runtime     = 5.437497829 seconds
2014-08-18 12:08:47,325 INFO  main ROOT:145 [||] throughput  = 18390 ops/sec
2014-08-18 12:08:47,325 INFO  main ROOT:146 [||] throughput  = 2.2449721607052067 MB/sec
2014-08-18 12:08:47,325 INFO  main ROOT:148 [||] ack count   = 100
2014-08-18 12:08:47,331 INFO  main ROOT:86 [||] Starting the test for org.apache.zab.RollingLog
2014-08-18 12:08:47,331 INFO  main ROOT:87 [||] epoch       = 2
2014-08-18 12:08:47,332 INFO  main ROOT:88 [||] txn size    = 1024
2014-08-18 12:08:47,333 INFO  main ROOT:89 [||] num_ops     = 100000
2014-08-18 12:08:47,333 INFO  main ROOT:90 [||] batch_size  = 1000
2014-08-18 12:08:47,341 INFO  main o.a.zab.transport.NettyTransport:157 [||] Server started: messi-ThinkPad-X220:60005
2014-08-18 12:08:58,061 INFO  main ROOT:144 [||] runtime     = 10.668699816 seconds
2014-08-18 12:08:58,062 INFO  main ROOT:145 [||] throughput  = 9373 ops/sec
2014-08-18 12:08:58,063 INFO  main ROOT:146 [||] throughput  = 9.153528703989172 MB/sec
2014-08-18 12:08:58,063 INFO  main ROOT:148 [||] ack count   = 100
2014-08-18 12:08:58,080 INFO  main org.apache.zab.TestBase:68 [||] SUCCEEDED: testPerformance[1](org.apache.zab.SyncProposalProcessorTest)

I need to correct what I said before. For txnSize is 128, and rolling size equals 10MB, there're 2 log files. For txnSize is 1024, there're 10 log files.

I think the latency is from rolling the log, every time we roll the log is equivalent to do a fsync, or more expensive.


EDIT rolling size equals 10MB