Closed hooohu closed 9 years ago
I don't think there is much benefit from having your program act as a client/server---for now your goal should just be to measure the maximum performance of the lossy counter. If you have data being sent back and forth, that overhead will be included, so it will be difficult to tell the true performance. So you can just remove all of the socket related stuff and just have a single program that generates keys and inserts them into its own counter.
(I added a few comments to the code but then removed them once I realized what you were doing--if you got emails about them you can ignore them)
The lossy counting algorithm is tested in my own computer for validating the program.
The test results are highly related to the alpha parameter of Zipf. It shows that when the distribution become uniform, the algorithm become slow. And when the alpha become higher, the algorithm runs much faster.
In the typical setting of alpha = 0.5, the counting speed in my computer is around 70000 counts per second.
Suppose we use a 10Gbps network interface, the maximum data is 1.25 Gbytes/s. If each packet includes 1250 bytes, we may have 1000000 keys to count in one second. If my program is right, it may be hard for a single machine to handle that counting task.
LC_Output might be a bug. In if branch, lose the bracket. If the signal handler does the LC_Output, what will the problem be(In countermerge() middle, the result is totally wrong, or just not accurate?)? It is better that also makes s, error as the command line parameter.
@hantasm - before I merge this I'd like you to make a few small changes.
First, add an option so that instead of having a timer interrupt that determines how long it runs for, you can also have the user enter a number of requests to perform. Then they could run something like ./lossy_test -n 10000000 -n 1000 -p 0.01 -a 1.5 -r 1
to have it run 10 million requests and then stop, rather than running for a set time interval.
Next, make it so that if you use -r=0
, it will only calculate and print out the counter list once at the very end of the program before quitting (if -v is also specified).
Otherwise this looks good!
This program is modified for running specified loops. The loop parameter is -l [loops].
Sample Output:
./lossy_test -l 1000000 -n 1000 -p 0.01 -a 0.5 -r 0 -v
Item Count
1 6170
2 1172
3 8
8 3
16 2
32 1
49 1
56 1
332 1
344 1
Lossy speed tester parameters:
Lossy phi : 0.010000
zipf N : 1000
zipf alpha : 0.500000
running time : 13.042656
output period : 0
total count: 1000000
counts per sec: 76671.498862
excellent!
I'm working on this issue.
Issue #47