gwAdvNet2015 / adv-net-samples

Useful C code for networking, data structures, and more
MIT License
7 stars 19 forks source link

Issue #47 lossy counting #53

Closed hooohu closed 9 years ago

hooohu commented 9 years ago

I'm working on this issue.

Issue #47

twood02 commented 9 years ago

I don't think there is much benefit from having your program act as a client/server---for now your goal should just be to measure the maximum performance of the lossy counter. If you have data being sent back and forth, that overhead will be included, so it will be difficult to tell the true performance. So you can just remove all of the socket related stuff and just have a single program that generates keys and inserts them into its own counter.

(I added a few comments to the code but then removed them once I realized what you were doing--if you got emails about them you can ignore them)

hooohu commented 9 years ago

The lossy counting algorithm is tested in my own computer for validating the program.

The test results are highly related to the alpha parameter of Zipf. It shows that when the distribution become uniform, the algorithm become slow. And when the alpha become higher, the algorithm runs much faster.

In the typical setting of alpha = 0.5, the counting speed in my computer is around 70000 counts per second.

Suppose we use a 10Gbps network interface, the maximum data is 1.25 Gbytes/s. If each packet includes 1250 bytes, we may have 1000000 keys to count in one second. If my program is right, it may be hard for a single machine to handle that counting task.

zhangwei1984 commented 9 years ago

LC_Output might be a bug. In if branch, lose the bracket. If the signal handler does the LC_Output, what will the problem be(In countermerge() middle, the result is totally wrong, or just not accurate?)? It is better that also makes s, error as the command line parameter.

twood02 commented 9 years ago

@hantasm - before I merge this I'd like you to make a few small changes.

First, add an option so that instead of having a timer interrupt that determines how long it runs for, you can also have the user enter a number of requests to perform. Then they could run something like ./lossy_test -n 10000000 -n 1000 -p 0.01 -a 1.5 -r 1 to have it run 10 million requests and then stop, rather than running for a set time interval.

Next, make it so that if you use -r=0, it will only calculate and print out the counter list once at the very end of the program before quitting (if -v is also specified).

Otherwise this looks good!

hooohu commented 9 years ago

This program is modified for running specified loops. The loop parameter is -l [loops].

Sample Output: ./lossy_test -l 1000000 -n 1000 -p 0.01 -a 0.5 -r 0 -v Item Count 1 6170
2 1172
3 8
8 3
16 2
32 1
49 1
56 1
332 1
344 1
Lossy speed tester parameters: Lossy phi : 0.010000 zipf N : 1000 zipf alpha : 0.500000 running time : 13.042656 output period : 0 total count: 1000000 counts per sec: 76671.498862

twood02 commented 9 years ago

excellent!