einot / gross

Automatically exported from code.google.com/p/gross

grossd dies when filter_bits > 24 #78

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hi,

I'm trying to run grossd with the following configuration (I have a
high-volume site, and with the default configuration I have a lot of false
positives):

filter_bits = 26
number_buffers = 384
rotate_interval = 900

But after a while, grossd crashes.

This happened on both Linux (RHEL 4) and FreeBSD 7, with both version
1.0.1 and today's SVN version.

Original issue reported on code.google.com by imriz...@gmail.com on 1 Dec 2008 at 10:22

GoogleCodeExporter commented 9 years ago
The values you are trying to use make no sense. Try something like 

number_buffers=32
rotate_interval=10800

With those values you get the same retention as with the values you submitted.

Original comment by eino.tuominen@gmail.com on 1 Dec 2008 at 11:01
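
The retention claim in the comment above can be checked with a little arithmetic. This is a minimal sketch, assuming retention is simply number_buffers multiplied by rotate_interval (in seconds); the helper name `retention_seconds` is made up for illustration and is not part of gross.

```python
def retention_seconds(number_buffers, rotate_interval):
    """Assumed model: total retention = number of buffers * rotation interval."""
    return number_buffers * rotate_interval

# Values from the original report and from the suggested configuration.
submitted = retention_seconds(384, 900)
suggested = retention_seconds(32, 10800)

print(submitted, suggested)  # 345600 345600
print(submitted / 86400)     # 4.0 (days)
```

Under that assumption both configurations keep entries for the same four days; only the number of buffers differs.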

GoogleCodeExporter commented 9 years ago
The problem is that I get false positives, a lot of them.

Original comment by imriz...@gmail.com on 1 Dec 2008 at 11:15

GoogleCodeExporter commented 9 years ago
Correct me if I'm wrong, but the configuration you've suggested means that I will
have fewer buffers (i.e. less "space" for triplets), but each buffer will stay in
memory for a longer time. Therefore, if I have a lot of triplets, they will just fill
up the buffers.

Original comment by imriz...@gmail.com on 1 Dec 2008 at 11:18

GoogleCodeExporter commented 9 years ago
Well, you are wrong. ;-) Collision probability is a function of retention time,
filter size and query frequency.
Retention time is the product of number_buffers and rotate_interval.

Join the mailing list and ask for help there, thanks.

Original comment by eino.tuominen@gmail.com on 1 Dec 2008 at 11:23
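
To illustrate how collision probability can depend on retention time, filter size and query frequency, here is a rough sketch using the textbook Bloom-filter false-positive estimate. The thread does not confirm that grossd's filters follow this exact model, and the hash count `num_hashes`, the function name and the query rate below are all assumptions chosen for illustration only.

```python
import math

def false_positive_estimate(filter_bits, retention_seconds,
                            queries_per_second, num_hashes=8):
    """Textbook Bloom-filter estimate: (1 - e^(-k*n/m))^k.

    m = 2**filter_bits bits, n = entries retained, k = num_hashes.
    All of this is an assumed model, not grossd's actual implementation.
    """
    m = 2 ** filter_bits
    n = retention_seconds * queries_per_second
    return (1.0 - math.exp(-num_hashes * n / m)) ** num_hashes

# Hypothetical load of 10 new triplets per second, 4 days of retention.
for bits in (24, 26, 28):
    print(bits, false_positive_estimate(bits, 345600, 10))
```

In this model the estimate depends only on the total retention time, not on how it is split between number_buffers and rotate_interval, while each extra filter bit doubles the filter size and lowers the estimate.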

GoogleCodeExporter commented 9 years ago

Original comment by eino.tuominen@gmail.com on 19 Dec 2008 at 12:11

GoogleCodeExporter commented 9 years ago
Just for the record [from a message on gross mailing list]:

I found and fixed the bug. There were some parentheses missing, which caused an
integer overflow, which caused grossd to reserve too little memory for filters.
The fixed version is in the svn gross-1.0 branch.

The problem was triggered iff

(number_buffers + 1) * 2^filter_bits > 2^31

Original comment by eino.tuominen@gmail.com on 18 Sep 2009 at 6:26
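
The trigger condition from the last comment can be checked for a given configuration before starting an affected grossd version. This is a small sketch that reads the threshold as 2^31 (the limit of a signed 32-bit integer); the function name `triggers_overflow` is hypothetical, and the third configuration is just an example, not a documented default.

```python
def triggers_overflow(number_buffers, filter_bits):
    """True if (number_buffers + 1) * 2^filter_bits exceeds 2^31."""
    return (number_buffers + 1) * 2 ** filter_bits > 2 ** 31

print(triggers_overflow(384, 26))  # True  (configuration from the report)
print(triggers_overflow(32, 26))   # True  (suggested buffers, same filter_bits)
print(triggers_overflow(8, 24))    # False (example small configuration)
```

Note that with filter_bits = 26 the condition already holds once number_buffers reaches 32, since 33 * 2^26 is greater than 2^31.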