I am merging two benchmarks. More specifically:
1) I fix parameter settings for NMSLIB
2) Instead of a single limit parameter, I introduce two separate ones: test_size and test_data
3) I try to consistently include all the parameters in the names of caches and indices (see the sketch after this list)
4) I correctly attribute the script as being based on the ann_benchmarks code. It also seems that the SIMD distance computation is based on NMSLIB code, but no attribution is given in the code.
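To illustrate items 2) and 3), here is a minimal sketch with hypothetical helper names (not the actual n2_benchmark code); my reading of the file-name fields is based on the attached result names and is an assumption:

```python
import numpy as np

def split_dataset(data, test_size):
    # Hold out the last `test_size` points as queries; index the rest
    # (hypothetical helper; the real script may split differently).
    return data[:-test_size], data[-test_size:]

def result_name(dataset, data_size, test_size, run_id, suffix="pdf"):
    # Encode every parameter that affects the result in the file name,
    # in the spirit of names like "sift-5000000-2000-3.pdf" attached below.
    return f"{dataset}-{data_size}-{test_size}-{run_id}.{suffix}"

data = np.random.rand(10000, 128).astype(np.float32)   # stand-in for a real dataset
train, queries = split_dataset(data, test_size=2000)
print(train.shape, queries.shape)                       # (8000, 128) (2000, 128)
print(result_name("sift", len(data), len(queries), 3))  # sift-10000-2000-3.pdf
```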
Unfortunately, Annoy fails for youtube with a weird error:
INFO:n2_benchmark:(998000, 128) (2000, 128)
INFO:n2_benchmark:Built index in 0.06020975112915039
terminate called after throwing an instance of 'std::length_error'
what(): vector::_M_range_insert
Sorry, I have no time to look into this. I hope you can find the problem.
I ran the benchmarks on my machine. For youtube I had to limit the data set to 5M points, but I don't think this makes a big difference. While n2 takes 25-30% less time to build the index, I see virtually no difference in query times. From an engineering perspective, n2 is certainly better (e.g., it has a clearer implementation and is lightweight), but it does not seem to be better in terms of query times. Also, keep in mind that you cannot show people a single data point (as is done on your main page); you have to show complete recall/efficiency curves.
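To make the last point concrete, here is a rough sketch of how such a curve is obtained: sweep a query-time parameter (e.g., HNSW's ef_search) and record a (recall, QPS) pair per setting. The search_fn callable stands in for the library call (for n2 I believe it is index.search_by_vector, but treat that as an assumption):

```python
import time

def recall_at_k(approx_ids, exact_ids, k):
    # Fraction of the true k nearest neighbors recovered by the approximate search.
    hits = sum(len(set(a[:k]) & set(e[:k])) for a, e in zip(approx_ids, exact_ids))
    return hits / (k * len(exact_ids))

def sweep(search_fn, queries, exact_ids, k=10, ef_values=(10, 20, 50, 100, 200, 400)):
    # search_fn(query, k, ef) -> list of candidate ids; ef is the swept parameter.
    curve = []
    for ef in ef_values:
        start = time.time()
        approx_ids = [search_fn(q, k, ef) for q in queries]
        qps = len(queries) / (time.time() - start)
        curve.append((recall_at_k(approx_ids, exact_ids, k), qps))
    return curve  # plot recall (x) vs. QPS (y) to show the full trade-off
```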
Thanks!
sift-5000000-2000-3.pdf youtube-5000000-2000-3.pdf glove-5000000-2000-3.pdf