In the approximate search of coconut.c and coconut_plus.c, the code scans the nearest records (as many as the leaf node size) and returns the best match. If the dataset is static, this method is acceptable.
However, when new records are inserted (which is normal in practice), a full leaf node of the B+-tree splits into two new leaf nodes. Leaf nodes in a B+-tree are not necessarily sequential on disk (they are connected to each other by disk pointers), and after splits some nodes are no longer full (they are only guaranteed to be at least half full). In this situation, the minimal distance, recall rate, etc. will be worse than they are now.
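To make the concern concrete, here is a minimal sketch in C. It is not the code from coconut.c or coconut_plus.c: the `leaf` struct, `best_in_leaf`, and `best_with_budget` are hypothetical names, and a single float per record stands in for a real series summary. It only assumes that leaves are chained by sibling pointers and may hold fewer records than their capacity after a split.

```c
#include <stddef.h>
#include <float.h>

/* Hypothetical leaf layout, NOT the structs used in coconut.c. */
typedef struct leaf {
    const float *keys;   /* one scalar per record (stand-in for a summary) */
    size_t count;        /* records actually stored; can be below capacity after a split */
    struct leaf *next;   /* sibling pointer: leaves need not be contiguous on disk */
} leaf;

static float abs_diff(float a, float b)
{
    float d = a - b;
    return d < 0 ? -d : d;
}

/* Follow sibling pointers until `budget` records have been examined
 * or the chain ends. Returns the smallest distance seen (FLT_MAX if none). */
static float best_with_budget(const leaf *start, float query,
                              size_t budget, size_t *examined)
{
    float best = FLT_MAX;
    size_t seen = 0;
    for (const leaf *l = start; l != NULL && seen < budget; l = l->next) {
        for (size_t i = 0; i < l->count && seen < budget; i++, seen++) {
            float d = abs_diff(l->keys[i], query);
            if (d < best)
                best = d;
        }
    }
    if (examined != NULL)
        *examined = seen;
    return best;
}

/* Scan only the leaf the query lands in. If that leaf is half full,
 * only half a leaf's worth of candidates is examined. */
static float best_in_leaf(const leaf *start, float query, size_t *examined)
{
    leaf only = { start->keys, start->count, NULL };
    return best_with_budget(&only, query, start->count, examined);
}
```

In this sketch, passing the leaf capacity as `budget` keeps the number of examined candidates constant even when the landing leaf is only half full, at the cost of extra pointer-chasing reads. Whether that trade-off suits Coconut is exactly the question I am raising.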
Waiting for your replies. :)