When there is a large number of duplicate items, the recursive algorithm goes into an infinite loop: instead of the duplicates being distributed evenly at a level, they are all allocated to the first item in the matrix.
The current fix is not very efficient:
Line ~96 in nearest_search within matrix_distance.py
if scores.sum() < 0.0001 and len(scores) > 0:
    # they are all practically the same
    # we have to do this to prevent infinite recursion
    # TODO: would love an alternative solution
    arg_index = np.random.choice(len(scores), k, replace=False)
else:
    arg_index = np.argsort(scores)[:k]
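
One possible alternative, offered as a rough sketch rather than a tested fix for matrix_distance.py: sort by score with a random secondary key, so ties are broken at random and there is no need for a separate "all scores are practically zero" branch. The helper name `top_k_with_random_ties` and the `rng` parameter are hypothetical and not part of the existing code. It assumes `scores` is a 1-D array, and it clamps `k` to the number of scores, which the current `np.random.choice(..., replace=False)` call does not do.

```python
import numpy as np


def top_k_with_random_ties(scores, k, rng=None):
    """Return indices of the k smallest scores, breaking ties at random.

    Hypothetical helper: the name and signature are assumptions,
    not taken from matrix_distance.py.
    """
    rng = np.random.default_rng() if rng is None else rng
    scores = np.asarray(scores)
    k = min(k, len(scores))  # guard against asking for more items than exist

    # Random secondary key: only matters where scores are exactly equal.
    tie_breaker = rng.random(len(scores))

    # np.lexsort treats the LAST key as the primary sort key,
    # so this sorts by score first and by tie_breaker among equal scores.
    order = np.lexsort((tie_breaker, scores))
    return order[:k]
```

With this, `arg_index = top_k_with_random_ties(scores, k)` would replace the whole if/else. Note that it only randomises exact ties; near-duplicates with tiny non-zero distances would still be ordered by score, so the `< 0.0001` threshold may still be wanted if those cases also trigger the recursion.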