rush4ratio closed 2 years ago
Can you double check if the id and query id have the same type?
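To illustrate why the types matter, here is a minimal sketch. It assumes self-exclusion comes down to an equality comparison between the item id and the query id, which is a guess about the cause and not a confirmed description of the library's internals. The helper name `ids_comparable` is hypothetical.

```python
def ids_comparable(item_id, query_id) -> bool:
    """Hypothetical check: True when both identifiers have exactly the same type."""
    return type(item_id) is type(query_id)


# An integer id and its string form are never equal, so a comparison-based
# self-exclusion could not match them (assumption about how the check works).
print(1 == "1")
print(ids_comparable(1, "1"))
print(ids_comparable(42, 7))
```

If the two columns differ in type, casting one of them before querying would be the first thing to try.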
I am pretty sure this scenario is covered by a unit test, and you can also see it in action in this Google Colab notebook.
Apologies for not getting back to you sooner. It appears that if I set queryIdentifierCol to the ID column of interest, I no longer see the self ID included in the results. Was this an additional purpose of queryIdentifierCol?
Yes, you need to set the query column or it won't work. I should probably raise an error if you use excludeSelf without also providing it, since that combination would always be a mistake.
I agree that it's not obvious that the query column should be set for excludeSelf to work. If this requirement is not fulfilled, raising an error would help.
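For what it's worth, the safeguard could look roughly like the sketch below. This is only an illustration in plain Python: the function name `check_exclude_self` and its parameters are hypothetical and are not the library's actual code or API.

```python
from typing import Optional


def check_exclude_self(exclude_self: bool, query_identifier_col: Optional[str]) -> None:
    """Hypothetical guard: fail fast when excludeSelf is set without a query id column."""
    if exclude_self and not query_identifier_col:
        raise ValueError(
            "excludeSelf requires queryIdentifierCol to be set; "
            "without it the query row cannot be matched against the results"
        )


# Valid combinations pass silently.
check_exclude_self(exclude_self=False, query_identifier_col=None)
check_exclude_self(exclude_self=True, query_identifier_col="id")
```

Failing at configuration time would surface the problem before any query runs, instead of silently returning the self ID.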
I'll see if I can add that as a safeguard tonight and close this issue.
When I set excludeSelf to true, it still shows the self IDs in the results. Below is a sample (converted to a pandas dataframe) from what I'm using to experiment:
From above, the IDs are on the left, while the right contains a list of tuples (ID and distance, as returned from hnswlib). You'll notice the self IDs appearing in the results.
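As a stopgap, the self entries can be filtered out after the fact. The sketch below assumes each result is a list of (id, distance) tuples as described above. The helper name `drop_self` and the sample values are made up for illustration and are not taken from the actual data.

```python
from typing import Hashable, List, Tuple

Neighbor = Tuple[Hashable, float]


def drop_self(row_id: Hashable, neighbors: List[Neighbor]) -> List[Neighbor]:
    """Return the neighbors with any entry whose id equals row_id removed."""
    return [(nid, dist) for nid, dist in neighbors if nid != row_id]


# Illustrative data only: id 1 appears in its own neighbor list at distance 0.0.
sample = [(1, 0.0), (7, 0.12), (3, 0.4)]
print(drop_self(1, sample))  # [(7, 0.12), (3, 0.4)]
```

This only works when the row id and the neighbor ids have the same type, since the filter relies on equality.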