Closed unmeshvrije closed 4 years ago
The optimal parameters depend on your dataset and use case. If you want perfect accuracy, and storage space and search time are not an issue, you should use IndexFlatL2 alone, since it performs an exact (brute-force) search.
Thank you @beauby for the answer. Here is what I am trying to do: I have knowledge graph (KG) embeddings trained with the TransE method. For a KG of a university, the triple (student1, studies, ?) is a query for which I aim to predict the tail (the head is student1 and the relation is studies). The TransE model assigns an embedding (a vector of N dimensions) to every entity and relation such that, for a true triple (student1, studies, subject1), the embedding of student1 plus the embedding of studies is almost equal to the embedding of subject1.
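To make the query concrete, here is a minimal sketch of tail prediction as an exact nearest-neighbour lookup. The 2-dimensional embeddings, the extra entities (subject2, professor1) and the helper name predict_tail are hypothetical values made up for illustration; real TransE embeddings would come from a trained model.

```python
import math

# Hypothetical 2-dimensional embeddings, chosen by hand for illustration only.
entity_emb = {
    "student1": [0.0, 0.0],
    "subject1": [1.0, 1.0],
    "subject2": [3.0, 0.0],
    "professor1": [-2.0, 1.0],
}
relation_emb = {"studies": [0.9, 1.1]}


def predict_tail(head, relation, k=1):
    """Rank all entities by L2 distance to head + relation (exact search)."""
    query = [h + r for h, r in zip(entity_emb[head], relation_emb[relation])]
    ranked = sorted(entity_emb, key=lambda name: math.dist(query, entity_emb[name]))
    return ranked[:k]


print(predict_tail("student1", "studies"))
```

Every entity is compared against the query vector, so the ranking is exact. This is the baseline that any approximate index would be measured against.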
I want to compare the performance of TransE with approximate nearest neighbour search and thus want to use the parameters that give the most accurate results. I was not sure whether changing the number of centroids or nprobe would affect accuracy.
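As a rough illustration of why these two parameters matter, the sketch below builds a toy inverted-file index in plain Python and measures recall against exact search. It is not Faiss code: the names NLIST, ivf_search and recall are hypothetical, the data is random, and the centroids are simply sampled data points rather than the output of k-means training.

```python
import math
import random

random.seed(0)
DIM, N, NLIST, K = 4, 500, 16, 5

vectors = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N)]
# Simplification: centroids are sampled data points, not k-means output.
centroids = random.sample(vectors, NLIST)


def nearest_ids(query, points, ids, k):
    """Return the k ids whose points are closest to query (ties broken by id)."""
    return sorted(ids, key=lambda i: (math.dist(query, points[i]), i))[:k]


# Build the inverted lists: each vector is stored in its nearest centroid's cell.
cells = [[] for _ in range(NLIST)]
for i, v in enumerate(vectors):
    cells[nearest_ids(v, centroids, range(NLIST), 1)[0]].append(i)


def exact_search(query, k=K):
    """Brute-force search over every vector."""
    return nearest_ids(query, vectors, range(N), k)


def ivf_search(query, nprobe, k=K):
    """Scan only the cells of the nprobe centroids closest to the query."""
    probed = nearest_ids(query, centroids, range(NLIST), nprobe)
    candidates = [i for c in probed for i in cells[c]]
    return nearest_ids(query, vectors, candidates, k)


def recall(nprobe, queries):
    """Fraction of the exact top-K neighbours that the probed search also returns."""
    hits = sum(len(set(ivf_search(q, nprobe)) & set(exact_search(q))) for q in queries)
    return hits / (K * len(queries))


queries = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(50)]
for nprobe in (1, 4, NLIST):
    print(nprobe, round(recall(nprobe, queries), 3))
```

When nprobe equals the number of cells, every vector is scanned and the result matches exact search. Smaller values scan fewer candidates and can miss true neighbours that fall in an unprobed cell, so both the number of centroids and nprobe do influence accuracy in this kind of index.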
Please let me know if the use case is still not clear.
no activity, closing.
Greetings,
I am using the following code to build the indexes (the code is inspired by the tutorials). Are these values optimal for getting the best results in terms of accuracy?