Closed Mewral closed 1 year ago
What do you mean by "different result"? Are you using the /query endpoint to find the most relevant item?
Yes, I'm using /query to find the most relevant item, but I get different top-k results for the same query every time I reload my database.
I believe this could be reasonable given the non-deterministic nature of the underlying HNSW algorithm.
Can you provide a small example showing the specific problem?
Yes, I think HNSW is the reason, but I haven't proved it yet. I'll post an example tomorrow. Thanks for the reply.
So basically, my annlite query endpoint returns top-10 relevant document indices like [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]. After I stop the flow and restart it, it returns [1, 2, 3, 4, 12, 6, 7, 8, 9, 10], and I didn't change any config. Perhaps it has something to do with my startup: the log says I'm building the annlite index from scratch every time I run my code. I did see a snapshot_path option for loading a previous index state, but I don't know how to set it.
HNSW will be rebuilt every time if you don't dump the index, so the results might be slightly different. Could you try dumping the indexer, then loading it and searching?
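To put a number on how much the results shift between restarts, the two top-k lists can be compared directly. This is only a minimal sketch: `topk_overlap` is a hypothetical helper, not part of annlite, and the two lists are the ones quoted earlier in this thread.

```python
def topk_overlap(before, after):
    """Compare two top-k ID lists returned for the same query.

    Hypothetical helper for illustration; not an annlite function.
    """
    before_set, after_set = set(before), set(after)
    shared = before_set & after_set
    return {
        # Fraction of the first run's IDs that also appear in the second run.
        "overlap": len(shared) / max(len(before_set), 1),
        # IDs that disappeared after the restart.
        "dropped": sorted(before_set - after_set),
        # IDs that appeared only after the restart.
        "added": sorted(after_set - before_set),
    }


# Top-10 lists reported in this thread, before and after restarting the flow.
run_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
run_2 = [1, 2, 3, 4, 12, 6, 7, 8, 9, 10]

report = topk_overlap(run_1, run_2)
print(report)
```

If the overlap stays high and only one or two IDs swap near the cutoff, that is consistent with the index being rebuilt rather than with a data or config problem.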
Yes, I'll try, thanks. I still have one more question. Assume Query-A's top-3 most relevant documents are [1, 2, 3] and Query-B's are [5, 6, 7]. I run Query-A and get the correct result [1, 2, 3], but Query-B returns the wrong result [5, 8, 7]. After reloading annlite without a dumped indexer, Query-A returns the wrong result [1, 4, 3] and Query-B returns the correct result [5, 6, 7]. They both change, so I haven't found a way to make them both right. Is this also an HNSW feature?
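One way to check which result is actually "correct" is to compute the exact top-k by brute force and measure how much of it the approximate search recovers. The sketch below uses only the standard library and makes several assumptions: cosine distance as the metric (the thread does not say which one is configured), random toy vectors in place of real embeddings, and hypothetical helper names `exact_topk` and `recall_at_k` that are not part of annlite.

```python
import math
import random


def cosine_distance(a, b):
    """Cosine distance between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def exact_topk(query, vectors, k):
    """Brute-force nearest neighbours: rank every document by distance."""
    ranked = sorted(vectors, key=lambda doc_id: cosine_distance(query, vectors[doc_id]))
    return ranked[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Toy data standing in for real embeddings: 50 documents, 8 dimensions.
rng = random.Random(0)
vectors = {doc_id: [rng.gauss(0, 1) for _ in range(8)] for doc_id in range(50)}
query = [rng.gauss(0, 1) for _ in range(8)]

ground_truth = exact_topk(query, vectors, k=3)

# Query-B from this thread: expected [5, 6, 7], returned [5, 8, 7].
query_b_recall = recall_at_k([5, 8, 7], [5, 6, 7])
print(ground_truth, query_b_recall)
```

A recall below 1.0 on one query in one run and on a different query in the next run is what an approximate index rebuilt from scratch would be expected to produce. If recall needs to be stable, reusing a dumped index or raising the search-time accuracy parameters are the usual levers.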
Never mind, I checked the documents and solved it. Closing this now.
Hi, I'm currently using the annlite indexer in a Jina flow to do text embedding search. Every time I start the flow, the annlite indexer gives me different results with the same data. Any idea what the problem is?