Closed amirouche closed 11 months ago
Thanks. Discussion on the PR.
Give there is low number of points at the highest level, randomizing at every query could help. Another kind index for the highest layer may be even more interesting in some use-cases.
I am more interested in bigger than memory indices.
Closing this issue.
I am getting into vector space, so take what I write with a grain of salt 👋🏽
There is no fixed entry / enter point in HNSW, not even at the highest layer / level:
I can't find the algorithm in the paper that documents how to insert the first point. Tho, the insert algorithm reads that the associated level $l$ where it appears first in the HNSW layers is not necessarily the highest level $L$:
Quoting HNSW:
The last line is what triggered me to fill the issue.
https://github.com/ekzhu/datasketch/blob/c98c145417965d015103f2a1bfcf59658e90b470/datasketch/hnsw.py#L533-L537
The last line
self._entrypoint = key
can be done outside thefor
but see my previous comments.