eoriont opened this issue 10 months ago
Here is a link https://escholarship.org/content/qt1rp889r9/qt1rp889r9_noSplash_7071690a1d8a4ee71eb95432887d3a8e.pdf to a paper that contains various optimizations for the HNSW algorithm. We can look at this for inspiration.
Here is a list of potential optimizations to add on top of the base Hierarchical Navigable Small Worlds (HNSW) algorithm.
Parallel Computing: The HNSW algorithm can be optimized by implementing parallel computing, as shown in the thesis paper linked above. However, it would be hard to explain and discuss this in a mathematical manner.
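Just to illustrate the idea (not the thesis's actual graph-level parallelism): distance evaluations against a set of candidates are independent of each other, so they can be farmed out to a worker pool. This is a toy sketch in pure Python; function names are mine, and note that Python threads won't actually speed up CPU-bound math because of the GIL.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def parallel_nearest(query, points, workers=4):
    # Evaluate the distance to each candidate point in parallel,
    # then take the argmin. Real HNSW parallelism works at the
    # graph construction/search level; this only shows the shape.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        dists = list(pool.map(lambda p: euclidean(query, p), points))
    return points[dists.index(min(dists))]
```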
Distance Metric: We could improve the HNSW algorithm by using distance metrics other than just Euclidean distance. We could explore metrics like Manhattan distance, Cosine distance, Chebyshev distance, and Minkowski distance, and compare the runtimes between them. This is probably a good optimization to explore, because each of these metrics is easy to explain mathematically. Here is a blog post that gives a basic explanation of each of them.
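For reference, here's a quick pure-Python sketch of each metric (function names are mine). Note that Minkowski with p=1 is Manhattan, p=2 is Euclidean, and p→∞ approaches Chebyshev, which is a nice way to tie them together mathematically.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    # largest coordinate-wise difference
    return max(abs(x - y) for x, y in zip(a, b))

def minkowski(a, b, p):
    # generalizes Manhattan (p=1) and Euclidean (p=2)
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def cosine_distance(a, b):
    # 1 minus the cosine of the angle between a and b
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (na * nb)
```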
Here is another distance optimization. I highly doubt we would actually be able to implement it, because it appears to be very complicated and can't be added by just swapping out the distance function. We could, however, mention it under future extensions, as another potential optimization to explore. It works by approximating the distance itself, which is acceptable since HNSW is already an approximate algorithm. It is discussed in this paper
https://esteininger.medium.com/building-a-vector-search-engine-using-hnsw-and-cosine-similarity-753fb5268839 Here is another resource I used; make sure to add it to the references section.
Got partially through adding Cosine Distance to the Overleaf document (line 111). I need to brush up on the other distance metrics we thought about and have employed so far. Finished adding the references, though.
As for advantages and disadvantages, are there any obvious ones that stood out when running those trials? The only thing I managed to see was the O(log n) time complexity, which doesn't seem specific to Cosine distance. I'm also not sure how comparing vectors by angle would make the algorithm more accurate, though I haven't been able to dedicate much time to it.
Just to clarify: Cosine is Cosine and l2 is Euclidean, correct? If so, what is ip distance, and will we be using Manhattan distance?
Alright, ignore the previous comment. I got through explaining each one of them and their formulas, though I need to think about advantages and disadvantages. I got some help from this, so the author might have some useful stuff here too. https://medium.com/@junxie2/semantic-search-similarity-metrics-e215a0d65885
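To sum up the naming question: yes, in ANN libraries "l2" typically means Euclidean (often the squared form internally) and "ip" means inner-product distance, i.e. 1 minus the dot product. Here's a quick sketch (pure Python, my own function names) showing how the three relate; for unit-length vectors, ip equals cosine distance and squared l2 is exactly twice the cosine distance.

```python
import math

def l2_distance(a, b):
    # "l2" = Euclidean distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ip_distance(a, b):
    # "ip" = inner-product distance: 1 minus the dot product
    return 1 - sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (na * nb)
```

So on normalized data the three orderings agree, which is why some libraries implement cosine search by normalizing vectors and using ip.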
https://github.com/erikbern/ann-benchmarks
Here is a list of KNN algorithms, all benchmarked for us. We could probably cite these benchmarks in the optimizations section.
https://publications.hse.ru/mirror/pubs/share/folder/x5p6h7thif/direct/128296059 (HNSW paper)
https://arxiv.org/pdf/1603.09320.pdf (another HNSW paper)
https://github.com/brtholomy/hnsw (HNSW explanation)
https://www.nature.com/articles/30918 (Small worlds)
https://ethen8181.github.io/machine-learning/deep_learning/multi_label/nsw.html (NSW tutorial)
https://fasttext.cc/docs/en/python-module.html#train_supervised-parameters (docs for ngram model used in nsw tut)
Quick question, where did you find out that Inner Product might have a closer vector than itself? Was it through just working with the HNSW implementation we have or was there a source to it?
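For what it's worth, this is easy to see with a toy example: under inner-product similarity, a vector is not necessarily its own nearest neighbor, because any longer vector pointing in the same direction scores a higher dot product. A minimal sketch (pure Python):

```python
def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

q = (1.0, 1.0)
longer = (2.0, 2.0)  # same direction as q, but twice the length

# q · q = 2, while q · longer = 4, so under inner-product
# similarity the query's own copy is NOT the top result
closer_than_self = inner_product(q, longer) > inner_product(q, q)
```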
https://www3.nd.edu/~dgalvin1/60610/60610_S09/60610graphnotation.pdf (graph notation)
Basically, Google some papers on HNSW optimizations, then copy the links and post them here with a short summary of each.