HNSW Support for Vector Search

loretoparisi commented 1 year ago

Is your feature request related to a problem? Please describe. Indexing vectors of embeddings along with the document. Optionally support multiple vectors per document and retrieval.

Describe the solution you'd like Add HNSW as vector similarity search

Describe alternatives you've considered

OpenSearch >= 8
Vespa
Vector Stores (Pinecone, ChromaDB, etc.)

Additional context Integration of semantic and similarity search with keyword-based search.

sanikolaev commented 10 months ago

The following SQL syntax is proposed for the new field:

<field name> 
  float_vector 
    [knn_type='hnsw'
      knn_dims='int'
      knn_similarity={l2|ip|cosine}
      [hnsw_m='int']
      [hnsw_ef_construction='int']
    ]

knn_type is not mandatory. If no knn_* option is specified, the field remains just an array of floats.
knn_type is turned on automatically if knn_similarity or knn_dims is specified. The default is hnsw.
knn_dims and knn_similarity are required if knn_type='hnsw'
hnsw_m and hnsw_ef_construction are optional
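
The rules above can be sketched as a small validator. This is a minimal illustration in Python, not Manticore's actual parser: the function name `resolve_knn_options` and the dict-based representation of field options are hypothetical, and rejecting hnsw_* options when no knn_* option is present is an assumption the proposal does not spell out.

```python
ALLOWED_SIMILARITY = {"l2", "ip", "cosine"}
KNN_KEYS = {"knn_type", "knn_dims", "knn_similarity", "hnsw_m", "hnsw_ef_construction"}


def resolve_knn_options(options):
    """Apply the proposed rules to a dict of field options.

    Returns None for a plain array of floats, otherwise a normalized dict.
    Hypothetical helper: it illustrates the proposal, not Manticore's code.
    """
    unknown = set(options) - KNN_KEYS
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")

    # No knn_* option at all: the field stays a plain array of floats.
    if not any(key.startswith("knn_") for key in options):
        if options:
            # Assumption: hnsw_* tuning without knn_* is treated as an error.
            raise ValueError("hnsw_* options require knn to be enabled")
        return None

    resolved = dict(options)
    # knn_type is implied by knn_dims / knn_similarity and defaults to hnsw.
    resolved.setdefault("knn_type", "hnsw")
    if resolved["knn_type"] != "hnsw":
        raise ValueError("only knn_type='hnsw' is covered by the proposal")

    for required in ("knn_dims", "knn_similarity"):
        if required not in resolved:
            raise ValueError(f"{required} is required when knn_type='hnsw'")
    if resolved["knn_similarity"] not in ALLOWED_SIMILARITY:
        raise ValueError("knn_similarity must be one of l2, ip, cosine")

    # Integer options are written as quoted strings in the proposed syntax.
    for key in ("knn_dims", "hnsw_m", "hnsw_ef_construction"):
        if key in resolved:
            resolved[key] = int(resolved[key])
    return resolved
```

For instance, `resolve_knn_options({"knn_dims": "128", "knn_similarity": "l2"})` fills in `knn_type` as `hnsw`, while passing only `knn_dims` raises an error because `knn_similarity` is missing.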

Examples:

create table t(a float_vector) - just an array of floats
create table t(a float_vector knn_dims='128' knn_similarity='l2') - simplest syntax to enable knn
create table t(a float_vector knn_type='hnsw' knn_dims='128' knn_similarity='l2') - alternative syntax mostly for the future when knn_type can be e.g. annoy
create table t(a float_vector knn_type='hnsw' knn_dims='16' knn_similarity='ip' hnsw_m='16') - fine-tuning
create table t(a float_vector knn_type='hnsw' knn_dims='16' knn_similarity='ip' hnsw_m='20' hnsw_ef_construction='90') - more fine-tuning

@glookka pls review and let me know if it looks good or if I'm missing something and there are better options.

glookka commented 10 months ago

knn_similarity={l2|ip|cosine} option is specific to HNSW. E.g. annoy has "angular", "euclidean", "manhattan", "hamming", or "dot". So it probably makes sense to name the option hnsw_similarity.
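
For reference, the three measures named in the proposal (l2, ip, cosine) can be written out in plain Python. This is a sketch of the textbook definitions with illustrative function names; the exact form an HNSW implementation uses internally (for example, squared L2 or a distance derived from the inner product) may differ and is not specified in this thread.

```python
import math


def _check_dims(a, b):
    # Both vectors must have the same number of dimensions (knn_dims).
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")


def l2_distance(a, b):
    """Euclidean distance: smaller means more similar."""
    _check_dims(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Dot product: larger means more similar."""
    _check_dims(a, b)
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product of the normalized vectors, in the range [-1, 1]."""
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(a, b) / norm
```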

sanikolaev commented 9 months ago

Closing as done - https://manual.manticoresearch.com/dev/Searching/KNN#K-nearest-neighbor-search.

manticoresoftware / manticoresearch

HNSW Support for Vector Search #1415