manticoresoftware / manticoresearch

Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
https://manticoresearch.com
GNU General Public License v3.0
9.1k stars 510 forks

Hybrid search #2079

Open mtruyens opened 7 months ago

mtruyens commented 7 months ago

In your interesting blog post https://manticoresearch.com/blog/vector-search-in-databases/ you referred to Reciprocal Rank Fusion (RRF).

It would be really great to implement this, so that we could re-rank vector-based (KNN) searches with sparse-based searches!

sanikolaev commented 1 month ago

After discussing with the team, here's the syntax we've come up with:

SQL:

select ... where hybrid(
  knn(...), 
  match(...), 
  {
    vector_weight=0.5, 
    fusion_method=rrf
  }
)

(the options block in braces is optional)

or

select ... where hybrid(
  knn(...), 
  match(...), 
  0.5 as vector_weight,
  'rrf' as fusion_method
)

depending on what's easier in terms of parsing and avoiding conflicts with other statements.
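
For readers unfamiliar with RRF, here is a minimal client-side sketch of what a weighted fusion of the two result lists could look like. This is not Manticore code and not a committed design. The function name `weighted_rrf`, the smoothing constant `k=60` (the value commonly used for RRF), and the reading of `vector_weight` as "the KNN list gets `vector_weight`, the full-text list gets `1 - vector_weight`" are all assumptions made for illustration.

```python
from collections import defaultdict


def weighted_rrf(knn_ids, text_ids, vector_weight=0.5, k=60):
    """Fuse two ranked lists of document ids with weighted Reciprocal Rank Fusion.

    Assumption: the KNN list is weighted by vector_weight and the
    full-text list by (1 - vector_weight). k is the usual RRF smoothing
    constant; 60 is a conventional default, not a Manticore setting.
    """
    weights = (vector_weight, 1.0 - vector_weight)
    scores = defaultdict(float)
    for weight, ranking in zip(weights, (knn_ids, text_ids)):
        # Ranks are 1-based: the top hit contributes weight / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical document ids returned by each sub-query, best match first.
knn_hits = [3, 1, 7]
text_hits = [1, 9, 3]

fused = weighted_rrf(knn_hits, text_hits)
knn_only = weighted_rrf(knn_hits, text_hits, vector_weight=1.0)
```

With equal weights, documents that appear in both lists (1 and 3) rise above those found by only one sub-query. With `vector_weight=1.0` the order of the KNN list is preserved and full-text-only hits sink to the bottom.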

JSON (example):

POST .../search
{
    ...
    "knn":
    {
        "field": "image_vector",
        "query_vector": [0.286569,-0.031816,0.066684,0.032926],
        "k": 5,
        "full-text": {
          <full-text query goes here>
        },
        "filter":
        {
            "bool":
            {
                "must":
                [
                    { "match": {"_all":"white"} },
                    { "range": { "id": { "lt": 10 } } }
                ]
            }
        }
    }
    ...
}

Notes: