elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Support for dimension shrinking in dense vector fields #111745

Open jimczi opened 1 month ago

jimczi commented 1 month ago

Description

Currently, if the number of dimensions of an input vector doesn't match the dims configured on a dense vector field, an error is thrown. It would be helpful to support automatic dimension shrinking for models trained with Matryoshka representation learning (which preserves meaning when vectors are truncated to fewer dimensions), without requiring the truncation to be done offline. This feature would be particularly useful in scenarios like inference API usage, where models such as those provided by OpenAI offer flexible dimensions for indexing.

For example, if a model outputs 1024 dimensions, users could define a mapping like this:

PUT my-index
{
  "mappings": {
    "properties": {
      "emb_short": {
        "type": "dense_vector",
        // using the first 384 dimensions
        "dims": 384,
        "copy_to": "emb_full"
      },
      "emb_full": {
        "type": "dense_vector",
        "dims": 1024,
        "index_options": {
          "type": "int8_hnsw"
        }
      }
    }
  }
}
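
For illustration, a sketch of how indexing might look under this proposal (the document is made up; the shrinking behavior described here is the proposed one, not current Elasticsearch behavior):

POST my-index/_doc/1
{
  // full 1024-dim model output: emb_short would silently keep only the
  // first 384 values, while copy_to routes the full vector to emb_full
  "emb_short": [...]
}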

Then at query time, the emb_short field can be used for approximate nearest neighbor search through the HNSW index, and the results can be rescored using the emb_full field to improve recall.

The potential downside is that it would no longer be possible to detect whether the number of dimensions was misconfigured for the emb_short field in this example. However, we could still throw an error if the input vector has fewer dimensions than the value configured in the mapping.
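
For reference, a sketch of that two-stage search, assuming the mapping above (the script_score clause and its sigmoid wrapping are illustrative choices to keep the rescore score positive, not part of the proposal):

POST my-index/_search
{
  "query": {
    // first pass: ANN over the 384-dim field
    "knn": {"field": "emb_short", "query_vector": [...]}
  },
  "rescore": {
    "window_size": 50,
    "query": {
      "rescore_query": {
        "script_score": {
          "query": {"match_all": {}},
          "script": {
            // second pass: exact dot product over the full 1024 dimensions
            "source": "double v = dotProduct(params.query_vector, 'emb_full'); return sigmoid(1, Math.E, -v);",
            "params": {"query_vector": [...]}
          }
        }
      }
    }
  }
}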


elasticsearchmachine commented 1 month ago

Pinging @elastic/es-search-relevance (Team:Search Relevance)

benwtrent commented 1 month ago

@jimczi something else that would help with this is a slice parameter. The idea behind "slice" is that Matryoshka embeddings would be interesting to use in the following manner:

PUT my-index
{
  "mappings": {
    "properties": {
      "emb_head": {
        "type": "dense_vector",
        "slice": {"from": 0, "to": 384},
        "fields": {
          "emb_tail": {
            "type": "dense_vector",
            "slice": {"from": 384, "to": 1024},
            "index": false
          }
        }
      }
    }
  }
}
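
To make the semantics concrete, a sketch of a document under this mapping (assuming the hypothetical slice parameter behaves as described):

POST my-index/_doc/1
{
  // one 1024-dim vector: emb_head would index dims [0, 384) in HNSW,
  // while the emb_tail sub-field stores dims [384, 1024) unindexed
  "emb_head": [...]
}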

Since dot-product values can just be summed, you end up with:

POST my-index/_search
{
  "query": {
    "knn": {"field": "emb_head", "query_vector": [...]}
  },
  "rescore": {
    "window_size": 50,
    "query": {
      "rescore_query": {
        "script_score": {
          "query": {"match_all": {}},
          "script": {
            "source": """
              // exact dot product over the tail dims stored in the unindexed sub-field
              double value = dotProduct(params.query_vector, 'emb_head.emb_tail');
              // sigmoid(1, e, -v) = 1 / (1 + e^-v): maps the dot product to a positive score
              return sigmoid(1, Math.E, -value);
            """,
            "params": {
              "query_vector": [...]
            }
          }
        }
      },
      "query_weight": 1.0,
      "rescore_query_weight": 1.0,
      "score_mode": "sum"
    }
  }
}

So, you have the whole vector in _source just once, run the knn query over the first slice, and simply add the tail's contribution, since the dot product is just the sum of the per-dimension products.
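
Spelled out, the decomposition being relied on is just the split of the dot-product sum at dimension 384:

$$\langle q, d \rangle \;=\; \sum_{i=0}^{1023} q_i d_i \;=\; \underbrace{\sum_{i=0}^{383} q_i d_i}_{\texttt{emb\_head}} \;+\; \underbrace{\sum_{i=384}^{1023} q_i d_i}_{\texttt{emb\_head.emb\_tail}}$$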

jimczi commented 1 month ago

Multi-field dense vectors seem like a much better fit for this use case, @benwtrent! We should definitely add support for them if we allow dimension shrinking/slicing. I like the slice idea too; it's more flexible at the expense of a new parameter in the mapping.