High performance impact of data retrieval

Hi, I'm using ODFE 1.12.0 with ES 7.10.0. I have a question regarding performance benchmarking. I have a cosine similarity index with 4 million documents and 100-dimensional vectors (approximate search). The parameters are: M=48, ef_search=1024, ef_construction=1024. I followed all the advice for performance tuning: merge to one segment, retrieve no fields in the query, warm up the index. With this I'm getting around 100 ms per query for k=10,000. I think this is quite good, and it is more than 40x faster than exact search.

Fast query:

{
  "stored_fields": "_none_",
  "docvalue_fields": ["_id"],
  "size": 10000,
  "query": {
    "knn": {
      "sem_vector": {
        "vector": query_vec,
        "k": 10000
      }
    }
  }
}

However, I also want to use ES as a data store in my use case, so I want to retrieve more fields from the documents, not just IDs. As soon as I add even one field to be retrieved to the search request, query time rises to 2.5 sec, which is 25x slower. Do you have any idea how to avoid this? The fields I retrieve are text, integer, and date fields. Slow query:

{
  "_source": "required_field",
  "size": 10000,
  "query": {
    "knn": {
      "sem_vector": {
        "vector": query_vec,
        "k": 10000
      }
    }
  }
}
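
To make the comparison reproducible, here is a minimal Python sketch of how the two request bodies above could be built and timed side by side. The helper names `build_knn_body` and `median_latency_ms` are hypothetical, and the `send` callable is an assumption: it stands for whatever function posts a body to the index's `_search` endpoint in your setup. Nothing here comes from the k-NN plugin itself.

```python
import json
import statistics
import time


def build_knn_body(query_vec, k=10000, source_field=None):
    """Build a k-NN search body; source_field=None mirrors the fast query."""
    body = {
        "size": k,
        "query": {"knn": {"sem_vector": {"vector": list(query_vec), "k": k}}},
    }
    if source_field is None:
        # Fast variant: skip stored fields and return only _id.
        body["stored_fields"] = "_none_"
        body["docvalue_fields"] = ["_id"]
    else:
        # Slow variant: load one field from _source for every hit.
        body["_source"] = source_field
    return body


def median_latency_ms(send, body, repeats=5):
    """Call send(body) `repeats` times; return the median wall-clock time in ms."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        send(body)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


if __name__ == "__main__":
    query_vec = [0.0] * 100  # placeholder 100-dimensional vector
    fast = build_knn_body(query_vec)
    slow = build_knn_body(query_vec, source_field="required_field")
    # json.dumps is only a stand-in for a real HTTP call to _search.
    print(median_latency_ms(json.dumps, fast))
    print(median_latency_ms(json.dumps, slow))
```

Using the median over several repeats after warmup keeps a single slow request from skewing the comparison between the two variants.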

opendistro-for-elasticsearch / k-NN, issue #338