Closed by akashsha1 1 week ago
Benchmark was run using opensearch-benchmark with the Cohere dataset (768 dimensions). Here are some configuration details for indexing:

```json
{
  "target_index_name": "target_index",
  "target_field_name": "target_field",
  "target_index_body": "indices/faiss-index.json",
  "target_index_primary_shards": 4,
  "target_index_replica_shards": 1,
  "target_index_dimension": 768,
  "target_index_space_type": "innerproduct",
  "target_index_bulk_size": 100,
  "target_index_bulk_index_data_set_format": "hdf5",
  "target_index_bulk_index_data_set_path": "/mnt/nvme1/documents-1m.hdf5",
  "target_index_bulk_indexing_clients": 20,
  "target_index_max_num_segments": 1,
  "hnsw_ef_search": 256,
  "hnsw_ef_construction": 256
}
```
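The referenced `indices/faiss-index.json` is not included in the thread; based on the parameters above, it would resemble the following sketch of an OpenSearch k-NN index body for the Faiss HNSW engine (the `m` value is an assumption, not taken from this run):

```json
{
  "settings": {
    "index": {
      "knn": true,
      "number_of_shards": 4,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "properties": {
      "target_field": {
        "type": "knn_vector",
        "dimension": 768,
        "method": {
          "name": "hnsw",
          "engine": "faiss",
          "space_type": "innerproduct",
          "parameters": {
            "ef_construction": 256,
            "m": 16
          }
        }
      }
    }
  }
}
```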
Here are some configuration details for search:

```json
{
  "target_index_name": "target_index",
  "target_field_name": "target_field",
  "query_k": 100,
  "query_body": {
    "docvalue_fields": ["_id"],
    "stored_fields": "none"
  },
  "query_data_set_format": "hdf5",
  "query_data_set_path": "/mnt/nvme1/queries-1m-100k.hdf5",
  "query_count": 30000,
  "search_clients": 20
}
```
A force merge down to `max_num_segments=1` is executed via the API before the search.
The OpenSearch cluster was deployed with 2 data nodes (r7i.2xlarge), 4 primary shards, and 1 replica. With this setup, AVX512 shows a 15% improvement over AVX2 on indexing and 7% on search, as shown below:
@assanedi Can you also please add other configuration details like the indexing clients, query clients, ef_construction, ef_search, etc.?
```json
"target_index_bulk_indexing_clients": 20,
"target_index_max_num_segments": 10,
"hnsw_ef_search": 256,
"hnsw_ef_construction": 256
```
@assanedi Wasn't `max_num_segments` 1 during the force merge?
Yes, I ran the force merge API; here is the result:

```shell
curl -X POST -k --user admin:admin http://10.0.0.80:9200/_forcemerge?max_num_segments=1
{"_shards":{"total":8,"successful":8,"failed":0}}
```
Yes, but in the configuration you mentioned it as 10 instead of 1 for target_index_max_num_segments
For FP32 we don’t need to make any changes in Faiss, since it relies on compiler auto-vectorization to achieve the optimization with AVX512. For scalar quantization, however, Intel has raised a PR to Faiss which is under review: https://github.com/facebookresearch/faiss/pull/3853
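For intuition, the FP32 kernel being accelerated is just an inner product between a query and a stored vector. This is not Faiss code, only a pure-Python sketch of the math; in Faiss's C++ equivalent, the compiler auto-vectorizes this loop when AVX-512 is targeted (16 float32 lanes per instruction instead of 1):

```python
def inner_product(a, b):
    """Scalar FP32 inner-product kernel. The C++ analogue of this
    loop is what the compiler auto-vectorizes with AVX-512."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

# For a 768-dim vector, AVX-512 processes 16 floats per instruction,
# so the loop body runs ~768/16 = 48 times instead of 768.
q = [0.5] * 768
d = [2.0] * 768
print(inner_product(q, d))  # 768.0
```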
I updated the configuration details
Description
This change adds support to speed up vector search and indexing in Faiss using AVX512 SIMD instructions.
Related Issues
Resolves #2056
Check List
Commits are signed per the DCO using `--signoff`.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following the Developer Certificate of Origin and signing off your commits, please check here.