Open — rjzhb opened this issue 4 weeks ago
Hi! I don't expect the performance to be great for 1M points, at that scale you may not want to use indexing at all. But for clarity, are you running on 1 thread? What kind of machine are you using?
CPU: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz 2.59 GHz Memory: 16.0 GiB OS: Ubuntu 22.04 LTS
I used a single thread. Compared to single-threading, I think multi-threading has lower latency, but QPS will decrease.
And I don't think it's fair to say 1 million points is too small. For HNSW, even at 1 million, the number of distance calculations per query is already several thousand.
Did you use 1 thread in both cases, for hnswlib and usearch? Did you use identical expansion factors for both engines? Sadly, the link seems broken and I can’t see what’s there.
Of course, you can see my code:
#include "base_profiler.h"
#include "helper.h"
#include "index_dense.hpp"
#include "index_plugins.hpp"
#include <fstream>
#include <iostream>
#include <cstdlib>
#include <iomanip>
#include <unordered_set>
static const char *sift1M_base = "/benchmark_dataset/sift1M/sift_base.fvecs";
static const char *sift1M_query = "/benchmark_dataset/sift1M/sift_query.fvecs";
static const char *sift1M_groundtruth = "/benchmark_dataset/sift1M/sift_groundtruth.ivecs";
static const char *deep10M_base = "/benchmark_dataset/deep10M/deep10M_base.fvecs";
static const char *deep10M_query = "/benchmark_dataset/deep10M/deep10M_query.fvecs";
static const char *deep10M_groundtruth = "/benchmark_dataset/deep10M/deep10M_groundtruth.ivecs";
static const char *hnsw_index_l2_name_prefix = "/benchmark_save_index/usearch_index/";
static const char *hnsw_index_l2_name_suffix = "/usearch_index.save";
template<typename T>
void output_recall_top_k(const unsigned top_k, std::vector<std::vector<unum::usearch::index_dense_t::vector_key_t>> &results_pq,
const std::vector<std::unordered_set<T>> &groundtruth_1,
const std::vector<std::unordered_set<T>> &groundtruth_10,
const std::vector<std::unordered_set<T>> &groundtruth_100) {
if (top_k != 100) {
std::cerr << "top_k can only be 100" << std::endl;
exit(-1);
}
size_t gt_cnt_1 = 0, gt_cnt_10 = 0, gt_cnt_100 = 0;
std::vector<std::vector<T>> results(results_pq.size());
for (unsigned i = 0; i < results_pq.size(); i++) {
for (unsigned j = 0; j < results_pq[i].size(); j++) {
results[i].push_back(results_pq[i][j]);
}
}
for (unsigned i = 0; i < results.size(); i++) {
for (size_t cnt = 0; cnt < 100; ++cnt) {
auto id = results[i][cnt];
if (cnt < 1 && groundtruth_1[i].contains(id)) {
++gt_cnt_1;
}
if (cnt < 10 && groundtruth_10[i].contains(id)) {
++gt_cnt_10;
}
if (cnt < 100 && groundtruth_100[i].contains(id)) {
++gt_cnt_100;
}
}
}
std::cout << "R@1: " << std::fixed << std::setprecision(3) << float(gt_cnt_1) / float(results.size() * 1)
<< std::endl;
std::cout << "R@10: " << std::fixed << std::setprecision(3) << float(gt_cnt_10) / float(results.size() * 10)
<< std::endl;
std::cout << "R@100: " << std::fixed << std::setprecision(3) << float(gt_cnt_100) / float(results.size() * 100)
<< std::endl;
}
template<typename T>
std::unique_ptr<T[]> load_data(const std::string &filename, size_t &num,
int &dim) { // load data with sift10K pattern
std::ifstream in(filename, std::ios::binary);
if (!in.is_open()) {
std::cout << "open file error" << std::endl;
exit(-1);
}
in.read((char *)&dim, 4);
in.seekg(0, std::ios::end);
auto ss = in.tellg();
num = ((size_t)ss) / (dim + 1) / 4;
auto data = std::make_unique<T[]>(num * dim);
in.seekg(0, std::ios::beg);
for (size_t i = 0; i < num; i++) {
in.seekg(4, std::ios::cur);
in.read((char *)(data.get() + i * dim), dim * 4);
}
in.close();
return data;
}
int main() {
const char *homeDir = getenv("HOME");
std::string base_file = homeDir;
std::string query_file = homeDir;
std::string groundtruth_file = homeDir;
std::string data_type;
size_t dimension = {};
size_t base_counts = {};
size_t M = 16;
size_t ef_construction = 200;
int ef_search_begin = 100;
int ef_search_end = 800;
int ef_search_step = 100;
// let user choose to test sift1M or deep10M
enum test_type {
invalid = 0, sift1M = 1, deep10M = 2
};
std::cout << "please choose to test sift1M or deep10M, input 1 or 2" << std::endl;
std::cout << "1. sift1M" << std::endl;
std::cout << "2. deep10M" << std::endl;
int choose = 0;
std::cin >> choose;
switch (choose) {
case test_type::sift1M: {
data_type = "sift1M";
dimension = 128;
base_counts = 1'000'000;
base_file += sift1M_base;
query_file += sift1M_query;
groundtruth_file += sift1M_groundtruth;
break;
}
case test_type::deep10M: {
data_type = "deep10M";
dimension = 96;
base_counts = 10'000'000;
base_file += deep10M_base;
query_file += deep10M_query;
groundtruth_file += deep10M_groundtruth;
break;
}
default:std::cout << "invalid input, exit." << std::endl;
return -1;
}
// let user choose M, ef_construction, ef_search
std::cout
<< "please input M, ef_construction, ef_search_begin, ef_search_end, ef_search_step. input 0 to use default values"
<< std::endl;
int M_input = 0, ef_construction_input = 0, ef_search_begin_input = 0, ef_search_end_input = 0,
ef_search_step_input = 0;
std::cin >> M_input >> ef_construction_input >> ef_search_begin_input >> ef_search_end_input
>> ef_search_step_input;
if (M_input > 0) {
M = M_input;
}
if (ef_construction_input > 0) {
ef_construction = ef_construction_input;
}
if (ef_search_begin_input > 0) {
ef_search_begin = ef_search_begin_input;
}
if (ef_search_end_input > 0) {
ef_search_end = ef_search_end_input;
}
if (ef_search_step_input > 0) {
ef_search_step = ef_search_step_input;
}
std::string build_parameter = ".M." + std::to_string(M) + ".ef_construction." + std::to_string(ef_construction);
std::string hnsw_index_l2_name = std::string(homeDir) + std::string(hnsw_index_l2_name_prefix) + data_type +
std::string(hnsw_index_l2_name_suffix) + build_parameter;
unum::usearch::index_dense_config_t config;
config.expansion_add = ef_construction;
config.connectivity_base = M;
unum::usearch::metric_punned_t
metric(dimension, unum::usearch::metric_kind_t::l2sq_k, unum::usearch::scalar_kind_t::f32_k);
unum::usearch::index_dense_t usearch_index = unum::usearch::index_dense_t::make(metric, config);
std::ifstream f(hnsw_index_l2_name);
if (f.good()) {
// Found index file
std::cout << "Found index file ... " << std::endl;
auto result = usearch_index.load(hnsw_index_l2_name.c_str());
if (!result) {
std::cout << "Failed to load index. Error." << "\n";
exit(1);
}
} else {
BaseProfiler profiler;
{
size_t embedding_count;
std::unique_ptr<float[]> input_embeddings;
{
int dim;
BaseProfiler profiler_in;
profiler_in.Begin();
input_embeddings = load_data<float>(base_file, embedding_count, dim);
profiler_in.End();
assert(dimension == dim || !"embedding dimension wrong");
assert(base_counts == embedding_count || !"embedding count wrong");
std::cout << "Load base data: " << profiler_in.ElapsedToString() << std::endl;
//output embedding_count, dimension
std::cout << "embedding_count: " << embedding_count << std::endl;
std::cout << "dimension: " << dim << std::endl;
}
usearch_index.reserve(embedding_count);
profiler.Begin();
// insert data into index
for (size_t idx = 0; idx < embedding_count; ++idx) {
usearch_index.add(idx, input_embeddings.get() + idx * dimension);
if (idx % 10000 == 0) {
std::cout << idx << ", " << get_current_rss() / (1024 * 1024) << " MB, "
<< profiler.ElapsedToString()
<< std::endl;
}
}
profiler.End();
std::cout << "before clearing input vector, memory cost: "
<< get_current_rss() / (1024 * 1024) << " MiB" << std::endl;
}
std::cout << "Insert data cost: " << profiler.ElapsedToString() << " memory cost: "
<< get_current_rss() / (1024 * 1024) << " MiB" << std::endl;
usearch_index.save(hnsw_index_l2_name.c_str());
}
size_t number_of_queries;
std::unique_ptr<float[]> queries;
{
int dim;
BaseProfiler profiler;
profiler.Begin();
queries = load_data<float>(query_file, number_of_queries, dim);
profiler.End();
assert(dimension == dim || !"query does not have same dimension as train set");
std::cout << "Load query data: " << profiler.ElapsedToString() << std::endl;
}
int top_k; // nb of results per query in the GT
std::vector<std::unordered_set<int>> ground_truth_sets_1, ground_truth_sets_10,
ground_truth_sets_100; // number_of_queries * top_k matrix of ground-truth nearest-neighbors
{
// load ground-truth and convert int to long
size_t nq2;
BaseProfiler profiler;
profiler.Begin();
auto gt_int = load_data<int>(groundtruth_file, nq2, top_k);
assert(nq2 == number_of_queries || !"incorrect nb of ground truth entries");
ground_truth_sets_1.resize(number_of_queries);
ground_truth_sets_10.resize(number_of_queries);
ground_truth_sets_100.resize(number_of_queries);
for (size_t i = 0; i < number_of_queries; ++i) {
for (int j = 0; j < top_k; ++j) {
auto gt = gt_int[i * top_k + j];
if (j < 1) {
ground_truth_sets_1[i].insert(gt);
}
if (j < 10) {
ground_truth_sets_10[i].insert(gt);
}
if (j < 100) {
ground_truth_sets_100[i].insert(gt);
}
}
}
profiler.End();
std::cout << "Load ground truth: " << profiler.ElapsedToString() << std::endl;
}
for (int ef = ef_search_begin; ef <= ef_search_end; ef += ef_search_step) {
usearch_index.change_expansion_search(ef);
BaseProfiler profiler;
profiler.Begin();
auto t0 = std::chrono::high_resolution_clock::now();
std::vector<std::vector<unum::usearch::index_dense_t::vector_key_t>> results;
results.resize(number_of_queries);
for (size_t id = 0; id < number_of_queries; ++id) {
auto result = usearch_index.search(queries.get() + id * dimension, top_k);
results[id].resize(top_k);
result.dump_to(results[id].data());
}
auto t_val = std::chrono::duration_cast<std::chrono::duration<double>>(
std::chrono::high_resolution_clock::now() - t0).count();
profiler.End();
std::cout << "\nef_search = " << ef << ", Spend: " << profiler.ElapsedToString() << std::endl;
std::cout << "time = " << std::fixed << std::setprecision(3) << t_val << " s" << std::endl;
std::cout << "QPS = " << std::fixed << std::setprecision(3) << float(number_of_queries) / t_val
<< std::endl;
output_recall_top_k(top_k, results, ground_truth_sets_1, ground_truth_sets_10, ground_truth_sets_100);
}
return 0;
}
And this is the hnswlib code; you will find it is almost the same as the usearch code:
#include "base_profiler.h"
#include "helper.h"
#include "hnswlib/hnswlib.h"
#include <fstream>
#include <iostream>
#include <cstdlib>
static const char *sift1M_base = "/benchmark_dataset/sift1M/sift_base.fvecs";
static const char *sift1M_query = "/benchmark_dataset/sift1M/sift_query.fvecs";
static const char *sift1M_groundtruth = "/benchmark_dataset/sift1M/sift_groundtruth.ivecs";
static const char *deep10M_base = "/benchmark_dataset/deep10M/deep10M_base.fvecs";
static const char *deep10M_query = "/benchmark_dataset/deep10M/deep10M_query.fvecs";
static const char *deep10M_groundtruth = "/benchmark_dataset/deep10M/deep10M_groundtruth.ivecs";
static const char *hnsw_index_l2_name_prefix = "/benchmark_save_index/hnswlib_index/";
static const char *hnsw_index_l2_name_suffix = "/hnswlib_index.save";
template<typename T1, typename T2, typename T3>
void output_recall_top_k(const unsigned top_k, std::vector<std::priority_queue<std::pair<T1, T2>>> &results_pq,
const std::vector<std::unordered_set<T3>> &groundtruth_1,
const std::vector<std::unordered_set<T3>> &groundtruth_10,
const std::vector<std::unordered_set<T3>> &groundtruth_100) {
if (top_k != 100) {
std::cerr << "top_k can only be 100" << std::endl;
exit(-1);
}
size_t gt_cnt_1 = 0, gt_cnt_10 = 0, gt_cnt_100 = 0;
std::vector<std::vector<T3>> results(results_pq.size());
for (unsigned i = 0; i < results_pq.size(); i++) {
while (!results_pq[i].empty()) {
results[i].push_back(results_pq[i].top().second);
results_pq[i].pop();
}
std::reverse(results[i].begin(), results[i].end());
}
for (unsigned i = 0; i < results.size(); i++) {
for (size_t cnt = 0; cnt < 100; ++cnt) {
auto id = results[i][cnt];
if (cnt < 1 && groundtruth_1[i].contains(id)) {
++gt_cnt_1;
}
if (cnt < 10 && groundtruth_10[i].contains(id)) {
++gt_cnt_10;
}
if (cnt < 100 && groundtruth_100[i].contains(id)) {
++gt_cnt_100;
}
}
}
std::cout << "R@1: " << std::fixed << std::setprecision(3) << float(gt_cnt_1) / float(results.size() * 1)
<< std::endl;
std::cout << "R@10: " << std::fixed << std::setprecision(3) << float(gt_cnt_10) / float(results.size() * 10)
<< std::endl;
std::cout << "R@100: " << std::fixed << std::setprecision(3) << float(gt_cnt_100) / float(results.size() * 100)
<< std::endl;
}
template<typename T>
std::unique_ptr<T[]> load_data(const std::string &filename, size_t &num,
int &dim) { // load data with sift10K pattern
std::ifstream in(filename, std::ios::binary);
if (!in.is_open()) {
std::cout << "open file error" << std::endl;
exit(-1);
}
in.read((char *) &dim, 4);
in.seekg(0, std::ios::end);
auto ss = in.tellg();
num = ((size_t) ss) / (dim + 1) / 4;
auto data = std::make_unique<T[]>(num * dim);
in.seekg(0, std::ios::beg);
for (size_t i = 0; i < num; i++) {
in.seekg(4, std::ios::cur);
in.read((char *) (data.get() + i * dim), dim * 4);
}
in.close();
return data;
}
int main() {
const char *homeDir = getenv("HOME");
std::string base_file = homeDir;
std::string query_file = homeDir;
std::string groundtruth_file = homeDir;
std::string data_type;
size_t dimension = {};
size_t base_counts = {};
size_t M = 16;
size_t ef_construction = 200;
int ef_search_begin = 100;
int ef_search_end = 800;
int ef_search_step = 100;
// let user choose to test sift1M or deep10M
enum test_type {
invalid = 0, sift1M = 1, deep10M = 2
};
std::cout << "please choose to test sift1M or deep10M, input 1 or 2" << std::endl;
std::cout << "1. sift1M" << std::endl;
std::cout << "2. deep10M" << std::endl;
int choose = 0;
std::cin >> choose;
switch (choose) {
case test_type::sift1M: {
data_type = "sift1M";
dimension = 128;
base_counts = 1'000'000;
base_file += sift1M_base;
query_file += sift1M_query;
groundtruth_file += sift1M_groundtruth;
break;
}
case test_type::deep10M: {
data_type = "deep10M";
dimension = 96;
base_counts = 10'000'000;
base_file += deep10M_base;
query_file += deep10M_query;
groundtruth_file += deep10M_groundtruth;
break;
}
default:
std::cout << "invalid input, exit." << std::endl;
return -1;
}
// let user choose M, ef_construction, ef_search
std::cout
<< "please input M, ef_construction, ef_search_begin, ef_search_end, ef_search_step. input 0 to use default values"
<< std::endl;
int M_input = 0, ef_construction_input = 0, ef_search_begin_input = 0, ef_search_end_input = 0, ef_search_step_input = 0;
std::cin >> M_input >> ef_construction_input >> ef_search_begin_input >> ef_search_end_input
>> ef_search_step_input;
if (M_input > 0) {
M = M_input;
}
if (ef_construction_input > 0) {
ef_construction = ef_construction_input;
}
if (ef_search_begin_input > 0) {
ef_search_begin = ef_search_begin_input;
}
if (ef_search_end_input > 0) {
ef_search_end = ef_search_end_input;
}
if (ef_search_step_input > 0) {
ef_search_step = ef_search_step_input;
}
std::string build_parameter = ".M." + std::to_string(M) + ".ef_construction." + std::to_string(ef_construction);
std::string hnsw_index_l2_name = std::string(homeDir) + std::string(hnsw_index_l2_name_prefix) + data_type +
std::string(hnsw_index_l2_name_suffix) + build_parameter;
hnswlib::L2Space l2space(dimension);
std::unique_ptr<hnswlib::HierarchicalNSW<float>> hnsw_index;
std::ifstream f(hnsw_index_l2_name);
if (f.good()) {
// Found index file
std::cout << "Found index file ... " << std::endl;
hnsw_index = std::make_unique<hnswlib::HierarchicalNSW<float>>(&l2space, hnsw_index_l2_name);
} else {
BaseProfiler profiler;
{
size_t embedding_count;
std::unique_ptr<float[]> input_embeddings;
{
int dim;
BaseProfiler profiler_in;
profiler_in.Begin();
input_embeddings = load_data<float>(base_file, embedding_count, dim);
profiler_in.End();
assert(dimension == dim || !"embedding dimension wrong");
assert(base_counts == embedding_count || !"embedding count wrong");
std::cout << "Load base data: " << profiler_in.ElapsedToString() << std::endl;
//output embedding_count, dimension
std::cout << "embedding_count: " << embedding_count << std::endl;
std::cout << "dimension: " << dim << std::endl;
}
hnsw_index = std::make_unique<hnswlib::HierarchicalNSW<float>>(&l2space, embedding_count, M,
ef_construction);
profiler.Begin();
// insert data into index
for (size_t idx = 0; idx < embedding_count; ++idx) {
hnsw_index->addPoint(input_embeddings.get() + idx * dimension, idx);
if (idx % 10000 == 0) {
std::cout << idx << ", " << get_current_rss() / (1024 * 1024) << " MB, "
<< profiler.ElapsedToString()
<< std::endl;
}
}
profiler.End();
std::cout << "before clearing input vector, memory cost: "
<< get_current_rss() / (1024 * 1024) << " MiB" << std::endl;
}
std::cout << "Insert data cost: " << profiler.ElapsedToString() << " memory cost: "
<< get_current_rss() / (1024 * 1024) << " MiB" << std::endl;
hnsw_index->saveIndex(hnsw_index_l2_name);
}
size_t number_of_queries;
std::unique_ptr<float[]> queries;
{
int dim;
BaseProfiler profiler;
profiler.Begin();
queries = load_data<float>(query_file, number_of_queries, dim);
profiler.End();
assert(dimension == dim || !"query does not have same dimension as train set");
std::cout << "Load query data: " << profiler.ElapsedToString() << std::endl;
}
int top_k; // nb of results per query in the GT
std::vector<std::unordered_set<int>> ground_truth_sets_1, ground_truth_sets_10, ground_truth_sets_100; // number_of_queries * top_k matrix of ground-truth nearest-neighbors
{
// load ground-truth and convert int to long
size_t nq2;
BaseProfiler profiler;
profiler.Begin();
auto gt_int = load_data<int>(groundtruth_file, nq2, top_k);
assert(nq2 == number_of_queries || !"incorrect nb of ground truth entries");
ground_truth_sets_1.resize(number_of_queries);
ground_truth_sets_10.resize(number_of_queries);
ground_truth_sets_100.resize(number_of_queries);
for (size_t i = 0; i < number_of_queries; ++i) {
for (int j = 0; j < top_k; ++j) {
auto gt = gt_int[i * top_k + j];
if (j < 1) {
ground_truth_sets_1[i].insert(gt);
}
if (j < 10) {
ground_truth_sets_10[i].insert(gt);
}
if (j < 100) {
ground_truth_sets_100[i].insert(gt);
}
}
}
profiler.End();
std::cout << "Load ground truth: " << profiler.ElapsedToString() << std::endl;
}
for (int ef = ef_search_begin; ef <= ef_search_end; ef += ef_search_step) {
hnsw_index->setEf(ef);
BaseProfiler profiler;
profiler.Begin();
auto t0 = std::chrono::high_resolution_clock::now();
std::vector<std::priority_queue<std::pair<float, size_t>>> results;
results.reserve(number_of_queries);
for (size_t id = 0; id < number_of_queries; ++id) {
auto result = hnsw_index->searchKnn(queries.get() + id * dimension, top_k);
results.push_back(std::move(result));
}
auto t_val = std::chrono::duration_cast<std::chrono::duration<double>>(
std::chrono::high_resolution_clock::now() - t0).count();
profiler.End();
std::cout << "\nef_search = " << ef << ", Spend: " << profiler.ElapsedToString() << std::endl;
std::cout << "time = " << std::fixed << std::setprecision(3) << t_val << " s" << std::endl;
std::cout << "QPS = " << std::fixed << std::setprecision(3) << float(number_of_queries) / t_val
<< std::endl;
output_recall_top_k(top_k, results, ground_truth_sets_1, ground_truth_sets_10, ground_truth_sets_100);
}
return 0;
}
So, do you have any suggestions?
Thanks for sharing the code @rjzhb, it's very helpful!
Most of our users are at billion scale and beyond. Most of them don't use f32 and generally run 100+ threads, where concurrency and the cost of synchronization are higher priorities. Would be great if you have feedback or can contribute there!
So we don't actively optimize for the million-scale use case and have a big ongoing release in prep - #402. We can come back to this later, once we're done with the more urgent cases. Let's keep this open for now 🤗
Thanks. It is helpful!
I am planning to test it soon. Do you have any recommended parameter configurations? For example, how many threads should I use, and should I use f32 or something else? Thank you so much.
Generally, during indexing, people use all available threads. For large-scale indexing, machines with 100+ cores are recommended. If it's a new CPU, f16 should be a good option. If your embeddings are trained in a quantization-aware scheme, i8 and b1 are even better 🤗
Thanks! But how can I use multiple threads? I mean, is there any special way to enable them built into USearch?
Here are the docs: https://github.com/unum-cloud/usearch/blob/main/cpp/README.md
I know. But it seems I haven't seen any examples of multi-threaded search, only multi-threaded indexing.
Multi-threaded search looks almost identical. Just try and consider contributing extended docs once you succeed 🤗 You can also check tests and benchmarks for examples.
Do you think it is ok?
for (size_t id = 0; id < number_of_queries; ++id) {
auto result = usearch_index.search(queries.get() + id * dimension, top_k);
}
Yes, that should work. Feel free to add to docs.
Describe the bug
On the SIFT1M test set, USearch is much slower than hnswlib, while there is not much difference between hnswlib and faiss.
Steps to reproduce
benchmark dataset: SIFT1M; language: C++; comparison: USearch vs hnswlib; ef_construction: 200; M: 16; ef_search: 200–800 (step: 100)
result: https://github-production-user-asset-6210df.s3.amazonaws.com/105226542/335069356-ca67b231-ae7c-41ba-919f-5ddea85cde38.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAVCODYLSA53PQK4ZA%2F20240604%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240604T061724Z&X-Amz-Expires=300&X-Amz-Signature=a08e16c9618bb222c8a863ff18b91cae130662095895f9a65eccdc1dd64fb99d&X-Amz-SignedHeaders=host&actor_id=105226542&key_id=0&repo_id=725587775
Expected behavior
USearch should be nearly 10 times faster than hnswlib under the same recall.
I don't know if there are any special configurations; I mainly use the two interfaces shown in the code above (adding points and searching) to build the index and then run the tests.
USearch version
v2.12.0
Operating System
Ubuntu 22.04
Hardware architecture
x86
Which interface are you using?
C++ implementation
Contact Details
rj986215159@outlook.com
Are you open to being tagged as a contributor?
- I am open to being mentioned in the project `.git` history as a contributor
Is there an existing issue for this?
- I have searched the existing issues
Code of Conduct
- I agree to follow this project's Code of Conduct