What is VSAG

VSAG is a vector indexing library for similarity search. Its indexing algorithm lets users search vector sets of various sizes, especially those that cannot fit in memory. The library also provides methods for generating parameters from the vector dimension and data scale, so developers can use it without understanding the principles of the underlying algorithm. VSAG is written in C++ and provides a Python wrapper package called pyvsag. It is developed by the Vector Database Team at Ant Group.
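To make "similarity search" concrete, here is a minimal sketch of an exact top-k search that scans every vector and ranks them by squared L2 distance. It is not VSAG's algorithm or API: the helper name `brute_force_knn` and the flat row-by-row layout are assumptions made for illustration. An index such as HNSW aims to return nearly the same neighbours without visiting every vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper, not part of VSAG: exact top-k search by full scan.
// `vectors` holds num_vectors rows of `dim` floats, stored row by row.
// Returns (squared L2 distance, row index) pairs, nearest first.
std::vector<std::pair<float, int64_t>>
brute_force_knn(const std::vector<float>& vectors,
                int64_t dim,
                const std::vector<float>& query,
                int64_t topk) {
    assert(dim > 0 && static_cast<int64_t>(query.size()) == dim);
    const int64_t num_vectors = static_cast<int64_t>(vectors.size()) / dim;

    // score every stored vector against the query
    std::vector<std::pair<float, int64_t>> scored;
    scored.reserve(num_vectors);
    for (int64_t i = 0; i < num_vectors; ++i) {
        float dist = 0.0f;
        for (int64_t d = 0; d < dim; ++d) {
            const float diff = vectors[i * dim + d] - query[d];
            dist += diff * diff;
        }
        scored.emplace_back(dist, i);
    }

    // keep only the k nearest, ordered by distance
    const int64_t k = std::min<int64_t>(topk, num_vectors);
    std::partial_sort(scored.begin(), scored.begin() + k, scored.end());
    scored.resize(k);
    return scored;
}
```

The cost of this scan grows linearly with the number of vectors, which is the cost a vector index is meant to avoid.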


VSAG provides an optimized HNSW implementation that achieves state-of-the-art (SOTA) performance on the GIST dataset. The ann-benchmarks test was run on an AWS r6i.16xlarge machine with --parallelism 31, a single CPU, and hyperthreading disabled. The result is as follows:


Getting Started

Integrate with CMake

# CMakeLists.txt
cmake_minimum_required(VERSION 3.11)

project (myproject)


# download and compile vsag
include (FetchContent)
FetchContent_Declare (
  vsag
  GIT_REPOSITORY <vsag-repository-url>  # set this to the URL of the VSAG repository
  GIT_TAG master
)
FetchContent_MakeAvailable (vsag)
include_directories (${vsag_SOURCE_DIR}/include)

# compile executable and link to vsag
add_executable (vsag-cmake-example src/main.cpp)
target_link_libraries (vsag-cmake-example PRIVATE vsag)

# add dependency
add_dependencies (vsag-cmake-example vsag)

Try the Example

#include <vsag/vsag.h>

#include <cstdint>
#include <iostream>
#include <random>

int main(int argc, char** argv) {

    int64_t num_vectors = 10000;
    int64_t dim = 128;

    // prepare ids and vectors
    auto ids = new int64_t[num_vectors];
    auto vectors = new float[dim * num_vectors];

    std::mt19937 rng;
    std::uniform_real_distribution<> distrib_real;
    for (int64_t i = 0; i < num_vectors; ++i) {
        ids[i] = i;
    }
    for (int64_t i = 0; i < dim * num_vectors; ++i) {
        vectors[i] = distrib_real(rng);
    }

    // create index
    auto hnsw_build_parameters = R"(
    {
        "dtype": "float32",
        "metric_type": "l2",
        "dim": 128,
        "hnsw": {
            "max_degree": 16,
            "ef_construction": 100
        }
    }
    )";
    auto index = vsag::Factory::CreateIndex("hnsw", hnsw_build_parameters).value();
    auto base = vsag::Dataset::Make();
    // the base dataset only borrows ids and vectors (owner is false); they are freed below
    base->NumElements(num_vectors)->Dim(dim)->Ids(ids)->Float32Vectors(vectors)->Owner(false);
    index->Build(base);

    // prepare a query vector
    // the query dataset releases this memory because it is created with owner set to true
    auto query_vector = new float[dim];
    for (int64_t i = 0; i < dim; ++i) {
        query_vector[i] = distrib_real(rng);
    }

    // search on the index
    auto hnsw_search_parameters = R"(
    {
        "hnsw": {
            "ef_search": 100
        }
    }
    )";
    int64_t topk = 10;
    auto query = vsag::Dataset::Make();
    query->NumElements(1)->Dim(dim)->Float32Vectors(query_vector)->Owner(true);
    auto result = index->KnnSearch(query, topk, hnsw_search_parameters).value();

    // print the results
    std::cout << "results: " << std::endl;
    for (int64_t i = 0; i < result->GetDim(); ++i) {
        std::cout << result->GetIds()[i] << ": " << result->GetDistances()[i] << std::endl;
    }

    // free memory
    delete[] ids;
    delete[] vectors;

    return 0;
}
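The example hardcodes "dim": 128 in the build-parameter JSON while the data dimension lives in a separate variable, so the two can drift apart. As a minimal sketch, the build parameters could instead be assembled from variables. The helper `make_hnsw_build_parameters` below is hypothetical and not part of VSAG; it only reuses the JSON keys shown in the example and assumes the "float32" and "l2" values from it.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper, not part of VSAG: builds the HNSW build-parameter JSON
// from variables so that "dim" always matches the dimension of the data.
std::string
make_hnsw_build_parameters(int64_t dim, int64_t max_degree, int64_t ef_construction) {
    assert(dim > 0 && max_degree > 0 && ef_construction > 0);
    std::ostringstream json;
    json << "{"
         << "\"dtype\": \"float32\", "
         << "\"metric_type\": \"l2\", "
         << "\"dim\": " << dim << ", "
         << "\"hnsw\": {"
         << "\"max_degree\": " << max_degree << ", "
         << "\"ef_construction\": " << ef_construction
         << "}}";
    return json.str();
}
```

Assuming `vsag::Factory::CreateIndex` accepts a `std::string`, as the raw string literal in the example suggests, the result of `make_hnsw_build_parameters(dim, 16, 100)` could be passed as its second argument.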

Developer Guide


# for Debian/Ubuntu
$ ./scripts/deps/

# for CentOS/AliOS
$ ./scripts/deps/

VSAG Build Tool

Usage: make <target>

help:                   ## Show the help.
debug:                  ## Build vsag with debug options.
release:                ## Build vsag with release options.
distribution:           ## Build vsag with distribution options.
fmt:                    ## Format codes.
test:                   ## Build and run unit tests.
asan:                   ## Build with AddressSanitizer option.
test_asan: asan         ## Run unit tests with AddressSanitizer option.
tsan:                   ## Build with ThreadSanitizer option.
test_tsan: tsan         ## Run unit tests with ThreadSanitizer option.
test_cov:               ## Build and run unit tests with code coverage enabled.
clean:                  ## Clear build/ directory.
install:                ## Build and install the release version of vsag.

Project Structure

benchs/: benchmark scripts in Python
cmake/: CMake utility functions
docker/: Dockerfiles for building the development and CI images
examples/: C++ and Python example code
externs/: third-party libraries
include/: exported header files
mockimpl/: a mock implementation that can be used in interface tests
python_bindings/: the Python bindings
scripts/: useful scripts
src/: source code and unit tests
tests/: functional tests


Contribution Guidelines

Contributions are welcome and greatly appreciated. Please read our contribution guidelines for the detailed contribution workflow.
