opensearch-project / k-NN

🆕 Find the k-nearest neighbors (k-NN) for your vector data
https://opensearch.org/docs/latest/search-plugins/knn/index/
Apache License 2.0

[RFC] Disk-based Mode Design #1949

Closed jmazanec15 closed 1 week ago

jmazanec15 commented 1 month ago

Introduction

This document proposes a design for a mode parameter for disk-based features. For engineers reviewing, please read Appendix A: Existing Configuration Concepts in k-NN Plugin for background on the concepts involved in supporting multiple methods in the k-NN plugin. Refer to Appendix B: Terminology for common terminology.

Problem Statement

k-NN search is a vast problem space and users have a diverse set of requirements. For instance, one user may really only care about cost and accuracy, while another has strict search latency requirements. In the current version of the k-NN plugin, users need to tune low-level algorithmic parameters in order to achieve the tradeoffs they want. For example, to create an index now, the mapping may look like:

PUT my-vector-index
{
  "mappings": {
    "properties": {
      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "method": {
            "name": "hnsw",
            "engine": "faiss",
            "space_type": "l2",
            "params": {
              "m": 16,
              "ef_construction": 512,
              "encoder": {
                "name": "sq",
                "params": {
                  "bits": 8
                }
              }
            }
        }
      }
    }
  }
}

This method of fine-tuning creates a complex user experience: a significant amount of effort is required to achieve the desired performance profile.

As we introduce new disk-based features, this experience is only going to get more complex. So, as part of this project, we want to provide an experience that makes it very easy to onboard, while still giving users the flexibility to tune as deeply as they want.

Requirements

Functional

  1. User should be able to provide a minimal indication that they want to use disk-optimized features for their k-NN field at index creation, and the features get enabled with sensible defaults
  2. User should be able to make adjustments to their index configuration to tune for certain performance characteristics. For instance, if a user wants to increase the recall, they should be able to adjust parameters to do so
  3. On upgrade, a user’s index configuration should stay constant (i.e. when upgrading, we should not switch underlying parameters).

Non-functional

  1. Documentation should clearly call out what the default values are for certain workload configurations
  2. Plugin developers should be able to easily and incrementally improve upon default configurations

Out of scope

  1. Compression parameter design
  2. Space type location in the mapping. We will assume it is outside of the method, without loss of generality
  3. Experimentation for final parameter configuration for disk-based mode.

Proposed Solution

We are going to provide a new mapping parameter called mode that will allow users to indicate what kind of performance profile they are looking for. Functionally, mode will control the default values selected for each mapping configuration option, as well as for the search parameter configuration options. Users will be able to override select parameters by specifying them in the mapping. If a configuration is not supported for a given mode, a validation exception will be thrown. Initially, we will introduce 2 modes: in-memory (our current default) and on-disk (our new configuration with the quantization framework and binary field type). In the future, we may decide to introduce more.
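As a rough sketch of how mode-driven defaults could compose with user overrides (all class and parameter names here are hypothetical, not the plugin's actual API), the resolution logic might look like:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of mode-driven default resolution. The default values
// mirror the initial parameter tables proposed later in this RFC.
public class ModeDefaults {
    public enum Mode { IN_MEMORY, ON_DISK }

    /** Returns build parameters for a mode; user-provided overrides win. */
    public static Map<String, Object> resolve(Mode mode, Map<String, Object> userOverrides) {
        Map<String, Object> params = new HashMap<>();
        // Defaults shared by both modes.
        params.put("method", "hnsw");
        params.put("ef_construction", 100);
        params.put("m", 16);
        if (mode == Mode.ON_DISK) {
            params.put("engine", "faiss");
            params.put("quantizer", "bq_32x");
            params.put("rescore_oversample", 2.0f);
        } else {
            params.put("engine", "nmslib");
        }
        params.putAll(userOverrides); // explicit settings override mode defaults
        return params;
    }
}
```

The key design point is that user-specified values always win; the mode only fills in what the user left unspecified.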

From an implementation perspective, we will sharpen our existing interfaces for configuring k-NN indices and add some extensions to support this new functionality. We will establish a policy on upgrades that mode configurations can change between minor versions, but configurations of indices created on an older version will remain constant on upgrade.

High Level Design

API Experience

Index Creation Basic:

PUT my-vector-index
{
  "mappings": {
    "properties": {
      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "space_type": "l2",
        "data_type": "float",
        "mode": "on-disk"
      }
    }
  }
}

Index Creation Fine-tune

PUT my-vector-index
{
  "mappings": {
    "properties": {
      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "space_type": "l2",
        "data_type": "float",
        "mode": "on-disk",
        "compression_level": "16x",
        "method": {
            "params": {
                "ef_construction": 16
            }
        }
      }
    }
  }
}

For fine-tuning, the user can override some parameters. Parameters that are not overridden will be selected based on the mode provided. Notice that inside the method parameter, name and engine are missing. These should be inferred based on mode.

Search - basic

GET my-vector-index/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector_field": {
        "vector": [1.5, 5.5, 1.5, 5.5, 1.5, 5.5, 1.5, 5.5],
        "k": 10
      }
    }
  }
}

On search, because the mode is on-disk, the correct re-score factor will automatically be set.
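Conceptually, the basic search above behaves as if the user had expanded the query with the mode's default rescore settings. Assuming an illustrative oversample factor of 2.0 (the actual default is subject to tuning), the equivalent explicit request would be:

```json
GET my-vector-index/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector_field": {
        "vector": [1.5, 5.5, 1.5, 5.5, 1.5, 5.5, 1.5, 5.5],
        "k": 10,
        "rescore": {
          "oversample_factor": 2.0
        }
      }
    }
  }
}
```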

Search - fine-tune

GET my-vector-index/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector_field": {
        "vector": [1.5, 5.5, 1.5, 5.5, 1.5, 5.5, 1.5, 5.5],
        "k": 10,
        "method_params": {
            "ef_search": 10
        },
        "rescore": {
            "oversample_factor": 10.0
        }
      }
    }
  }
}

Architecture

On indexing, we need to build the IndexBuildContext from the user-provided context, plus defaults selected based on the mode, and pass it to the indexing builder components to properly configure the ANN index structure. There are 3 main components: the engine selector (which library will build the index), the method selector (which algorithm), and the parameter selector (which parameters of the algorithm; the quantization parameters will be plugged in here). For index creation, this is hierarchical, meaning that the engine encapsulates the method, which encapsulates its parameters.
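The hierarchical selection described above could be sketched as follows; all class and method names are illustrative, not the plugin's real components:

```java
// Illustrative sketch of the selector chain: engine choice feeds method
// choice, which feeds parameter choice. Names are hypothetical.
public class IndexBuildContextSketch {
    public record IndexBuildContext(String engine, String method, int efConstruction) {}

    public static String selectEngine(String mode) {
        return "on-disk".equals(mode) ? "faiss" : "nmslib";
    }

    public static String selectMethod(String engine, String mode) {
        // Both engines currently default to HNSW.
        return "hnsw";
    }

    public static int selectEfConstruction(String method, Integer userValue) {
        return userValue != null ? userValue : 100; // mode default
    }

    /** Resolve top-down: engine, then method, then parameters. */
    public static IndexBuildContext build(String mode, Integer userEfConstruction) {
        String engine = selectEngine(mode);
        String method = selectMethod(engine, mode);
        int efc = selectEfConstruction(method, userEfConstruction);
        return new IndexBuildContext(engine, method, efc);
    }
}
```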


On search, a similar workflow will happen. But, at search time, we will take into account the context in which the index was built.


Initial Parameter Configurations

Initially, the plugin will support 2 modes: in-memory (default) and on-disk. in-memory will use our existing defaults:

| Param | Value |
| --- | --- |
| engine | nmslib |
| method | hnsw |
| efc | 100 |
| efs | 100 |
| m | 16 |
| quantizer | none |
| rescore-oversample | none |

on-disk (subject to change with testing) will use:

| Param | Value |
| --- | --- |
| engine | faiss |
| method | hnsw |
| efc | 100 |
| efs | 100 |
| m | 16 |
| quantizer | bq 32x |
| rescore-oversample | 2x |

Training Behavior

For the train API, we will similarly add a mode parameter which will have a similar effect. The behavior will be consistent with the architecture defined above.
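For illustration only (the exact placement of mode in the train API is part of the detailed design, and the model, index, and field names below are placeholders), a train request could look like:

```json
POST /_plugins/_knn/train/my-model
{
  "training_index": "train-index",
  "training_field": "train-field",
  "dimension": 8,
  "mode": "on-disk"
}
```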

Bit/Byte data_type Behavior

For non-float data types (i.e. byte or bit), because we do not support quantization, we will not support on-disk mode. In the future, this may change. However, for now, we will throw validation errors if they are used in tandem.
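A minimal sketch of this validation, with hypothetical names:

```java
// Sketch of the proposed check: on-disk mode requires the float data_type,
// since byte/bit vectors are not quantized. Names are illustrative.
public class ModeValidation {
    public enum Mode { IN_MEMORY, ON_DISK }
    public enum DataType { FLOAT, BYTE, BIT }

    /** Throws if the mode and data_type combination is unsupported. */
    public static void validate(Mode mode, DataType dataType) {
        if (mode == Mode.ON_DISK && dataType != DataType.FLOAT) {
            throw new IllegalArgumentException(
                "Mode [on-disk] is not supported for data_type [" + dataType + "]");
        }
    }
}
```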

Alternatives Considered

Alternative #1: Mutual exclusion between mode and algorithmic parameters

The main alternative we considered is making the mode parameter mutually exclusive with the method and/or compression parameters. In this approach, a user could either fine tune, or select a mode.

Pros

  1. Gives significant room for us to make adjustments to the experience. We do not need to worry about also enabling fine-tuning and how the 2 parameters will coexist

Cons

  1. A major gap between the two experiences, which may appear broken. Realistically, especially in the initial releases, users will continue to want to make adjustments to our defaults. Thus, they will need to understand two disjoint experiences

While this approach would allow us to make changes without worrying about exposing them to the user, it goes against our current experience and would limit users who need an extra level of control.

Low Level Design

Brief Background

In order to configure the index and provide search-specific parameters, the plugin relies on the KNNLibrary interface to provide KNNLibraryIndexingContext and KNNLibrarySearchContext objects, given a KNNMethodContext.^ The respective function signatures are:

KNNLibrary Interface:
    KNNLibraryIndexingContext getKNNLibraryIndexingContext(KNNMethodContext knnMethodContext);

    KNNLibrarySearchContext getKNNLibrarySearchContext(String methodName);

There are similar methods for validation, but we will focus on context object construction for now. On indexing, this method is called to pass the map of parameters for configuring the index as FieldInfo attributes from our mapper to our custom codec code (reference). On search, we parse out the method name for parameter validation in the KNNQueryBuilder and call this method to similarly get the KNNLibrarySearchContext (reference).

Index Configuration

On indexing, we will still pass the configuration details to the custom codec via the FieldInfo attributes. The main change is that we will add a new class to the KNNMethodContext, called ConfigurationContext, that contains non-serialized information about the KNNMethodContext at the time of construction. We already do something similar with the index-created version in the MethodComponentContext, which was originally added to support adjusted defaults (reference).

class KNNMethodContext {
    String name;
    KNNEngine engine;
    MethodComponentContext methodComponentContext;
    ConfigurationContext configurationContext; // new class
}

class ConfigurationContext {
    Version createdVersion;
    Mode mode;
    DataType dataType;
    int dimension;
}

enum Mode {
    IN_MEMORY,
    ON_DISK
}

This new ConfigurationContext class will allow us to add the more complex validation logic and default index configuration vending logic to our existing components to properly build the correct configurations.

The overall calling flow will roughly be the same as the existing configuration. The main difference is that it will need to account for the additional context. We will also need to add a class that can resolve the engine, method and parameters based on this context if not provided.


Search Configuration

On search, we will similarly utilize the new ConfigurationContext in the KNNMethodContext to properly validate and build the KNNLibrarySearchContext. Additionally, for re-scoring parameter validation, we will use the ConfigurationContext to select the correct parameters from the context of the KNNEngine being used. To do this, we will need to enhance our existing structure to always provide the KNNMethodContext, even for legacy or model-based indices. We will also need a search configuration object that can be passed to the engine to set up the configuration, as well as a RescoreContext.

class SearchConfigContext {
    KNNMethodContext methodContext;
    RescoreContext rescoreContext; // optional
}

class RescoreContext {
    boolean enabled;
    Float oversampleFactor; // optional
}


The RescoreParameter resolution will live in the context of the FaissMethod, because it needs to take things such as quantization into account when validating and configuring.

^ Right now, the search functionality takes in a methodName instead of a KNNMethodContext to supply parameters. This will be changed as part of this project.

Backwards Compatibility

From a backwards compatibility perspective, the main concern is around what happens when we want to change the configuration of a mode in a new release. For instance, what happens when we want to switch from the HNSW algorithm to the Vamana algorithm for disk-based indices. For this, we will follow a simple policy: An index’s configuration is tied to the version of the OpenSearch cluster the index was created on; developers are free (with proper testing of course) to update a mode’s configuration between minor version releases. This policy gives a strong balance between providing a smooth upgrade experience while giving developers room to improve the system.
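One way to implement this policy is to key a mode's defaults by the index-created version, so new releases can append new defaults without touching indices created on older versions. The versions and the Vamana entry below are purely illustrative:

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of version-pinned defaults: look up the default that was in effect
// when the index was created. History entries are hypothetical examples.
public class VersionedModeDefaults {
    // Version encoded as major*100 + minor for simplicity of the sketch.
    private static final TreeMap<Integer, String> ON_DISK_METHOD = new TreeMap<>(Map.of(
        217, "hnsw",   // hypothetical: on-disk defaults introduced in 2.17
        300, "vamana"  // hypothetical: a future switch in 3.0
    ));

    /** Returns the on-disk default method for an index created on this version. */
    public static String onDiskMethod(int createdVersion) {
        return ON_DISK_METHOD.floorEntry(createdVersion).getValue();
    }
}
```

Because the lookup floors to the creation version, an index created on 2.17 keeps its original configuration even after the cluster upgrades past a release that changed the mode's defaults.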

Appendix A: Existing Configuration Concepts in k-NN Plugin

KNNLibrary

The KNNLibrary abstraction is what each engine extends in order to supply configurations to be used with the respective libraries during search and/or index building. In the future, the KNNLibrary should encapsulate all engine-specific functionality, but for now it just does the following important things:

  1. Validates the KNNMethodContexts for the engines
  2. Builds the index configuration parameters passed to the libraries to build the index
  3. Builds and similarly validates the search-time parameters.

See the code for more details.

KNNMethod

The KNNMethod abstraction encapsulates the structure of a particular ANN method for an engine. This contains the ANN algorithmic definition as well as the encoders used for quantization.

The implementation of KNNMethod is AbstractKNNMethod. AbstractKNNMethod contains a MethodComponent that contains a name as well as a set of supported Parameters. The Parameter abstraction is configured to have a default value and functionality to validate user provided configurations.
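A simplified sketch of that Parameter abstraction (hypothetical names, not the plugin's actual class):

```java
import java.util.function.Predicate;

// Sketch of a Parameter: a name, a default value, and a validation predicate
// for user-supplied values. Names are illustrative.
public class ParameterSketch<T> {
    private final String name;
    private final T defaultValue;
    private final Predicate<T> validator;

    public ParameterSketch(String name, T defaultValue, Predicate<T> validator) {
        this.name = name;
        this.defaultValue = defaultValue;
        this.validator = validator;
    }

    /** Returns the user value if present and valid, else the default. */
    public T resolve(T userValue) {
        if (userValue == null) {
            return defaultValue;
        }
        if (!validator.test(userValue)) {
            throw new IllegalArgumentException("Invalid value for [" + name + "]: " + userValue);
        }
        return userValue;
    }
}
```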

KNNMethodContext

The KNNMethodContext encapsulates the user-provided information for building the index. It contains 2 main components: (1) the engine to use and (2) the MethodComponentContext. The MethodComponentContext is the name of the algorithm and a map of parameters that are used to configure the method (i.e. hnsw_m=16). One possible parameter type of a MethodComponentContext is itself a MethodComponentContext, so definitions can be recursive.
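A tiny sketch of that recursive structure, using hypothetical names: the hnsw component's parameter map holds an sq encoder component, mirroring the mapping example at the top of this document.

```java
import java.util.Map;

// Illustrative sketch: a method component whose parameter map can contain
// another component, allowing recursive definitions (e.g. hnsw -> sq encoder).
public class ComponentSketch {
    public record MethodComponent(String name, Map<String, Object> params) {}

    public static MethodComponent hnswWithSq() {
        MethodComponent encoder = new MethodComponent("sq", Map.of("bits", 8));
        return new MethodComponent("hnsw", Map.of("m", 16, "encoder", encoder));
    }
}
```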

Appendix B: Terminology

method — refers to the implementation of a specific ANN algorithm. For example, faiss's HNSW is considered a method.