opensearch-project / k-NN

🆕 Find the k-nearest neighbors (k-NN) for your vector data
https://opensearch.org/docs/latest/search-plugins/knn/index/
Apache License 2.0

Refactor method structure and definitions #1920

Closed jmazanec15 closed 1 month ago

jmazanec15 commented 1 month ago

Description

This is a refactor of the KNNMethod definitions. It is pre-work for the interface for #1889. Functionally, nothing is changing. The overall goal is to make it easier to integrate the quantization framework functionality and to make the plugin easier to maintain in the future.

This refactor has 4 components (one commit each):

  1. Remove getMethod from the KNNLibrary interface. The motivation here is that the KNNMethod object is an implementation detail of an engine, so it does not make sense to expose it. The functionality exposed should be about validating configurations/params and/or providing information relevant to building and/or searching the index produced by the KNNLibrary.
  2. Move EngineSpecificMethodContext into KNNMethod. The motivation is that the KNNMethod is meant to encapsulate a library's ANN configuration. This includes the parameters necessary for search.
  3. Change KNNMethod to an interface and implement it once per method we support. This resolves the dreaded configuration complexity we have in the engines by pulling the method definitions out into their own classes (see https://github.com/opensearch-project/k-NN/blob/main/src/main/java/org/opensearch/knn/index/engine/faiss/Faiss.java#L217-L311).
  4. Create a basic interface for Encoders. Similar to making KNNMethod an interface and implementing it per method we support, I did the same with the Encoders. In the future, this will make it a lot easier to add Quantizers from the quantization framework.
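To illustrate the shape that components 3 and 4 describe, here is a minimal sketch of "one interface, one class per method/encoder". Every name in it (`MethodSketch`, `EncoderSketch`, `HnswMethodSketch`, `FlatEncoderSketch`, the `resolve` method, and the default value for `m`) is hypothetical and chosen only for illustration; it is not the code in this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only: none of these names or signatures are taken from the plugin.

/** Assumed shape: one implementation per supported encoder. */
interface EncoderSketch {
    String name();

    /** Validates user-supplied parameters and returns the resolved set. */
    Map<String, Object> resolve(Map<String, Object> userParams);
}

/** Assumed shape: one implementation per supported ANN method. */
interface MethodSketch {
    String name();

    /** Validates user-supplied parameters and returns the resolved set. */
    Map<String, Object> resolve(Map<String, Object> userParams);
}

/** An encoder that accepts no parameters at all. */
final class FlatEncoderSketch implements EncoderSketch {
    @Override
    public String name() {
        return "flat";
    }

    @Override
    public Map<String, Object> resolve(Map<String, Object> userParams) {
        if (!userParams.isEmpty()) {
            throw new IllegalArgumentException("flat encoder takes no parameters");
        }
        return new LinkedHashMap<>();
    }
}

/** A method definition that lives in its own class instead of inside the engine. */
final class HnswMethodSketch implements MethodSketch {
    private static final int DEFAULT_M = 16; // illustrative default, an assumption

    private final EncoderSketch encoder;

    HnswMethodSketch(EncoderSketch encoder) {
        this.encoder = encoder;
    }

    @Override
    public String name() {
        return "hnsw";
    }

    @Override
    public Map<String, Object> resolve(Map<String, Object> userParams) {
        Object m = userParams.getOrDefault("m", DEFAULT_M);
        if (!(m instanceof Integer) || (Integer) m <= 0) {
            throw new IllegalArgumentException("m must be a positive integer");
        }
        Map<String, Object> resolved = new LinkedHashMap<>();
        resolved.put("m", m);
        resolved.put("encoder", encoder.name());
        // This sketch passes no encoder parameters; a fuller version would forward them.
        resolved.put("encoder_params", encoder.resolve(Map.of()));
        return resolved;
    }
}
```

With this layout, an engine would only hold a collection of `MethodSketch` instances, and adding a method or an encoder would mean adding a class rather than extending a large builder chain inside the engine.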

The next step after this is to change KNNLibrary.getMethodAsMap to return a new object that contains both the information to be passed to the library and the map needed to build the index. Specifically, it will return the information needed by the quantization framework to configure the quantization. This will simplify the experience for building more complex objects.
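As a rough sketch of what such a return object could look like, the class below bundles the library parameters with a separate quantization configuration. The class name `IndexBuildContextSketch`, its fields, and its accessors are assumptions made for illustration; the actual type is not defined in this PR.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: the class and member names are assumptions, not the planned API.
final class IndexBuildContextSketch {
    private final Map<String, Object> libraryParameters;
    private final Map<String, Object> quantizationConfig;

    IndexBuildContextSketch(Map<String, Object> libraryParameters, Map<String, Object> quantizationConfig) {
        // Defensive copies so callers cannot mutate the context after construction.
        this.libraryParameters = Collections.unmodifiableMap(new LinkedHashMap<>(libraryParameters));
        this.quantizationConfig = Collections.unmodifiableMap(new LinkedHashMap<>(quantizationConfig));
    }

    /** The map that would be handed to the library to build the index. */
    Map<String, Object> getLibraryParameters() {
        return libraryParameters;
    }

    /** Settings a quantization step would read; empty when no quantization is configured. */
    Map<String, Object> getQuantizationConfig() {
        return quantizationConfig;
    }

    boolean isQuantized() {
        return !quantizationConfig.isEmpty();
    }
}
```

Returning one object instead of a bare map keeps the library-facing parameters and the quantization settings separate while still letting a single call produce both.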

Related Issues

#1889

Check List

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

jmazanec15 commented 1 month ago

Sure, I'll look into it - a class hierarchy will help.