Open jitendra-titaniam opened 1 year ago
@jitendra-titaniam thanks for raising the issue. Would you be able to contribute the fix?
So this is tricky. The problem is that we access the "native" index files by passing the file path directly to the native libraries (i.e. faiss and nmslib). This means we are not accessing the index via the IndexInput abstraction that the directory returns. This becomes a problem when using things like remote stores.
Workarounds are kind of tough:
Not sure what the best route forward is for a long-term solution. Implementing the native engines so that they read from IndexInput would be very challenging, if not impossible. We might be able to wrap this in a more friendly abstraction that will work better with IndexStorePlugin, though.
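For illustration, here is a minimal Java sketch of what reading a graph file through a stream abstraction could look like, instead of handing a file path to native code. It is not the k-NN implementation: java.io.InputStream stands in for Lucene's IndexInput, and ChunkedLoader and the sink callback are hypothetical names, with the sink representing whatever native-side function would consume each buffer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.function.BiConsumer;

// Hypothetical helper: InputStream is only a stand-in for Lucene's IndexInput.
class ChunkedLoader {
    // Reads the stream in chunks of at most chunkSize bytes and hands each
    // chunk to the sink as (buffer, validLength). Returns the total bytes read.
    static long load(InputStream input, int chunkSize, BiConsumer<byte[], Integer> sink)
            throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        int read;
        // read(...) returns -1 at end of stream; a chunk may be shorter than chunkSize.
        while ((read = input.read(buffer, 0, buffer.length)) != -1) {
            sink.accept(buffer, read);
            total += read;
        }
        return total;
    }
}
```

Because the loader only sees a stream, the bytes could come from a local file, a remote store, or an encrypted store without the consumer knowing the difference. The buffer is reused between calls, so a real sink would have to copy or consume each chunk before returning.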
@jitendra-titaniam I have created this GH issue for using IndexInput for graph files. I am hoping it can solve the issue you faced: https://github.com/opensearch-project/k-NN/issues/2033
Will be completed in 2.18 once loading layer changes are completed.
@jitendra-titaniam this is the RFC for the loading layer: https://github.com/opensearch-project/k-NN/issues/2033. Please review it and see if it can solve your issue. I believe it will.
What is the bug? OpenSearch's IndexStorePlugin allows plugin developers to provide a custom implementation of org.apache.lucene.store.Directory. Plugin developers can write their own implementations of the Directory.createOutput(String filename, IOContext context) and Directory.openInput(String filename, IOContext context) methods. The current implementation in KNN80DocValuesConsumer.addKNNBinaryField throws a ClassCastException if any implementation of Directory other than org.apache.lucene.store.FSDirectory is used. Furthermore, it does not use the Directory abstraction to create the output; instead it writes to the index path directly.
How can one reproduce the bug? Steps to reproduce the behavior:
Create a plugin that implements IndexStorePlugin. Implement getDirectoryFactories(). For an index.store.type of titaniam, this method returns a subclass of FsDirectoryFactory that creates a custom implementation of Directory rather than an FSDirectory.
The following exception is shown in opensearch.log:
What is the expected behavior? The k-NN DocValuesConsumer should use the Lucene Directory abstraction to add the binary field, so that OpenSearch plugins implementing IndexStorePlugin can work.
What is your host/environment? Happening on all envs.
Do you have any screenshots? NA
Do you have any additional context?
Knn search on Elasticsearch 8.7.0 works with this plugin that implements IndexStorePlugin.
The following is the failing line. It casts to FSDirectory:
https://github.com/opensearch-project/k-NN/blob/ca5e483e1e70abdc19196e1018b3a7fd06908bf6/src/main/java/org/opensearch/knn/index/codec/KNN80Codec/KNN80DocValuesConsumer.java#L121
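The failure mode can be sketched in a few lines of Java. This is not the code at the link above: DirectoryLike, LocalFsDirectory, CustomStoreDirectory and CastDemo are simplified, hypothetical stand-ins for Directory, FSDirectory, a plugin-provided Directory and the consumer, so the example runs without Lucene on the classpath.

```java
import java.nio.file.Path;

// Simplified stand-ins for illustration only; these are NOT the Lucene classes.
interface DirectoryLike {
    String describe();
}

// Plays the role of a filesystem-backed directory that exposes a local path.
class LocalFsDirectory implements DirectoryLike {
    private final Path path;

    LocalFsDirectory(Path path) {
        this.path = path;
    }

    Path getPath() {
        return path;
    }

    public String describe() {
        return "local:" + path;
    }
}

// Plays the role of a custom directory supplied by a store plugin.
class CustomStoreDirectory implements DirectoryLike {
    public String describe() {
        return "custom";
    }
}

class CastDemo {
    // Mirrors the failing pattern: assume every directory is filesystem-backed.
    static Path resolveIndexPath(DirectoryLike dir, String fileName) {
        // Throws ClassCastException for any implementation other than LocalFsDirectory.
        LocalFsDirectory fs = (LocalFsDirectory) dir;
        return fs.getPath().resolve(fileName);
    }
}
```

Calling CastDemo.resolveIndexPath with a LocalFsDirectory returns a path, while passing a CustomStoreDirectory throws ClassCastException, which is the same shape of failure reported in this issue.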
The following code then goes on to write directly to the Lucene file without going through any of the Directory abstractions:
https://github.com/opensearch-project/k-NN/blob/ca5e483e1e70abdc19196e1018b3a7fd06908bf6/src/main/java/org/opensearch/knn/index/codec/KNN80Codec/KNN80DocValuesConsumer.java#L128 and https://github.com/opensearch-project/k-NN/blob/ca5e483e1e70abdc19196e1018b3a7fd06908bf6/src/main/java/org/opensearch/knn/index/codec/KNN80Codec/KNN80DocValuesConsumer.java#L143
This code has to be changed to write through Directory.
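As a rough sketch of the requested direction, assuming a much simpler interface than Lucene's, writing through the directory could look like the Java below. WritableStore, InMemoryStore and GraphWriter are hypothetical names, and java.io.OutputStream stands in for the IndexOutput that Directory.createOutput returns; none of this is the actual k-NN or Lucene code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a directory abstraction; NOT the Lucene Directory class.
interface WritableStore {
    OutputStream createOutput(String fileName) throws IOException;
}

// Hypothetical non-filesystem store: keeps every file in memory.
class InMemoryStore implements WritableStore {
    final Map<String, ByteArrayOutputStream> files = new HashMap<>();

    public OutputStream createOutput(String fileName) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        files.put(fileName, out);
        return out;
    }
}

class GraphWriter {
    // Writes the serialized graph through the store, never through a file path,
    // so any WritableStore implementation decides where the bytes end up.
    static void writeGraph(WritableStore store, String fileName, byte[] serializedGraph)
            throws IOException {
        try (OutputStream out = store.createOutput(fileName)) {
            out.write(serializedGraph);
        }
    }
}
```

The writer never casts the store or asks it for a path, so a filesystem store and a custom store are interchangeable from its point of view.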