Closed janandreschweiger closed 3 years ago
Hey Jina team,
my team has indexed over 100,000 text documents using the FaissIndexer. Now we would like to test different hyperparameters:
- distance: "l2" vs "inner_product"
- normalize: True vs False
Unfortunately, if we now change these parameters in our yml file, nothing changes. This is probably because the indexer is saved and loaded again. That makes testing different hyperparameters very unpleasant, as we would have to index all documents again for every setup. Indexing all documents takes several hours.
We tried to override the FaissIndexer, but Jina doesn't recognize custom executors that were added after index time:
could not determine a constructor for the tag '!CustomFaissIndexer'
Is there a way to alter these parameters for querying?
Hey @janandreschweiger, thank you again for your valuable feedback.
This is possible using ref_indexer as a composite for a NumpyIndexer and wrapping it with a FaissIndexer or any other Indexer with a different set of parameters.
This currently has some issues, which will be addressed in #1438.
By the way, we are working on a feature to provide hyperparameter optimization, so please stay tuned!
Thanks for your reply @JoanFM! That's cool, especially if one wants to change the parameters in production. Also hyperparameter tuning is important for many applications.
Hey @janandreschweiger.
Now that I think about it, this should already be available to you, since you are not using FaissIndexer from a Docker image.
What you need to do is use a NumpyIndexer as the indexer at index time.
Then, at query time, you can use a FaissIndexer that takes as its ref_indexer a NumpyIndexer with the same parameters as the one used at index time. That is where the FaissIndexer will load the data from.
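To illustrate the idea, here is a minimal pure-Python sketch of the pattern. It is not Jina's API: RawVectorStore and QueryIndex are hypothetical stand-ins for the index-time NumpyIndexer and the query-time FaissIndexer, and the tiny vectors are made up. It only shows that if the raw vectors are stored once, query-side indexes with different distance and normalize settings can be rebuilt from them without re-indexing.

```python
import math


def normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


class RawVectorStore:
    """Hypothetical stand-in for the index-time indexer: stores raw vectors only."""

    def __init__(self):
        self.ids = []
        self.vectors = []

    def add(self, doc_id, vector):
        self.ids.append(doc_id)
        self.vectors.append(list(vector))


class QueryIndex:
    """Hypothetical stand-in for a query-time indexer built from a reference store."""

    def __init__(self, ref_store, distance="l2", normalize_vectors=False):
        if distance not in ("l2", "inner_product"):
            raise ValueError(f"unsupported distance: {distance}")
        self.distance = distance
        self.normalize_vectors = normalize_vectors
        self.ids = list(ref_store.ids)
        # The search parameters are applied here, at query-index build time,
        # so the stored raw vectors never need to be re-indexed.
        self.vectors = [self._prepare(v) for v in ref_store.vectors]

    def _prepare(self, vector):
        return normalize(vector) if self.normalize_vectors else list(vector)

    def query(self, vector, top_k=1):
        q = self._prepare(vector)
        if self.distance == "l2":
            # Smaller squared L2 distance means a better match.
            scores = [sum((a - b) ** 2 for a, b in zip(q, v)) for v in self.vectors]
            order = sorted(range(len(scores)), key=lambda i: scores[i])
        else:
            # Larger inner product means a better match.
            scores = [sum(a * b for a, b in zip(q, v)) for v in self.vectors]
            order = sorted(range(len(scores)), key=lambda i: -scores[i])
        return [self.ids[i] for i in order[:top_k]]


# Index once (the expensive step in a real setup).
store = RawVectorStore()
store.add("doc_a", [1.0, 0.0])
store.add("doc_b", [3.0, 3.0])

# Query many times with different parameters, reusing the same stored data.
query_vector = [1.0, 0.2]
l2_result = QueryIndex(store, distance="l2").query(query_vector)
ip_result = QueryIndex(store, distance="inner_product").query(query_vector)
ip_norm_result = QueryIndex(
    store, distance="inner_product", normalize_vectors=True
).query(query_vector)
```

With this toy data the top match differs between the plain inner-product setup and the other two, which is the kind of comparison the original question is after.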
You can see this feature being used in the faiss search example (it does not currently work there because it uses containers).
It may help you work around the issue in #1438.
Thanks @JoanFM I'll try it out!
Hey @janandreschweiger ,
There is a PR open in the examples that showcases what you are trying to do.
https://github.com/jina-ai/examples/pull/318
It lets you index with one indexer, and then reuse that index data to query with different indexer types and parameters.
I hope you find it useful.