This PR introduces one additional endpoint for training.
To use a trainable faiss indexer (e.g., an IVF-based one):
1) First, run a training flow to train the faiss indexer (of course, users can also write a local training script using the native faiss API):
from jina import Flow
import numpy as np

train_filepath = 'train.npy'
train_data = np.array(np.random.random([10240, 256]), dtype=np.float32)
np.save(train_filepath, train_data)

f = Flow().add(
    uses="jinahub://FaissSearcher",
    timeout_ready=-1,
    uses_with={
        'index_key': 'IVF10_HNSW32,PQ64',
        'trained_index_file': 'faiss.index',
        'on_gpu': False,
    },
)

with f:
    # the trained index will be dumped to "faiss.index"
    f.post(on='/train', parameters={'train_filepath': train_filepath})
2) Then, in the query runtime, we can use FaissSearcher by providing the pre-trained index file produced in step 1.
One limitation to mention is:
Hence, the index operation cannot utilize multi-core CPUs to speed up.