Closed OctoberChang closed 11 months ago
Issue #, if available:
Description of changes: Implement C/C++ PairwiseANN function with Python API.
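Given an (input, label) pair, PairwiseANN finds the top-K nearest queries in the indexed input-to-label graph. The searcher returns four arrays of size K:
- Imat: the top-K input indices.
- Mmat: the masking indicators. 1 indicates that the kNN index is present; otherwise 0.
- Dmat: the kNN distances between the test input and the top-K indexed inputs.
- Vmat: the corresponding values stored in the input-to-label graph.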
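To make the meaning of the four outputs concrete, here is a minimal brute-force sketch in plain Python. It is not the PairwiseANN implementation: the function name pairwise_topk, the dict-based toy graph, the distance (1 minus inner product) and the zero padding of missing neighbours are all assumptions made for illustration.

```python
def pairwise_topk(X, Y, query, label, k):
    """Brute-force stand-in for one (input, label) search.

    X: list of indexed feature vectors (one list per row).
    Y: dict mapping (row, label) -> stored value, a toy input-to-label graph.
    Returns four length-k lists mirroring Imat, Mmat, Dmat and Vmat.
    """
    scored = []
    # Candidates are the indexed inputs connected to `label` in the graph.
    for (row, lab), value in Y.items():
        if lab != label:
            continue
        ip = sum(a * b for a, b in zip(X[row], query))
        # Assumed distance for the "ip" metric: 1 - inner product.
        scored.append((1.0 - ip, row, value))
    scored.sort()  # smallest distance first
    top = scored[:k]
    pad = k - len(top)  # slots with no neighbour are padded and masked out
    imat = [row for _, row, _ in top] + [0] * pad
    mmat = [1] * len(top) + [0] * pad
    dmat = [dist for dist, _, _ in top] + [0.0] * pad
    vmat = [value for _, _, value in top] + [0.0] * pad
    return imat, mmat, dmat, vmat


X = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
Y = {(0, 7): 1.0, (2, 7): 0.5, (1, 3): 2.0}
imat, mmat, dmat, vmat = pairwise_topk(X, Y, query=[1.0, 0.0], label=7, k=3)
```

Only two indexed inputs carry label 7 in this toy graph, so the third slot is padded and its mask entry is 0, which is the situation Mmat is meant to flag.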
Indexing usage

    train_params = PairwiseANN.TrainParams(metric_type="ip")
    model = PairwiseANN.train(X_trn, Y_csr, train_params=train_params)

where
- X_trn is the dense or sparse input feature matrix
- Y_csr is the sparse input-to-label graph
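For readers unfamiliar with the sparse format, the sketch below shows how an input-to-label graph can be laid out as CSR arrays. It assumes Y_csr follows the usual compressed sparse row layout (indptr, indices, data), which the name suggests but the description does not state; the helpers triples_to_csr and labels_of are hypothetical and written in plain Python.

```python
def triples_to_csr(triples, num_rows):
    """Build CSR arrays (indptr, indices, data) from (row, col, value) triples."""
    triples = sorted(triples)  # order by row, then by column
    indptr = [0] * (num_rows + 1)
    for row, _, _ in triples:
        indptr[row + 1] += 1        # count the entries in each row
    for r in range(num_rows):
        indptr[r + 1] += indptr[r]  # prefix sum turns counts into offsets
    indices = [col for _, col, _ in triples]
    data = [val for _, _, val in triples]
    return indptr, indices, data


def labels_of(indptr, indices, data, row):
    """Return the (label, value) pairs stored for one input row."""
    start, end = indptr[row], indptr[row + 1]
    return list(zip(indices[start:end], data[start:end]))


# Three edges: input 0 -> labels 1 and 2, input 2 -> label 0.
edges = [(0, 2, 1.0), (2, 0, 0.5), (0, 1, 3.0)]
indptr, indices, data = triples_to_csr(edges, num_rows=3)
row0 = labels_of(indptr, indices, data, 0)
row1 = labels_of(indptr, indices, data, 1)
```

Each row of the graph corresponds to one row of X_trn, so an input with no labels (row 1 here) simply has an empty slice.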
Save/load Usage

    model.save(model_folder)
    del model
    model = PairwiseANN.load(model_folder)
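Searchers Usage

    searchers = model.searchers_create(max_batch_size=100, max_only_topk=10, num_searcher=1)

where
- max_batch_size defines the maximum number of rows in the returned arrays
- max_only_topk defines the maximum number of columns in the returned arrays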
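Since the two limits bound the shape of the returned arrays, a caller may want to check a request against them before predicting. The helper below is a hypothetical sketch, not part of PairwiseANN, and it assumes that a request larger than either limit is an error rather than something the library truncates.

```python
def check_request(batch_size, topk, max_batch_size=100, max_only_topk=10):
    """Return the result shape if the request fits the limits, else raise."""
    if not 1 <= batch_size <= max_batch_size:
        raise ValueError(
            f"batch_size={batch_size} must be in [1, {max_batch_size}]"
        )
    if not 1 <= topk <= max_only_topk:
        raise ValueError(f"topk={topk} must be in [1, {max_only_topk}]")
    # Each of the four result arrays would then have this shape.
    return (batch_size, topk)


shape = check_request(batch_size=32, topk=5)

try:
    check_request(batch_size=101, topk=5)
    too_large_rejected = False
except ValueError:
    too_large_rejected = True
```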
Prediction Usage

With same input (suitable for real-time inference):

    It, Mt, Dt, Vt = model.predict(query_vec, label_keys, searchers, pred_params=pred_params, is_same_input=True)

where
- query_vec is a single feature vector of shape (1, feat_dim)
- label_keys is a np.array of shape (batch_size, )
With different inputs (suitable for batch prediction):

    It, Mt, Dt, Vt = model.predict(Query_mat, label_keys, searchers, pred_params=pred_params, is_same_input=False)

where
- Query_mat is a batch feature matrix of shape (batch_size, feat_dim)
- label_keys is a np.array of shape (batch_size, )
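The difference between the two prediction modes can be sketched as a pairing rule between query rows and label keys. The function pair_queries below is a hypothetical plain-Python illustration of that rule as read from the shapes above; it does not call PairwiseANN and makes no claim about how predict is implemented.

```python
def pair_queries(queries, label_keys, is_same_input):
    """Return the (query, label) pairs that one predict call would cover."""
    if is_same_input:
        # A single query of shape (1, feat_dim) is reused for every label key.
        if len(queries) != 1:
            raise ValueError("is_same_input=True expects exactly one query row")
        return [(queries[0], label) for label in label_keys]
    # Otherwise row i of the (batch_size, feat_dim) matrix goes with label i.
    if len(queries) != len(label_keys):
        raise ValueError("queries and label_keys must have the same length")
    return list(zip(queries, label_keys))


same = pair_queries([[1.0, 0.0]], [7, 3, 5], is_same_input=True)
diff = pair_queries([[1.0, 0.0], [0.0, 1.0]], [7, 3], is_same_input=False)
```

In both cases the number of pairs equals batch_size, which is why label_keys has shape (batch_size, ) in either mode.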
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.