Closed by donhardman 7 months ago
@tomatolog pls check if you can do it since Ilya is away.
Not quite sure: should `knn_ef` be a new option like `field_weights`, with a list of fields and field weights, or should it be a new argument of the `knn` SphinxQL filter function and a new parameter of the `knn` JSON query property?
> should be the new argument of the knn SphinxQL filter function and the new parameter of the knn JSON query property

This is correct, just a new argument for the `knn()` SQL function and the corresponding JSON query.
NOTE: the new `ef` parameter should be optional.
I checked the code further and see that it is not possible to use a different `ef` per search: it is a member of `HierarchicalNSW`, and changing it via `HierarchicalNSW::setEf` changes the value for all other `HierarchicalNSW::searchKnn` calls.
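To make the problem concrete, here is a minimal sketch using a mock class. `MockIndex` and `efSeenByQueryA` are hypothetical names made up for illustration; this is not the real `HierarchicalNSW` code, only the assumption that `ef` lives on the index object and that the search uses the larger of `ef` and `k`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an HNSW index; NOT the real hnswlib class.
class MockIndex
{
public:
    // Same idea as HierarchicalNSW::setEf: ef is stored on the index itself.
    void setEf ( size_t ef ) { ef_ = ef; }

    // There is no per-call ef, so the search picks up whatever value the
    // shared member holds at call time. Returns the ef actually used.
    size_t searchKnn ( size_t k ) const
    {
        assert ( k > 0 );
        return ef_ > k ? ef_ : k;
    }

private:
    size_t ef_ = 10;
};

// Query A wants ef=200, query B wants ef=50, and B's setEf lands
// before A gets to run its search.
size_t efSeenByQueryA()
{
    MockIndex idx;
    idx.setEf ( 200 );          // query A
    idx.setEf ( 50 );           // query B, interleaved
    return idx.searchKnn ( 5 ); // query A now searches with B's ef
}
```

In this sketch `efSeenByQueryA()` returns 50 rather than the 200 that query A asked for, which is why a setter on the shared index cannot give per-query `ef`.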
There are ways we could fix that:

- patch the `hnsw` library to accept `ef` as an argument of the `HierarchicalNSW::searchKnn` function
- handle `ef` outside the library, like

      kSearch = max ( k, ef );
      searchKnn ( data, kSearch );
      if ( data.size()>k )
          data.resize ( k );

  however, this way the `hnsw` library will fill up and return a large `priority_queue` just to drop most of the data in it, and as `ef` could be in the thousands or millions, that could affect memory fragmentation and performance
- keep a separate index instance per `knn_index,ef` pair; however, that could slow down the first query with a new `ef` and needs management of the instances cache

It seems that patching the library is the best way, as we already use our own fork, manticoresoftware/hnswlib.
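The "outside the library" workaround can be sketched as below. This is only an illustration under assumptions: `librarySearchKnn` is a hypothetical brute-force stand-in for an unpatched `searchKnn ( data, k )` over 1-D points, and `searchWithOuterEf` is a made-up wrapper name. It shows why the approach is wasteful: the library hands back up to `max(k, ef)` hits and the caller throws most of them away.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// (distance, label); std::priority_queue keeps the largest distance on top.
using Hit = std::pair<float, size_t>;
using Hits = std::priority_queue<Hit>;

// Hypothetical brute-force stand-in for a library call that returns up to
// k nearest points as a max-heap. Not the real hnswlib implementation.
Hits librarySearchKnn ( const std::vector<float> & points, float query, size_t k )
{
    Hits heap;
    for ( size_t i = 0; i < points.size(); ++i )
    {
        heap.emplace ( std::fabs ( points[i] - query ), i );
        if ( heap.size() > k )
            heap.pop(); // drop the current farthest hit
    }
    return heap;
}

// Workaround sketch: ask the library for max(k, ef) hits, then trim to k.
// k is kept in its own variable so the trim still has a target to compare to.
Hits searchWithOuterEf ( const std::vector<float> & points, float query, size_t k, size_t ef )
{
    assert ( k > 0 );
    const size_t fetch = std::max ( k, ef );
    Hits heap = librarySearchKnn ( points, query, fetch ); // up to `fetch` hits
    while ( heap.size() > k )
        heap.pop(); // discard the farthest hits: this is the wasted work
    return heap;
}
```

With `ef` in the thousands, every query would allocate and then discard a heap of that size, which is the memory and performance concern raised above; a per-call `ef` inside the library avoids building the oversized result in the first place.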
Patched the hnsw library in our fork https://github.com/manticoresoftware/hnswlib/commit/cf34d3ff8b43c652c6b0c7a468ee0570098f4209, then made a fix in the MCL::knn library https://github.com/manticoresoftware/columnar/commit/080c8c14334daf512734a86950487744dcfdc614, and added an optional `ef` argument to the SphinxQL `knn` function and the corresponding property to the `knn` JSON query https://github.com/manticoresoftware/manticoresearch/commit/74b44de18112f512d6704feb25d7094fa482b69e

The master version of the daemon with the knn library v3 should use `ef`.
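For readers who do not want to open the commits, the semantics of an optional per-query `ef` might look roughly like the sketch below. This is not taken from the actual patch: `candidateListSize` is a hypothetical helper, and treating `0` as "not specified" is an assumption made only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not from the real patch: picks the candidate list
// size for one search call. queryEf == 0 is used here as "not specified".
size_t candidateListSize ( size_t indexEf, size_t k, size_t queryEf = 0 )
{
    assert ( k > 0 );
    const size_t ef = queryEf ? queryEf : indexEf; // per-query value wins when given
    return std::max ( ef, k ); // never search with fewer candidates than k
}
```

Under these assumptions a query without `ef` behaves as before (`candidateListSize ( 10, 5 )` gives 10), while a query that passes `ef` affects only its own search (`candidateListSize ( 10, 5, 2000 )` gives 2000) and leaves the index-wide value untouched.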
We need to expose the `ef` parameter in search queries to make it possible to set it at runtime while querying the database. The original HNSW library supports it (https://github.com/nmslib/hnswlib/blob/master/ALGO_PARAMS.md) and Qdrant also has this parameter exposed as 'hnsw_ef'.