Open mhendrey opened 2 years ago
Yes, right: there is an inconsistency because some search-time parameters are stored (nprobe) and others are not (max_codes, polysemous_ht). I will mark this as an enhancement, so that we can implement it the next time we change the storage format.
Thanks for responding. Would you like me to close the issue or leave it open?
Please leave it open, that's the flow for enhancements.
Summary
I run a hyperparameter sweep for an IVF,PQ index, which gives optimal values for nprobe and ht. I then set those parameters on the index and save it to disk. When I read the index back from disk, nprobe is unchanged, but ht (polysemous_ht) has been reset to 0.
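A possible stopgap, sketched under assumptions: since ht does not survive the round trip, the tuned search-time parameters could be kept in a small sidecar JSON file next to the index and reapplied after loading. The helper names `save_search_params` and `load_search_params` and the `.params.json` suffix are hypothetical and not part of Faiss; the sketch only handles the sidecar file and leaves the Faiss calls to the caller.

```python
import json
from pathlib import Path


def _sidecar_path(index_path):
    # Hypothetical naming convention: "<index file>.params.json".
    return Path(str(index_path) + ".params.json")


def save_search_params(index_path, params):
    """Write search-time parameters (e.g. nprobe, ht) next to the index file."""
    _sidecar_path(index_path).write_text(json.dumps(params))


def load_search_params(index_path):
    """Read the sidecar file and return a 'key=value,key=value' string."""
    params = json.loads(_sidecar_path(index_path).read_text())
    return ",".join(f"{key}={value}" for key, value in params.items())
```

With this in place, one would call `save_search_params("testing.index", {"nprobe": 10, "ht": 32})` right after `faiss.write_index`, and after `faiss.read_index` pass the string returned by `load_search_params("testing.index")` to `faiss.ParameterSpace().set_index_parameters`, the same call used in the reproduction script.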
Platform
OS: Ubuntu 20.04.3 LTS
Faiss version: 1.7.1
Installed from: anaconda, pytorch channel
Running on:
Interface:
Reproduction instructions
import faiss

n = 10000
d = 64
nlist = 100
X = faiss.randn((n, d))

index = faiss.index_factory(d, f"IVF{nlist},PQ8")
print(f"{index.do_polysemous_training=:}")
print(f"{index.by_residual=:}")
index.train(X)
index.add(X)

params = faiss.ParameterSpace()
params.set_index_parameters(index, "nprobe=10,ht=32")
faiss.write_index(index, "testing.index")

index2 = faiss.read_index("testing.index")
assert index.nprobe == index2.nprobe, \
    f"{index.nprobe=:}, {index2.nprobe=:}"
assert index.polysemous_ht == index2.polysemous_ht, \
    f"{index.polysemous_ht=:}, {index2.polysemous_ht=:}"

"""
index.do_polysemous_training=True
index.by_residual=True
Traceback (most recent call last):
  File "faiss_bug.py", line 21, in <module>
    assert index.polysemous_ht == index2.polysemous_ht, \
AssertionError: index.polysemous_ht=32, index2.polysemous_ht=0
"""