matsui528 / nanopq

Pure python implementation of product quantization for nearest neighbor search
MIT License

Updated version to v0.2.0 #31

Closed matsui528 closed 1 year ago

matsui528 commented 1 year ago

As the API has slightly changed, I updated the version from v0.1 to v0.2.

API

v0.1

pq = nanopq.PQ(M=4, Ks=256, verbose=True)
pq.fit(vecs=X, iter=20, seed=123)

opq = nanopq.OPQ(M=4, Ks=256, verbose=True)
opq.fit(vecs=X, parametric_init=False, pq_iter=20, rotation_iter=10, seed=123)

dtable = DistanceTable(dtable=dt)

v0.2

# New option "metric" is added
pq = nanopq.PQ(M=4, Ks=256, metric='l2', verbose=True)
# New option "minit" is added
pq.fit(vecs=X, iter=20, seed=123, minit="points")

# New option "metric" is added
opq = nanopq.OPQ(M=4, Ks=256, metric='l2', verbose=True)
# New option "minit" is added
opq.fit(vecs=X, parametric_init=False, pq_iter=20, rotation_iter=10, seed=123, minit="points")

# New option "metric" is added
dtable = DistanceTable(dtable=dt, metric="l2")

Incompatibility

In v0.2, a PQ instance has a new field, "metric" (i.e., pq.metric). Two PQ instances, pq1 and pq2, are regarded as equal only if all fields, including metric, are identical. A PQ instance created by v0.1 lacks this field, so it cannot be compared with one created by v0.2.

For example, suppose that we run the following code with nanopq==v0.1.11:

import nanopq
import pickle
pq = nanopq.PQ(M=8)

with open('pq.pkl', 'wb') as f:
    pickle.dump(pq, f)

Then we update the library to nanopq==v0.2.0 and run the following:

import nanopq
import pickle
pq = nanopq.PQ(M=8)

with open('pq.pkl', 'rb') as f:
    pq_dumped = pickle.load(f) 

assert pq_dumped == pq

This raises an error:

Traceback (most recent call last):
  File "/XXX/aaa.py", line 13, in <module>
    assert pq_dumped == pq
  File "/YYY/anaconda3/lib/python3.9/site-packages/nanopq/pq.py", line 72, in __eq__
    self.metric,
AttributeError: 'PQ' object has no attribute 'metric'
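
The error arises because unpickling restores an object's saved attributes without calling `__init__`, so an instance saved under v0.1 never receives the new field. Below is a minimal sketch of that mechanism. `TinyPQ` is a hypothetical stand-in class, not nanopq's actual code, and its field-by-field `__eq__` is only an assumption based on the traceback above. The unpickling step is imitated by hand with `__new__` plus a `__dict__` update.

```python
class TinyPQ:
    """Hypothetical stand-in for a quantizer class; not nanopq's real code."""

    def __init__(self, M, metric="l2"):
        self.M = M
        self.metric = metric  # field that only the newer version sets

    def __eq__(self, other):
        if not isinstance(other, TinyPQ):
            return NotImplemented
        # Field-by-field comparison; raises if either side lacks a field
        return (self.M, self.metric) == (other.M, other.metric)


# Imitate loading an object saved by an older version: the instance is
# created without calling __init__, then its saved attributes are restored.
restored = TinyPQ.__new__(TinyPQ)
restored.__dict__.update({"M": 8})  # the old state has no "metric" entry

try:
    restored == TinyPQ(M=8)
    error = None
except AttributeError as exc:
    error = exc

print(type(error).__name__)
```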

If you face such a problem, an easy workaround is to add a "metric" field to the loaded instance.

with open('pq.pkl', 'rb') as f:
    pq_dumped = pickle.load(f) 

pq_dumped.metric = "l2"    # Hacky but works

assert pq_dumped == pq  # Ok!
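
If several old pickles need to be loaded, the same workaround can be wrapped in a small helper. This is only a sketch: `load_pq_with_metric` is a hypothetical name, not part of nanopq, and the default of "l2" simply follows the workaround above. The helper sets the field only when it is missing, so instances pickled with v0.2 keep their own metric.

```python
import pickle


def load_pq_with_metric(path, default_metric="l2"):
    """Load a pickled object and add a "metric" field if it is missing.

    Hypothetical helper for illustration; not part of nanopq.
    """
    with open(path, "rb") as f:
        obj = pickle.load(f)
    # Instances pickled before v0.2 have no "metric" attribute
    if not hasattr(obj, "metric"):
        obj.metric = default_metric
    return obj
```

With this helper, the loading step above would become `pq_dumped = load_pq_with_metric('pq.pkl')`.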