Closed waytehsu closed 3 years ago
The efficient way is to create a symmetric-distance table first and look it up when you want to approximate the distance between two PQ-codes.
Let's say you have 100-dim vectors with M=2 and Ks=256. After training the product quantizer, you'll obtain 2 (50dim 256) codewords.
Here, you can create symmetric-distance tables among codewords, D1 and D2. D1 is a 256x256 matrix for M=1. D1[0][1] is the L2 squared distance between the first and second codewords for M=1. D2[4][10] shows the distance between fifth and eleventh codewords for M=2.
Suppose you want to know the distance between the two PQ-codes, c1=[12, 8] and c2=[99, 3]. You can approximate it by D1[12][99] + D2[8][3]. This operation is fast; just looking up an array and summing.
Currently, I don't plan to implement this functionality because computing a symmetric-distance-table is an additional cost. If you want to know more, please refer to the followings
Not sure if this should be a feature request.
Supposed I just want to approximate distance between two PQ codes (under the same encoder of course). What is the most efficient way to perform such operation?