Closed LiuChaoXD closed 3 years ago
Hello.
What do you mean by hash codes?
If you mean hash codes as combinations of codewords, you can build them by using the binary-encoded indices to look up the specific codewords.
Thanks for your reply. I want to extract each image's binary codes. I read your GPQ paper, and in the subsection "Build Retrieval Database" you write: "After learning the entire GPQ framework, we can build a retrieval database using images in Xu. Given an input image I ∈ X, we first .... This procedure is repeated for all images to store them as binary, and Z is also stored for distance calculation."
For my comparisons, I want to extract the dataset's binary hash codes. Based on these, I can plot retrieval curves such as the PR curve.
I tried to extract the images' hash codes, but failed.
Could you provide an implementation for extracting the images' binary codes?
Oh I see.
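You can find the indexing step (making binary codes) in utils/RetrievalTest.py:

    import tensorflow as tf

    def Indexing(Z, descriptor, numSeg):
        # Split the descriptors and the codebook matrix Z into numSeg segments along the feature axis.
        x = tf.split(descriptor, numSeg, 1)
        y = tf.split(Z, numSeg, 1)
        for i in range(numSeg):
            size_x = tf.shape(x[i])[0]  # number of descriptors
            size_y = tf.shape(y[i])[0]  # number of codewords
            # Tile both tensors to [descriptors, d, codewords] so every descriptor meets every codeword.
            xx = tf.expand_dims(x[i], -1)
            xx = tf.tile(xx, tf.stack([1, 1, size_y]))
            yy = tf.expand_dims(y[i], -1)
            yy = tf.tile(yy, tf.stack([1, 1, size_x]))
            yy = tf.transpose(yy, perm=[2, 1, 0])
            # Inner product over the feature axis gives a [descriptors, codewords] similarity matrix.
            diff = tf.reduce_sum(tf.multiply(xx, yy), 1)
            # Index of the most similar codeword for each descriptor.
            arg = tf.argmax(diff, 1)
            max_idx = tf.reshape(arg, [-1, 1])
            if i == 0:
                quant_idx = max_idx
            else:
                quant_idx = tf.concat([quant_idx, max_idx], axis=1)
        return quant_idx

Here, quant_idx denotes the binary code (the index of the closest codeword in each segment).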
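To make the logic of Indexing easier to follow without TensorFlow, here is a minimal plain-Python sketch of the same per-segment lookup. It is an illustration only, not code from the repository: the name indexing_sketch and the toy values are hypothetical, and it assumes, as in the function above, that each row of Z is one codeword per segment and that "closest" means the largest inner product.

```python
def indexing_sketch(Z, descriptors, num_seg):
    """For each descriptor, return the index of the best-matching codeword per segment.

    Z           : list of M rows, each of length num_seg * d (hypothetical toy layout)
    descriptors : list of rows, each of length num_seg * d
    """
    seg_len = len(Z[0]) // num_seg
    quant_idx = []
    for x in descriptors:
        row = []
        for i in range(num_seg):
            lo, hi = i * seg_len, (i + 1) * seg_len
            # Inner product between this descriptor segment and every codeword segment.
            sims = [sum(a * b for a, b in zip(x[lo:hi], z[lo:hi])) for z in Z]
            # Position of the largest similarity (first one wins on ties).
            row.append(max(range(len(sims)), key=sims.__getitem__))
        quant_idx.append(row)
    return quant_idx


# Toy example: M = 2 codewords, 2 segments of dimension 2.
toy_Z = [[1.0, 0.0, 0.0, 1.0],
         [0.0, 1.0, 1.0, 0.0]]
toy_descriptors = [[0.9, 0.1, 0.7, 0.3],
                   [0.1, 0.9, 0.7, 0.3]]
toy_idx = indexing_sketch(toy_Z, toy_descriptors, 2)
```

Each row of the result holds one codeword index per segment, which is the same layout as quant_idx.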
I get it. Thank you very much.
Sorry to ask the same question again. I want to extract each image's binary codes, and you say "quant_idx denotes binary code". I found that the matrix Z is real-valued. Should I binarize Z, then use the Indexing function to find the closest codeword, and finally generate the binary code by concatenating b̂_R = [b_R1, ..., b_RM]? I am not sure these steps are correct, so I would like to confirm them with you.
I also read the GPQ paper carefully. The "Build Retrieval Database" section says: "After that, formatting a index k of the nearest codeword as binary to generate a sub-binary code bR". What does "formatting a index k" mean?
Hello. The matrix Z collects N codebooks (each containing M codewords of d dimensions), so Z has shape [M, N×d]. The codewords are real-valued vectors. To build a retrieval database with Product Quantization, only the positions (indices) of the codewords are stored. Here, the indices are saved as binary codes (each decimal index written in binary). The binary codes therefore only denote positions and do not themselves contain any discriminative representation.
In the code I posted above, max_idx denotes the position (index) of the closest codeword, so you should convert max_idx to binary to obtain the binary code.
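As a rough illustration of that last step, the sketch below writes each codeword index as a fixed-width bit string and concatenates the segments. The function name indices_to_binary, the parameter num_codewords, and the most-significant-bit-first layout are assumptions made for this example; they are not taken from the repository or the paper.

```python
def indices_to_binary(quant_idx, num_codewords):
    """Turn rows of codeword indices into rows of 0/1 bits.

    quant_idx     : list of rows, one codeword index per segment
    num_codewords : M, the number of codewords per codebook (assumed equal for all segments)
    """
    # Bits needed to address M codewords, e.g. M = 16 -> 4 bits per segment.
    bits = max(1, (num_codewords - 1).bit_length())
    codes = []
    for row in quant_idx:
        code = []
        for k in row:
            # Fixed-width binary string, most significant bit first.
            code.extend(int(b) for b in format(k, "0{}b".format(bits)))
        codes.append(code)
    return codes


# Two images, two segments, M = 16 codewords per codebook.
example_codes = indices_to_binary([[3, 0], [15, 9]], 16)
```

With N segments and M codewords per codebook, each image ends up with N × log2(M) bits under these assumptions. As noted above, the bits only address codewords, so distances are still computed from the codewords stored in Z rather than from the bits themselves.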
OK, thanks very much for your reply. I understand the idea now. Thank you.
My problem has been solved, so I closed the issue. Thanks to the author.
I find that the program computes distances using the table Z, but this is hard to apply to my own experiments, since I want to extract the hash codes for my custom dataset so that I can plot precision-recall curves and so on. Could you add a way to extract hash codes for a dataset?