youngkyunJang / GPQ

Generalized Product Quantization Network For Semi-supervised Image Retrieval - CVPR 2020
MIT License

how to generate the hash codes for custom data? #3

Closed LiuChaoXD closed 3 years ago

LiuChaoXD commented 3 years ago

I see that the program computes distances based on the table Z, but this approach is hard to apply to my own experiments, since I want to extract the hash codes for my custom dataset. Then I can plot precision-recall curves and so on. Could you add a way of extracting hash codes for a given dataset?

youngkyunJang commented 3 years ago

Hello.

What do you mean by hash codes?

If you mean hash codes as a combination of codewords, you can construct them by using the binary-encoded indices to load the corresponding codewords.

LiuChaoXD commented 3 years ago

Thanks for your reply. I want to extract each image's binary codes. I read your GPQ paper; in the subsection "Build Retrieval Database", you write: "After learning the entire GPQ framework, we can build a retrieval database using images in Xu. Given an input image I ∈ X, we first .... This procedure is repeated for all images to store them as binary, and Z is also stored for distance calculation."

For my comparisons, I want to extract the dataset's binary hash codes; based on these, I can plot retrieval curves, e.g., a PR curve.

I tried to extract the images' hash codes, but I failed.

Could you provide an implementation for extracting images' binary codes?

youngkyunJang commented 3 years ago

Oh, I see.

You can find the indexing (making binary codes) in utils/RetrievalTest.py:

import tensorflow as tf

def Indexing(Z, descriptor, numSeg):
    # Split the codebook matrix Z and the descriptors into numSeg segments.
    x = tf.split(descriptor, numSeg, 1)
    y = tf.split(Z, numSeg, 1)
    for i in range(numSeg):
        size_x = tf.shape(x[i])[0]
        size_y = tf.shape(y[i])[0]
        # Tile both tensors so every descriptor segment is paired with
        # every codeword of the corresponding codebook.
        xx = tf.expand_dims(x[i], -1)
        xx = tf.tile(xx, tf.stack([1, 1, size_y]))

        yy = tf.expand_dims(y[i], -1)
        yy = tf.tile(yy, tf.stack([1, 1, size_x]))
        yy = tf.transpose(yy, perm=[2, 1, 0])
        # Inner product between each descriptor segment and each codeword.
        diff = tf.reduce_sum(tf.multiply(xx, yy), 1)

        # Index of the codeword with the largest inner product.
        arg = tf.argmax(diff, 1)
        max_idx = tf.reshape(arg, [-1, 1])

        if i == 0:
            quant_idx = max_idx
        else:
            quant_idx = tf.concat([quant_idx, max_idx], axis=1)
    return quant_idx

Here, quant_idx denotes the binary code (the index of the closest codeword for each segment).
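Not part of the repository; the following is a minimal NumPy sketch of the same segment-wise argmax logic, for readers who want the indexing step without TensorFlow (the function name `indexing_np` is my own):

```python
import numpy as np

def indexing_np(Z, descriptor, num_seg):
    # Z: [M, num_seg * d] codebook matrix (M codewords per segment).
    # descriptor: [N, num_seg * d] feature vectors for N images.
    x = np.split(descriptor, num_seg, axis=1)
    y = np.split(Z, num_seg, axis=1)
    # For each segment, pick the codeword with the largest inner product.
    idx = [np.argmax(xi @ yi.T, axis=1) for xi, yi in zip(x, y)]
    return np.stack(idx, axis=1)  # [N, num_seg] codeword indices
```

For example, with a toy codebook whose codewords are one-hot vectors, a descriptor built from codewords 1 and 3 is indexed back to `[1, 3]`.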

LiuChaoXD commented 3 years ago

I get it. Thank you very much.

LiuChaoXD commented 3 years ago

Sorry to raise the same question again. I want to extract each image's binary codes, and you say "quant_idx denotes binary code". However, I found that the matrix Z is real-valued. Should I binarize the matrix Z and then use the Indexing function to find the closest codeword, finally generating the binary code by concatenating bˆR = [bR1, ..., bRM]? I am not sure whether these steps are correct, so I would like to confirm with you.

I also read the GPQ paper carefully. In the "Build Retrieval Database" section, it says: "After that, formatting a index k of the nearest codeword as binary to generate a sub-binary code bR". What is the meaning of formatting an index k?

youngkyunJang commented 3 years ago

Hello. The matrix Z collects N codebooks (each containing M codewords of dimension d), that is, Z = [M, N×d]. Here, the codewords are real-valued vectors. To build a retrieval database based on Product Quantization, only the positions (indices) of the codewords are stored. These indices are saved as binary codes (decimal values encoded in binary). So the binary codes only denote positions; they do not contain any discriminative representations.

In the code I posted above, max_idx denotes the position (index) of the closest codeword. So you should binarize max_idx to obtain the binary code.
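As a small sketch of that last step, assuming M codewords per codebook so each index needs ceil(log2(M)) bits (the helper `indices_to_binary` is illustrative, not part of the repo):

```python
import numpy as np

def indices_to_binary(quant_idx, num_codewords):
    # quant_idx: [N, num_seg] integer codeword indices per image.
    # Each index is written as a fixed-width binary string of
    # ceil(log2(num_codewords)) bits; segments are concatenated per image.
    bits = int(np.ceil(np.log2(num_codewords)))
    return [''.join(format(int(k), f'0{bits}b') for k in row)
            for row in quant_idx]
```

With M = 4 codewords, indices `[1, 3]` become the 4-bit string `'0111'` (two bits per segment).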

LiuChaoXD commented 3 years ago

OK, thanks very much for your reply. I understand the idea now. Thank you.

LiuChaoXD commented 3 years ago

My problem has been solved, so I am closing the issue. Thanks to the author.