Hi, first thank you for this great repo!

In the code for RRT_SOP, the matcher computes the pairwise similarity between `src_local` and `tgt_local`. So for a batch size of `BS`, the output of `matcher(src_local, tgt_local)` is a tensor of size `BS`. Is there an efficient way to compute the similarity of all pairs in the two batches of images, i.e. leading to an output of size `BS x BS`?

Thanks for any help!
Elias
May I ask what your purpose is in calculating similarity across batches? If you are trying to get the similarity between all queries and the gallery, the author already does that in `recall_at_ks_rerank`:
```python
for i in tqdm(range(top_k)):
    k_scores = []
    # score the queries in chunks of bsize to bound memory usage
    for j in range(0, num_samples, bsize):
        current_query = query_features[j:(j+bsize)]
        # local features of the i-th nearest-neighbour candidate of each query in the chunk
        current_index = gallery_features[cache_nn_inds[j:(j+bsize), i]]
        start = time.time()
        current_scores, _, _ = matcher(None, True,
            src_global=None, src_local=current_query.to(device),
            tgt_global=None, tgt_local=current_index.to(device))
        end = time.time()
        total_time += end - start
        k_scores.append(current_scores.cpu())
    k_scores = torch.cat(k_scores, 0)
    scores.append(k_scores)
print('time', total_time/num_samples)
scores = torch.stack(scores, -1)
```
Hi,
Thank you for your answer. It works indeed.
However, I want to do that during training, and I was wondering if there is a way to do it batch-wise.
Thanks
Hi,
A potential solution would be:
1) reshaping `src_local`: (bsize, fsize, h, w) --> (bsize, **1**, fsize, h, w) --> (bsize, bsize, fsize, h, w) --> (bsize\*bsize, fsize, h, w)
2) reshaping `tgt_local`: (bsize, fsize, h, w) --> (**1**, bsize, fsize, h, w) --> (bsize, bsize, fsize, h, w) --> (bsize\*bsize, fsize, h, w)
3) using the new tensors as the inputs, as in the sketch below.

Please note the difference in bold (the position of the singleton dimension).
In general, this may increase the complexity by a factor of bsize.
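A minimal sketch of that broadcasting in PyTorch; the sizes and dummy tensors are placeholders, and `matcher`/`device` just mirror the call signature from the snippet above, so treat it as untested:

```python
import torch

# placeholder sizes, for illustration only: bsize local feature maps of fsize x h x w each
bsize, fsize, h, w = 4, 128, 7, 7
src_local = torch.randn(bsize, fsize, h, w)
tgt_local = torch.randn(bsize, fsize, h, w)

# 1) repeat every source feature map once per target image:
#    (bsize, fsize, h, w) -> (bsize, 1, ...) -> (bsize, bsize, ...) -> (bsize*bsize, fsize, h, w)
src_all = src_local.unsqueeze(1).expand(bsize, bsize, fsize, h, w).reshape(bsize * bsize, fsize, h, w)

# 2) tile the whole target batch once per source image (unsqueeze on dim 0 instead of dim 1)
tgt_all = tgt_local.unsqueeze(0).expand(bsize, bsize, fsize, h, w).reshape(bsize * bsize, fsize, h, w)

# 3) score all bsize*bsize pairs in one call, then reshape back to a similarity matrix
scores, _, _ = matcher(None, True,
                       src_global=None, src_local=src_all.to(device),
                       tgt_global=None, tgt_local=tgt_all.to(device))
scores = scores.reshape(bsize, bsize)  # scores[i, j] = similarity(src i, tgt j)
```

Since the matcher now sees bsize\*bsize pairs instead of bsize, memory and compute grow by the same factor, which is the cost mentioned above.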
Hi,
Thank you for your response! I have tried this and it works, but it does take a lot of compute. I think I am going to have to hack the forward pass a bit :)
Thanks a lot!