vimar-gu / MSINet

[CVPR2023] Twins Contrastive Search of Multi-Scale Interaction for Object Re-Identification

Could not get accuracy that is mentioned in paper #13

Closed bilal6414 closed 2 months ago

bilal6414 commented 4 months ago

I am using the pre-trained weights to extract embeddings and then computing the distance between them to re-identify images, but I am not getting the results I expected or those reported in the paper. Do I need to train on my own dataset first? Also, could you please review the code I am using to extract the embeddings?

```python
import os
import sys
import torch
import random
import numpy as np
import csv
import matplotlib.pyplot as plt
from PIL import Image
from torchvision import transforms
from torch.backends import cudnn
from reid.utils.logging import Logger
from reid.models.msinet import msinet_x1_0
from reid.utils.serialization import copy_state_dict


def count_parameters(model):
    return np.sum(np.fromiter(
        (np.prod(v.size()) for name, v in model.named_parameters() if 'classifier' not in name),
        dtype=np.float32)) / 1e6


def preprocess_image(image_path, height, width):
    transform = transforms.Compose([
        transforms.Resize((height, width)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    image = Image.open(image_path).convert('RGB')
    image = transform(image)
    image = image.unsqueeze(0)  # Add batch dimension
    return image


def extract_embedding(model, image_tensor):
    model.eval()
    with torch.no_grad():
        embedding = model(image_tensor.cuda())
    return embedding.cpu().numpy()


def euclidean_distance(embedding1, embedding2):
    return np.linalg.norm(embedding1 - embedding2)


class Args:
    def __init__(self):
        # data
        self.source_dataset = 'market1501'
        self.target_dataset = 'none'
        self.batch_size = 64
        self.test_batch_size = 128
        self.workers = 4
        self.height = 256
        self.width = 256
        self.num_instance = 4
        # model
        self.arch = 'resnet50'
        self.pretrained = False
        self.reset_params = False
        self.genotypes = 'msmt'
        # loss
        self.margin = 0.3
        self.sam_mode = 'none'
        self.sam_ratio = 2.0
        # optimizer
        self.optim = 'sgd'
        self.lr = 0.065
        self.weight_decay = 5e-4
        self.momentum = 0.9
        self.milestones = [150, 225, 300]
        self.warmup_step = 10
        # training configs
        self.resume = ''
        self.evaluate = False
        self.epochs = 350
        self.seed = 0
        self.print_freq = 100
        self.eval_interval = 40
        # misc
        self.data_dir = './data'
        self.logs_dir = './logs'
        self.pretrain_dir = './pretrained'


def main():
    args = Args()
    seed = args.seed
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    cudnn.deterministic = True
    cudnn.benchmark = True
    sys.stdout = Logger(os.path.join(args.logs_dir, 'log.txt'))
    print('Running with:\n{}'.format(args))
    num_classes = 751  # Set number of classes according to your dataset
    model = msinet_x1_0(args, num_classes)
    print('Model Params: {}'.format(count_parameters(model)))
    model = model.cuda()

    pretrained_weights = os.path.join(args.pretrain_dir, 'msinet_msmt.pth.tar')
    if os.path.isfile(pretrained_weights):
        checkpoint = torch.load(pretrained_weights)
        copy_state_dict(checkpoint['state_dict'], model)
    else:
        print(f"No pretrained weights found at {pretrained_weights}")

    # Ask user for input image
    input_image_path = "./bb/0031_c1s1_002576_04.jpg"
    # Load input image and extract its embedding
    input_image_tensor = preprocess_image(input_image_path, args.height, args.width)
    input_embedding = extract_embedding(model, input_image_tensor)
    # Directory containing images
    image_directory = './bb'  # Change this to your image directory
    # Load other images and extract embeddings
    image_paths = [os.path.join(image_directory, f) for f in os.listdir(image_directory) if f.endswith('.jpg')]
    embeddings = []
    for image_path in image_paths:
        image_tensor = preprocess_image(image_path, args.height, args.width)
        embedding = extract_embedding(model, image_tensor)
        embeddings.append((embedding, image_path))
    # Calculate Euclidean distances between input embedding and other embeddings
    distances = [(i, euclidean_distance(input_embedding, embedding[0])) for i, embedding in enumerate(embeddings)]
    # Sort distances
    distances.sort(key=lambda x: x[1])
    # Display top 20 images
    fig, axes = plt.subplots(5, 4, figsize=(15, 15))
    for i in range(5):
        for j in range(4):
            if i == 0 and j == 0:
                # Display input image
                img = Image.open(input_image_path).convert('RGB')
                axes[i, j].imshow(img)
                axes[i, j].set_title("Input Image")
            else:
                # Display other images
                idx = i * 4 + j - 1
                if idx < len(distances):
                    img = Image.open(image_paths[distances[idx][0]]).convert('RGB')
                    axes[i, j].imshow(img)
                    axes[i, j].set_title(f"Rank {idx + 1}\nDistance: {distances[idx][1]:.2f}")
                axes[i, j].axis('off')
    plt.tight_layout()
    plt.show()


if __name__ == '__main__':
    main()
```

vimar-gu commented 4 months ago

Thanks for your interest. The provided weights are pre-trained on ImageNet, so yes, you do need to fine-tune the model on your own dataset. For a custom dataset, you can create a new Python file modeled on the existing ones in the reid/data folder; the training script in this repo can then be used directly. Let me know if you run into any other problems.

bilal6414 commented 4 months ago

Thank you so much for your prompt response. I will certainly train it on a custom dataset. Could you share weights trained on the VehicleID dataset? I want to test on images of vehicles and bikers.

vimar-gu commented 4 months ago

I'm sorry, but I currently don't have any trained models for Re-ID datasets. If you want to use the model for vehicle and biker re-identification, it is better to construct a dataset that resembles the actual scenarios. Although MSINet improves generalization by a large margin, direct cross-domain performance is still relatively poor. Another work of mine on continual Re-ID improves generalization with a color distribution shuffle operation, which might also be useful for you. Please refer to https://github.com/vimar-gu/ColorPromptReID/blob/57ed2ac17c5239542a426818051cb588defa4b42/reid/trainers.py#L41

bilal6414 commented 3 months ago

Thank you so much! I have retrained the model on my own dataset, which has 851 classes. The pictures are of bikers, and I am currently using MSINet. These are the results:

Computing DistMat with euclidean distance
Validation Results - Epoch[349]
mAP: 82.4%
CMC curve, Rank-1 :75.0%
CMC curve, Rank-5 :100.0%
CMC curve, Rank-10 :100.0%

I just want to run inference to match embeddings. Could you let me know how, or share a simple script that produces embeddings and the Euclidean distance between two embeddings? I am currently trying, but I get several errors, such as a size mismatch, which I think is caused by a mismatch in the number of classes.
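
If the size mismatch does come from the classifier head (the script earlier in this thread builds the model with 751 classes, while the retrained model has 851), one option is to load only the checkpoint entries whose shapes match the model. The thread does not confirm that this is the cause, so the following is only a sketch. The helper `filter_matching_entries` and the parameter names `backbone.weight` and `classifier.weight` are hypothetical, and NumPy arrays stand in for tensors so the example runs without a checkpoint.

```python
import numpy as np


def filter_matching_entries(checkpoint_state, model_state):
    """Keep checkpoint entries whose name and shape both match the model."""
    kept, skipped = {}, []
    for name, value in checkpoint_state.items():
        if name in model_state and tuple(value.shape) == tuple(model_state[name].shape):
            kept[name] = value
        else:
            skipped.append(name)
    return kept, skipped


# Stand-in arrays with hypothetical names; with PyTorch these would be
# the tensors stored in a state_dict.
checkpoint_state = {
    "backbone.weight": np.zeros((8, 4)),
    "classifier.weight": np.zeros((851, 8)),  # head trained with 851 classes
}
model_state = {
    "backbone.weight": np.zeros((8, 4)),
    "classifier.weight": np.zeros((751, 8)),  # head built with 751 classes
}

kept, skipped = filter_matching_entries(checkpoint_state, model_state)
print("loaded:", sorted(kept))
print("skipped:", skipped)
```

With PyTorch, `model.state_dict()` would play the role of `model_state`, and the filtered dictionary could then be passed to `model.load_state_dict(kept, strict=False)`. A simpler alternative, if it applies, is to construct the model with the same number of classes that was used for training.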

vimar-gu commented 3 months ago

You can refer to the code in reid/utils/metrics.py: https://github.com/vimar-gu/MSINet/blob/2a8845b6b3d1a3b8baeb864b92f9423c2dc711ee/reid/utils/metrics.py#L131-L136 There, the distance between groups of features is calculated. The distance calculation does not depend on the classes; the features should have the shape [sample_number]*[feature_len]. The matrix operation is indeed a bit tricky. Try figuring it out by experimenting with different operations :)
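
As a rough illustration of the matrix operation described above, the sketch below computes all pairwise Euclidean distances between two feature groups of shape [sample_number]*[feature_len]. It is written in NumPy and is not copied from reid/utils/metrics.py; the function name `pairwise_euclidean` and the toy features are assumptions made for the example.

```python
import numpy as np


def pairwise_euclidean(query_feats, gallery_feats):
    """Return an [m, n] matrix of Euclidean distances between two feature groups."""
    q = np.asarray(query_feats, dtype=np.float64)
    g = np.asarray(gallery_feats, dtype=np.float64)
    # ||q - g||^2 = ||q||^2 + ||g||^2 - 2 * q.g, evaluated for all pairs at once.
    q_sq = np.sum(q ** 2, axis=1, keepdims=True)      # shape [m, 1]
    g_sq = np.sum(g ** 2, axis=1, keepdims=True).T    # shape [1, n]
    sq_dist = q_sq + g_sq - 2.0 * (q @ g.T)           # shape [m, n]
    # Rounding can push tiny values slightly below zero; clamp before the root.
    return np.sqrt(np.maximum(sq_dist, 0.0))


# Toy features: 2 query samples and 3 gallery samples, feature_len = 4.
query = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0]])
gallery = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 0.0, 3.0, 4.0],
                    [0.0, 1.0, 0.0, 0.0]])

distmat = pairwise_euclidean(query, gallery)
# Gallery indices sorted from nearest to farthest, one row per query.
ranking = np.argsort(distmat, axis=1)
```

Embeddings extracted one image at a time, each of shape [1, feature_len], can be stacked with `np.concatenate(list_of_embeddings, axis=0)` before being passed in. Nothing in the computation refers to the number of classes, which matches the point made in the reply above.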