I have trained the OSNet_1x model on the Market dataset and successfully converted it to an ONNX file with a batch size of one. That model works as expected: when I generate embeddings for two images of the same person, the Euclidean distance confirms the match (e.g., a distance of 15.36). However, when I export the model with a larger batch size, such as 14 or 100, and run it on the same set of images, the results are very different: the distance increases drastically (e.g., to 247.5), which suggests the embeddings are wrong.
I’ve tested this with multiple image sets, and in every case the single-batch ONNX model produces correct results, with small distances for matching images, while the larger-batch ONNX model returns what look like random embeddings, with large distances regardless of whether the images match.
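For context, the distance check itself is simple: I run preprocessed crops through the exported model with onnxruntime and take the Euclidean distance between two embeddings of the same person. Roughly along these lines (a minimal sketch; the file names, the preprocessing, and the random placeholder arrays are illustrative, not my exact pipeline):

```
import numpy as np
import onnxruntime as ort

def embed(onnx_file, batch):
    """batch: float32 array (N, 3, 256, 128); N=1 for the single-batch export."""
    sess = ort.InferenceSession(onnx_file)
    return sess.run(['output'], {'input': batch})[0]

# stand-ins for two preprocessed crops (in the real test these are two crops of
# the same person, resized to 256x128 and normalised with ImageNet mean/std)
img_a = np.random.rand(1, 3, 256, 128).astype(np.float32)
img_b = np.random.rand(1, 3, 256, 128).astype(np.float32)

# with the single-batch export I feed the crops one at a time
emb_a = embed('osnet_1x_trained_single.onnx', img_a)[0]
emb_b = embed('osnet_1x_trained_single.onnx', img_b)[0]
print(np.linalg.norm(emb_a - emb_b))    # small (~15) on the real matching crops

# with the larger-batch export I stack the same crops into one batch
emb = embed('osnet_1x_trained_100b_new.onnx', np.concatenate([img_a, img_b]))
print(np.linalg.norm(emb[0] - emb[1]))  # ~250 on the same crops with this export
```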
Following is the code I used to convert the model to ONNX:
```
from torchreid import models
import torch
import torch.onnx
from torchreid.utils import (
    check_isfile, load_pretrained_weights, compute_model_complexity
)

# build the model and load my trained checkpoint
OSNet = models.build_model('osnet_x1_0', 1, pretrained="model.pth.tar-60", loss='softmax')
load_pretrained_weights(OSNet, "model.pth.tar-60")
OSNet.eval()
OSNet.load_state_dict(torch.load("model.pth.tar-60"))

# make the batch dimension of the input and output dynamic
dynamic_axes = {'input': {0: 'batch'}, 'output': {0: 'batch'}}

onnx_path = 'osnet_1x_trained_100b_new.onnx'
torch.onnx.export(
    OSNet,                           # model being run
    torch.randn((14, 3, 256, 128)),  # model input (or a tuple for multiple inputs)
    onnx_path,                       # where to save the model (can be a file or file-like object)
    export_params=True,              # store the trained parameter weights inside the model file
    opset_version=12,                # the ONNX opset version to export the model to
    do_constant_folding=True,        # whether to execute constant folding for optimization
    input_names=['input'],           # the model's input names
    output_names=['output'],         # the model's output names
    dynamic_axes=dynamic_axes,
)
```
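One sanity check that can be run right after the export is to compare the ONNX graph's output with the PyTorch model's output on the same dummy batch, which separates a problem with the export itself from a problem in my pre/post-processing. A minimal sketch, assuming onnxruntime is installed and reusing OSNet and onnx_path from the conversion code above (the tolerances are only indicative):

```
import numpy as np
import onnxruntime as ort

# same shape as the dummy input used for tracing
dummy = torch.randn((14, 3, 256, 128))

# reference embeddings from the eval-mode PyTorch model
with torch.no_grad():
    ref = OSNet(dummy).cpu().numpy()

# embeddings from the exported graph
sess = ort.InferenceSession(onnx_path)
out = sess.run(['output'], {'input': dummy.numpy()})[0]

print(np.abs(out - ref).max())
# large deviations here would point at the export itself rather than the
# image preprocessing or the distance computation
np.testing.assert_allclose(out, ref, rtol=1e-3, atol=1e-4)
```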