facebookresearch / detr

End-to-End Object Detection with Transformers

Is padding being done to convert all image tensors to the same size before passing them through the model? #525

Closed D10752002 closed 1 year ago

D10752002 commented 1 year ago

❓ How to do something using DETR

I'm trying to train DETR on my custom dataset. When I print the dimensions of the nested_tensor before it is passed through the backbone, it looks like the images are not being padded to one fixed size: the tensors come out with varying heights and widths (probably because of transforms like random_resize). Is the code intentionally written this way?
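For reference, this is my rough understanding of the collate step (a minimal sketch, not the exact code in util/misc.py; the function name and details are simplified): each batch seems to be padded only up to the largest height/width within that batch, together with a mask marking the padded pixels.

```python
import torch

def pad_to_batch_max(images):
    # Sketch (assumed, simplified from what nested_tensor_from_tensor_list in
    # util/misc.py appears to do): pad every image in the batch up to the
    # largest height/width found in *this* batch, and build a boolean mask
    # that is True on the padded pixels.
    max_h = max(img.shape[1] for img in images)
    max_w = max(img.shape[2] for img in images)

    batch = torch.zeros(len(images), images[0].shape[0], max_h, max_w)
    mask = torch.ones(len(images), max_h, max_w, dtype=torch.bool)
    for i, img in enumerate(images):
        c, h, w = img.shape
        batch[i, :c, :h, :w] = img
        mask[i, :h, :w] = False  # real (non-padded) pixels
    return batch, mask

# Two differently sized images (placeholder sizes) end up in one padded tensor:
imgs = [torch.rand(3, 800, 1066), torch.rand(3, 750, 1333)]
padded, mask = pad_to_batch_max(imgs)
print(padded.size())  # torch.Size([2, 3, 800, 1333])
print(mask.size())    # torch.Size([2, 800, 1333])
```

If something like this is what happens, the padded size would depend only on the largest image in each batch after random_resize, which would explain why the sizes I see change from batch to batch.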

I printed tensor_list.tensors.size() to get the dimensions before the nested tensor is passed to the ResNet backbone, and src.size() to get the dimensions after the ResNet, just before the tensor is passed to the transformer.

The batch size is 2, the number of input channels is 3, and the number of output channels after the ResNet-50 backbone is 2048, as expected.
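Roughly what I would expect the two prints to show, assuming the standard ResNet-50 overall stride of 32 (the 800x1333 size below is just a placeholder):

```python
import math
import torch

# Placeholder padded batch, matching what tensor_list.tensors.size() might print:
tensors = torch.rand(2, 3, 800, 1333)
print(tensors.size())                   # torch.Size([2, 3, 800, 1333])

# After the ResNet-50 backbone (stride 32), src.size() should be roughly:
h, w = math.ceil(800 / 32), math.ceil(1333 / 32)
print((2, 2048, h, w))                  # (2, 2048, 25, 42)
```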

The output is as follows: [screenshot of the printed tensor sizes]

Are these seemingly random tensor dimensions intentional?