pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

FasterRCNN Bounding box error #2219

Closed: chkda closed this 4 years ago

chkda commented 4 years ago

🐛 Bug

I am using the Faster R-CNN model from torchvision.models.detection. The problem is that the values of the target bounding boxes keep getting altered. The only operation being performed is a forward pass in which the loss is evaluated.

To Reproduce

Steps to reproduce the behavior:

import torch
import torchvision
from torch import optim
from torch.utils.data import DataLoader

class CustomDataset(torch.utils.data.Dataset):

    def __init__(self,xtr,ytr):

        self.xtr = xtr
        self.ytr = ytr

    def __getitem__(self,idx):

        img = self.xtr[idx]
        tar = self.ytr[idx]

        return img,tar

    def __len__(self):

        return len(self.xtr)

def collate_fn(batch):
    return list(zip(*batch))

def from_numpy_to_tensor(images,labels_list):

    images = torch.from_numpy(images).cuda()
    for label in labels_list:
        label["boxes"] = torch.from_numpy(label["boxes"]).cuda()
        label["labels"] = torch.from_numpy(label["labels"]).cuda()

    return images,labels_list

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False,num_classes=4)
model.to(device)
optimizer = optim.Adam(model.parameters(),lr=0.000001)

## trying with one image only
x_train = images[0:1]   
y_train = labels[0:1]

## Converts numpy to TorchCudaFloat 
x_train,y_train = from_numpy_to_tensor(x_train,y_train)

## DataLoader object 
dataset = CustomDataset(x_train,y_train)
dataloader = DataLoader(dataset,batch_size=1,collate_fn=collate_fn)

## Iterations
for i in range(20):
    print("Iter No:",i)
    for xtr,ytr in dataloader:
        ytr = list(ytr)
        print(ytr)
        output = model(xtr,ytr)

This is what my labels look like at the start of the first iteration:

[{'boxes': tensor([[ 311.9933, 1013.7640,  719.6339, 1142.7417],
        [ 308.1646,  928.4176,  739.4443,  961.6580],
        [ 308.7562,  830.7968,  740.0359,  864.0373],
        [ 305.8657,  680.8315,  763.0424,  708.1243],
        [ 300.3259,  439.0691,  790.4506,  523.2395],
        [ 306.6932,  248.2031,  741.7596,  458.5648]], device='cuda:0'), 'labels': tensor([4, 3, 3, 3, 2, 1], device='cuda:0')}]

By the end of the 20th iteration, the target has become something like this:

[{'boxes': tensor([[117.7318, 383.4572, 271.5563, 432.2433],
        [116.2870, 351.1749, 279.0319, 363.7481],
        [116.5102, 314.2498, 279.2552, 326.8230],
        [115.4195, 257.5252, 287.9367, 267.8488],
        [113.3290, 166.0784, 298.2794, 197.9159],
        [115.7318,  93.8831, 279.9056, 173.4527]], device='cuda:0'), 'labels': tensor([4, 3, 3, 3, 2, 1], device='cuda:0')}]

As you can see, the target boxes are altered by a large margin in just 20 iterations, while the labels stay the same.
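
A quick arithmetic check on the two printouts may help narrow this down. The sketch below only uses the first box from each printout; the number of rescalings (19) is an assumption, based on the fact that print(ytr) runs before the forward pass in each iteration. If every coordinate shrinks by nearly the same ratio, the change looks like a uniform rescale applied repeatedly rather than random corruption, although the report itself does not confirm where such a rescale would happen.

```python
# Values copied from the two printouts in this report (first box only).
start = [311.9933, 1013.7640, 719.6339, 1142.7417]
end = [117.7318, 383.4572, 271.5563, 432.2433]

# Assumption: the target was rescaled once per forward pass, and the second
# printout was taken before the 20th pass, i.e. after 19 rescalings.
num_rescalings = 19

# Overall shrink ratio per coordinate, and the per-pass factor it implies.
total_ratios = [e / s for s, e in zip(start, end)]
per_pass = [r ** (1.0 / num_rescalings) for r in total_ratios]

for name, total, step in zip(["x1", "y1", "x2", "y2"], total_ratios, per_pass):
    print(f"{name}: total ratio {total:.4f}, implied per-pass factor {step:.4f}")
```

Under that assumption, every coordinate implies a per-pass factor close to 0.95, and the two x coordinates share one ratio while the two y coordinates share another.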

There is a workaround for the problem. The following code snippet works fine. It doesn't alter the bounding box values.

for i in range(20):
    print("Iter No:",i)
    for xtr,ytr in dataloader:
        y_tr = [{k:v for k,v in t.items()} for t in ytr]
        output = model(xtr,y_tr)
        print(ytr)
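
A possible reason the workaround helps, sketched in plain Python with lists standing in for tensors: the dict comprehension makes a shallow copy, so any callee that assigns a new value to a key only changes the copy. The functions rebind_boxes and mutate_boxes are hypothetical stand-ins, not torchvision code. The sketch also shows the limit of a shallow copy: it would not protect against a callee that edits the shared value in place, in which case a deep copy (or, for tensors, cloning each value) would be needed.

```python
import copy

def rebind_boxes(target, scale):
    # Hypothetical callee that assigns a new object to target["boxes"].
    target["boxes"] = [[c * scale for c in box] for box in target["boxes"]]

def mutate_boxes(target, scale):
    # Hypothetical callee that edits the existing boxes object in place.
    for box in target["boxes"]:
        for i in range(len(box)):
            box[i] *= scale

expected = [[100.0, 200.0, 300.0, 400.0]]
original = {"boxes": [[100.0, 200.0, 300.0, 400.0]], "labels": [1]}

# Shallow copy, as in the workaround: protects against rebinding a key.
shallow = {k: v for k, v in original.items()}
rebind_boxes(shallow, 0.5)
survived_rebind = original["boxes"] == expected

# A shallow copy still shares the values, so in-place edits leak through.
shallow = {k: v for k, v in original.items()}
mutate_boxes(shallow, 0.5)
survived_mutation_shallow = original["boxes"] == expected

# A deep copy protects against both kinds of change.
original = {"boxes": [[100.0, 200.0, 300.0, 400.0]], "labels": [1]}
deep = copy.deepcopy(original)
mutate_boxes(deep, 0.5)
survived_mutation_deep = original["boxes"] == expected

print(survived_rebind, survived_mutation_shallow, survived_mutation_deep)
# True False True
```

Since the shallow copy is enough here, the alteration presumably comes from a key being reassigned on the target dict rather than from an in-place tensor operation, but that is an inference from the workaround, not something stated in the report.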

For the images variable, you can use a batch of one image of size (1280,842,3); the labels value is the one shown above.

Expected behavior

The values of the target boxes should remain unaltered after a forward pass.
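
This expectation can be turned into a small check. The sketch below is plain Python with lists standing in for tensors; changed_keys, mutating_step, and well_behaved_step are hypothetical names introduced for illustration, with the two step functions standing in for a forward pass. With real tensors, the `!=` comparison would need to be replaced by something like `not torch.equal(before, after)`.

```python
import copy

def changed_keys(step, targets):
    """Run step(targets) and return (index, key) pairs whose values changed."""
    snapshot = copy.deepcopy(targets)
    step(targets)
    return [
        (i, key)
        for i, (before, after) in enumerate(zip(snapshot, targets))
        for key in before
        if before[key] != after.get(key)
    ]

def mutating_step(targets):
    # Hypothetical step that writes rescaled boxes back into the caller's dicts.
    for t in targets:
        t["boxes"] = [[c * 0.5 for c in box] for box in t["boxes"]]

def well_behaved_step(targets):
    # Hypothetical step that returns rescaled boxes without touching its input.
    return [{"boxes": [[c * 0.5 for c in box] for box in t["boxes"]]} for t in targets]

targets = [{"boxes": [[100.0, 200.0, 300.0, 400.0]], "labels": [1]}]
print(changed_keys(well_behaved_step, targets))  # []
print(changed_keys(mutating_step, targets))      # [(0, 'boxes')]
```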

Environment

PyTorch version: 1.5.0
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: Could not collect

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration: 
GPU 0: Tesla K80
GPU 1: Tesla K80
GPU 2: Tesla K80
GPU 3: Tesla K80
GPU 4: Tesla K80
GPU 5: Tesla K80
GPU 6: Tesla K80
GPU 7: Tesla K80

Nvidia driver version: 418.87.01
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5

Versions of relevant libraries:
[pip] numpy==1.18.1
[pip] torch==1.5.0
[pip] torchvision==0.6.0a0+82fd1c8
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               10.1.243             h6bb024c_0  
[conda] mkl                       2020.0                      166  
[conda] mkl-service               2.3.0            py37he904b0f_0  
[conda] mkl_fft                   1.0.15           py37ha843d7b_0  
[conda] mkl_random                1.1.0            py37hd6b4f25_0  
[conda] numpy                     1.18.1           py37h4f9e942_0  
[conda] numpy-base                1.18.1           py37hde5b4d6_1  
[conda] pytorch                   1.5.0           py3.7_cuda10.1.243_cudnn7.6.3_0    pytorch
[conda] torchvision               0.6.0                py37_cu101    pytorch

fmassa commented 4 years ago

Thanks for the bug report and detailed description! This should be fixed with https://github.com/pytorch/vision/pull/2227