pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Error when an input image of a specific size is fed to the detection module #959

Closed: uestc7d closed this issue 5 years ago

uestc7d commented 5 years ago

Environment

Ubuntu 16.04
Python 2.7
PyTorch 1.1.0
TorchVision 0.3.0

How to reproduce

Upgrade torchvision to 0.3.0, then run the following code with Python 2.7:

import torchvision
import torch

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()
x = [torch.rand(3, 900, 1000)]
predictions = model(x)

This produces the following error:

Traceback (most recent call last):
  File "detection-test.py", line 9, in <module>
    predictions = model(x)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torchvision/models/detection/generalized_rcnn.py", line 47, in forward
    images, targets = self.transform(images, targets)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torchvision/models/detection/transform.py", line 45, in forward
    images = self.batch_images(images)
  File "/usr/local/lib/python2.7/dist-packages/torchvision/models/detection/transform.py", line 110, in batch_images
    pad_img[: img.shape[0], : img.shape[1], : img.shape[2]].copy_(img)
RuntimeError: The expanded size of the tensor (864) must match the existing size (888) at non-singleton dimension 2.  Target sizes: [3, 800, 864].  Tensor sizes: [3, 800, 888]

Navifra-Kerry commented 5 years ago

The model has a minimum and maximum input size range, and images are resized to fit it. Check those settings; they can be set when training the model.
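
For reference, here is a minimal sketch of how such a size range would map the 900x1000 input from the report to the 800x888 tensor shown in the traceback. It assumes the torchvision defaults of min_size=800 and max_size=1333 and that scaled sizes are truncated; `resized_shape` is a hypothetical helper written for illustration, not a torchvision function.

```python
from fractions import Fraction


def resized_shape(height, width, min_size=800, max_size=1333):
    """Hypothetical helper: estimate the (height, width) after resizing.

    The shorter side is scaled to min_size unless that would push the
    longer side past max_size, in which case the longer side is scaled
    to max_size instead. Exact fractions avoid floating-point surprises.
    """
    scale = Fraction(min_size, min(height, width))
    if max(height, width) * scale > max_size:
        scale = Fraction(max_size, max(height, width))
    # int() truncates, which is assumed to match the resize behavior
    return int(height * scale), int(width * scale)


# The input from the report: torch.rand(3, 900, 1000)
print(resized_shape(900, 1000))  # (800, 888)
```

Under these assumptions the resize step itself gives the 800x888 shape from the error message, so the mismatch with 864 would come from a later step.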

fmassa commented 5 years ago

Fixed via https://github.com/pytorch/vision/pull/960
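
The sizes in the error message are consistent with the padded width being rounded down, rather than up, to a multiple of 32 when images are batched. The sketch below only reproduces that arithmetic. The stride of 32 and the round-down explanation are assumptions inferred from the numbers in the traceback, not a description of what the PR changes, and both helper names are hypothetical.

```python
def round_down_to_multiple(size, stride=32):
    """Hypothetical helper: largest multiple of stride that is <= size."""
    return (size // stride) * stride


def round_up_to_multiple(size, stride=32):
    """Hypothetical helper: smallest multiple of stride that is >= size."""
    return ((size + stride - 1) // stride) * stride


resized_width = 888  # width of the resized image in the traceback

# Rounding down gives a padded buffer narrower than the image,
# which matches the target size 864 in the error message.
print(round_down_to_multiple(resized_width))  # 864

# Rounding up gives a buffer wide enough to hold the image.
print(round_up_to_multiple(resized_width))  # 896
```

A buffer of width 864 cannot hold an image of width 888, which is the shape mismatch that `copy_` reports.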