pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Error while using RandomResizedCropVideo #1385

Open ekosman opened 5 years ago

ekosman commented 5 years ago

I started using the new video transformations by downloading their source code directly (they don't seem to be included when upgrading via pip).

The transformation I'm trying to use is:

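from torchvision import transforms
from utils import transforms_video  # local copy of the downloaded source; import path assumed from the traceback

def build_transforms():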
    mean = [0.485, 0.456, 0.406]
    std = [0.229, 0.224, 0.225]
    res = transforms.Compose([transforms_video.ToTensorVideo(),
                              transforms_video.RandomResizedCropVideo(224),
                              transforms_video.RandomHorizontalFlipVideo(),
                              transforms_video.NormalizeVideo(mean=mean, std=std)
                              ])

    return res

This composition raises an error when transforming a video clip (it doesn't happen if I remove transforms_video.RandomResizedCropVideo(224)):

TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/ekosman/anaconda3/envs/torch/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/ekosman/anaconda3/envs/torch/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ekosman/anaconda3/envs/torch/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/workdisk/action_start_detection/loaders/torch_data_loader_no_bg.py", line 284, in __getitem__
    video = self.transform_clip(video)
  File "/workdisk/action_start_detection/loaders/torch_data_loader_no_bg.py", line 157, in transform_clip
    clip = self.transform(clip)
  File "/home/ekosman/anaconda3/envs/torch/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 61, in __call__
    img = t(img)
  File "/workdisk/action_start_detection/utils/transforms_video.py", line 78, in __call__
    i, j, h, w = self.get_params(clip, self.scale, self.ratio)
  File "/home/ekosman/anaconda3/envs/torch/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 638, in get_params
    area = img.size[0] * img.size[1]
TypeError: 'builtin_function_or_method' object is not subscriptable
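
The last two frames of the traceback suggest the cause: the image version of get_params reads img.size[0], which works for a PIL image (where size is a (width, height) tuple) but not for a tensor clip, where size is a method. The following is a minimal sketch of a size helper that accepts both; spatial_size, FakeImage and FakeTensor are hypothetical names used only for illustration, and the (C, T, H, W) clip layout is an assumption.

```python
def spatial_size(clip):
    """Return (height, width) for a PIL-like image or a tensor-like clip."""
    size = clip.size
    if callable(size):
        # Tensor-like: size() returns the full shape; H and W are assumed
        # to be the last two dimensions, e.g. (C, T, H, W).
        shape = size()
        return shape[-2], shape[-1]
    # PIL-like: size is a (width, height) tuple.
    width, height = size
    return height, width


class FakeImage:
    """Stand-in for a PIL image (hypothetical, for illustration only)."""
    size = (320, 240)


class FakeTensor:
    """Stand-in for a tensor whose size is a method (hypothetical)."""

    def __init__(self, *shape):
        self._shape = tuple(shape)

    def size(self):
        return self._shape


clip = FakeTensor(3, 16, 240, 320)
try:
    clip.size[0]  # same pattern as img.size[0] in the traceback
except TypeError as exc:
    print("subscripting a method fails:", exc)

print(spatial_size(FakeImage()))
print(spatial_size(clip))
```

With a helper like this, the crop parameters can be computed from the clip's height and width instead of from img.size.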

ekosman commented 5 years ago

OK, I ended up implementing my own RandomResizedCropVideo transform and now it works. I'll try to look for the problem tomorrow.
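
The custom transform itself is not shown in the thread. As a rough sketch of what such a workaround might look like, the crop parameters can be sampled from an explicit height and width rather than from img.size. The function name sample_crop_params and the default scale and ratio ranges are assumptions, and the logic only approximates the usual random-resized-crop sampling; it is not the poster's code.

```python
import math
import random


def sample_crop_params(height, width, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3)):
    """Sample (top, left, crop_height, crop_width) for an H x W frame."""
    area = height * width
    log_ratio = (math.log(ratio[0]), math.log(ratio[1]))

    for _ in range(10):
        # Pick a target area and a log-uniform aspect ratio (width / height).
        target_area = area * random.uniform(*scale)
        aspect = math.exp(random.uniform(*log_ratio))
        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))
        if 0 < w <= width and 0 < h <= height:
            top = random.randint(0, height - h)
            left = random.randint(0, width - w)
            return top, left, h, w

    # Fallback: a centered crop whose aspect ratio stays within the bounds.
    in_ratio = width / height
    if in_ratio < ratio[0]:
        w = width
        h = int(round(w / ratio[0]))
    elif in_ratio > ratio[1]:
        h = height
        w = int(round(h * ratio[1]))
    else:
        h, w = height, width
    return (height - h) // 2, (width - w) // 2, h, w


random.seed(0)
print(sample_crop_params(240, 320))
```

Sampling the parameters once per clip and applying the same window to every frame, for example by slicing the last two dimensions as clip[..., i:i + h, j:j + w] before resizing, keeps the crop consistent across time.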

fmassa commented 5 years ago

Note that RandomResizedCropVideo is not yet officially supported, and there will be BC-breaking changes with it before the next release.

ekosman commented 5 years ago

OK, thanks for the reply. BTW, I think more transformations should be added (e.g. RandomRotation, Affine, etc.). Another cool idea for a video transformation would be slow motion (or speed-up). However, I think it would be too complicated with the current input.

fmassa commented 5 years ago

@ekosman more spatial transformations (like rotation, affine, etc.) will be added. Temporal transforms (e.g. slow motion) have not yet been considered, but they might deserve a separate issue for discussion.
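
To make the temporal idea concrete, here is one possible sketch of a speed change done purely by selecting frame indices (nearest-frame repetition or skipping, with no interpolation). The name speed_change_indices is hypothetical and this is not an existing torchvision transform.

```python
def speed_change_indices(num_frames, speed):
    """Frame indices that play a clip `speed` times faster.

    speed > 1 skips frames (speed-up); speed < 1 repeats frames (slow motion).
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    out_len = max(1, int(round(num_frames / speed)))
    # Clamp so that rounding never indexes past the last frame.
    return [min(num_frames - 1, int(k * speed)) for k in range(out_len)]


print(speed_change_indices(8, 2.0))  # [0, 2, 4, 6]
print(speed_change_indices(4, 0.5))  # [0, 0, 1, 1, 2, 2, 3, 3]
```

The resulting list could then be used to index the time dimension of a clip. Note that the output length differs from the input length, so clips would need to be trimmed or padded before batching.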