RaivoKoot / Video-Dataset-Loading-Pytorch

Generic PyTorch dataset implementation to load and augment VIDEOS for deep learning training loops.
BSD 2-Clause "Simplified" License

Using a dataset with different widths and heights for each frame #18

Closed by omarinio 2 years ago

omarinio commented 2 years ago

Hello,

My dataset is pre-processed: for each frame of a video, a detection is cropped out. Because the bounding boxes differ, the resulting frames have slightly different sizes. As a result, I get an error when using the ImglistToTensor() function:

'RuntimeError: stack expects each tensor to be equal size, but got [3, 105, 111] at entry 0 and [3, 109, 115] at entry 1'

I tried resizing them all before calling this function, but then I get a type error:

'TypeError: img should be PIL Image. Got <class 'list'>'

I'm unsure whether there is anything I can do to still use this custom dataset, as it is just what I need for my project.
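Editor's note: the second error message suggests that a single-image resize transform was handed the whole list of frames at once. One possible workaround is a small wrapper that applies a per-image function to every frame in the list, placed before ImglistToTensor() in the transform pipeline. This is only a sketch: `ApplyToEach`, `resize_to_square`, and the 112x112 target size are hypothetical names and values, not part of the repository, and Pillow is assumed to be installed.

```python
from PIL import Image


class ApplyToEach:
    """Hypothetical helper: applies a single-image transform to each frame in a list."""

    def __init__(self, transform):
        self.transform = transform

    def __call__(self, img_list):
        # Return a new list so the caller's list is left untouched.
        return [self.transform(img) for img in img_list]


def resize_to_square(img, side=112):
    # PIL sizes are (width, height); 112 is an arbitrary illustrative value.
    return img.resize((side, side))


# Two dummy frames with the sizes from the error message (width x height).
frames = [Image.new("RGB", (111, 105)), Image.new("RGB", (115, 109))]
same_size = ApplyToEach(resize_to_square)(frames)
sizes = [frame.size for frame in same_size]
```

After this step every frame has the same dimensions, so stacking the converted tensors should no longer fail on a size mismatch.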

RaivoKoot commented 2 years ago

No problem. Just before the line linked below, you probably want to resize the image before it is appended to the return list. The image object on that line should be a PIL Image, so any function that can resize PIL Images should work. https://github.com/RaivoKoot/Video-Dataset-Loading-Pytorch/blob/d0903e07f527e3c8642d10c5ba6fe8b42e92f0da/video_dataset.py#L235
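Editor's note: a rough sketch of what that suggestion could look like, resizing each frame inside the loading loop before it is appended. The loop below is a stand-in and not the repository's actual code; `load_frame`, `collect_frames`, and `FRAME_SIZE` are hypothetical, and the dummy loader only generates blank images of varying sizes to mimic cropped detections.

```python
from PIL import Image

FRAME_SIZE = (112, 112)  # (width, height); an arbitrary illustrative target


def load_frame(index):
    """Hypothetical stand-in for reading one cropped frame from disk."""
    # Vary the size per frame to mimic bounding boxes of different sizes.
    return Image.new("RGB", (111 + 2 * index, 105 + 2 * index))


def collect_frames(num_frames, size=FRAME_SIZE):
    images = []
    for index in range(num_frames):
        image = load_frame(index)
        # Resize here, before appending, so every frame in the
        # returned list ends up with identical dimensions.
        image = image.resize(size)
        images.append(image)
    return images


frames = collect_frames(4)
unique_sizes = {frame.size for frame in frames}
```

Note that a plain resize stretches crops whose aspect ratios differ from the target; whether that matters depends on the model and the data.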

omarinio commented 2 years ago

Thank you, everything seems to work now!