pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License
16.06k stars 6.93k forks

toy video datasets for generative models #2676

Open saeed1262 opened 4 years ago

saeed1262 commented 4 years ago

🚀 Feature

Add toy video datasets for generative models, such as Moving MNIST and Robonet.

Motivation

The currently available video datasets are suited to recognition rather than synthesis. It would be nice to add simple toy video datasets such as Moving MNIST.

cc @pmeier

pmeier commented 4 years ago

Hi @saeed1262, we are currently defining guidelines for adding datasets (https://github.com/pytorch/vision/pull/2663). Right now, the only metric we use is the number of citations.

Based on this, I think Robonet is not yet mature enough to be added (it was released in late 2019). Moving MNIST, on the other hand, seems suitable. @fmassa?

saeed1262 commented 4 years ago

Hi @pmeier, BAIR (https://sites.google.com/view/sna-visual-mpc/) would be another option.

vfdev-5 commented 4 years ago

@saeed1262 thanks for the feature request! After discussing this with @fmassa, we would like to add the Moving MNIST dataset. If you would like to help us and send a PR, please discuss it with @pmeier.

Concerning the Robonet and BAIR datasets, they seem to be very specific to robotics and probably not yet mature (using the number of citations as the criterion). I wonder whether those datasets could be opened with existing generic dataset classes like ImageFolder?
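
As a rough illustration of the ImageFolder question: ImageFolder treats each sub-folder as a class and yields single images, so a video stored as frames would still need its frames grouped into ordered clips. The sketch below uses only the standard library and assumes a hypothetical `root/<video_name>/<frame>.png` layout with zero-padded frame names. That layout, the helper names `group_frames_by_video` and `make_demo_tree`, and the placeholder files are all assumptions for illustration, not the actual distribution format of Robonet or BAIR.

```python
import tempfile
from collections import defaultdict
from pathlib import Path

# File suffixes treated as frames (an assumption for this sketch).
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def group_frames_by_video(root):
    """Map each sub-folder name under `root` to its sorted list of frame paths."""
    clips = defaultdict(list)
    # Sorting the paths keeps frames in order as long as names are zero-padded.
    for path in sorted(Path(root).glob("*/*")):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            clips[path.parent.name].append(path)
    return dict(clips)


def make_demo_tree(root, num_videos=2, num_frames=3):
    """Create empty placeholder frame files in the hypothetical layout."""
    for v in range(num_videos):
        video_dir = Path(root) / f"video_{v:03d}"
        video_dir.mkdir(parents=True)
        for t in range(num_frames):
            (video_dir / f"frame_{t:04d}.png").touch()


with tempfile.TemporaryDirectory() as tmp:
    make_demo_tree(tmp)
    clips = group_frames_by_video(tmp)
    # Keep only file names so the summary outlives the temporary directory.
    summary = {name: [p.name for p in frames] for name, frames in clips.items()}
    print(summary)
```

With a generic image-folder class, the sub-folder index would play the role of the video identifier, and a grouping step like this one would turn the per-frame samples into per-video clips.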

saeed1262 commented 4 years ago

Thanks, @vfdev-5 and @pmeier.

I have not worked with the BAIR dataset yet. Apparently it has been used for generative and predictive methods (such as this one: https://arxiv.org/abs/1903.01434). But I think Moving MNIST would be a safe place to start with such datasets.