pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Standardisation of Dataset API split argument name #1067

Open RJT1990 opened 5 years ago

RJT1990 commented 5 years ago

I've noticed some naming inconsistencies across the torchvision datasets when it comes to specifying how to split the dataset (train/val/test). We currently have:

The rest are unspecified - but you can effectively choose the split in them by choosing the root folder (e.g. for COCO).

Is there a reason for different naming conventions for each? If not, is there a case for standardising the argument name to one of the above so it's consistent?
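To make the inconsistency concrete, here is a small hypothetical adapter that maps a single unified split name onto each dataset's native constructor keyword. The parameter names reflect current torchvision constructors (MNIST and CIFAR take a boolean train, SVHN and STL10 take split, the VOC datasets take image_set); the helper itself is illustrative and not part of torchvision.

```python
# Hypothetical sketch: translate a unified split="train"/"test"/"val"
# into the keyword each torchvision dataset actually expects.
SPLIT_KWARG = {
    "MNIST": lambda split: {"train": split == "train"},       # boolean flag
    "CIFAR10": lambda split: {"train": split == "train"},     # boolean flag
    "SVHN": lambda split: {"split": split},                   # string split
    "STL10": lambda split: {"split": split},                  # string split
    "VOCSegmentation": lambda split: {"image_set": split},    # string image_set
}

def split_kwargs(dataset_name, split):
    """Return the dataset-specific constructor kwargs for a unified split name."""
    return SPLIT_KWARG[dataset_name](split)
```

A caller would then write, e.g., datasets.SVHN(root, **split_kwargs("SVHN", "test")) instead of remembering each dataset's spelling; standardising the argument name would make this table unnecessary.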

fmassa commented 5 years ago

Hi,

Great question!

Those inconsistencies are mostly because we didn't impose any particular structure to the datasets that were added to torchvision.

While this makes it very simple to understand what is going on, it also leads to those inconsistencies. The split argument is only one of them, but we also have datasets that store a classes attribute, etc.

I think it might be worth thinking about standardization, but I'm less clear on how it should be structured: each dataset is slightly different, so a single API might not be enough, even if they are similar.

One initial thought I had was to have a ClassificationDataset, see my comment in https://github.com/pytorch/vision/pull/1025

Thoughts?

pmeier commented 5 years ago

@fmassa Why would this be specific to a ClassificationDataset? Assuming it is, I can also think of the classes and class_to_idx attributes that should be included. If we want a ClassificationDataset, I would like to take that up.

fmassa commented 5 years ago

This is not specific to a ClassificationDataset, but it falls into the same bucket of standardization that I mentioned with regard to ClassificationDataset.

@pmeier can you open an issue describing a proposed design for the ClassificationDataset, and we can iterate over it? No need to implement anything, just describe what would be inside it, and what datasets would fit into this abstraction.
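A minimal sketch of what such a design proposal might cover, assuming the standardized surface is a split argument plus classes and class_to_idx attributes as discussed above. All names are illustrative, not an agreed torchvision API.

```python
# Hypothetical ClassificationDataset base: standardizes the split
# argument and exposes classes / class_to_idx uniformly.
class ClassificationDataset:
    VALID_SPLITS = ("train", "val", "test")

    def __init__(self, root, split="train", classes=()):
        # Reject unknown split names up front, instead of each dataset
        # inventing its own validation (or none at all).
        if split not in self.VALID_SPLITS:
            raise ValueError(
                f"split must be one of {self.VALID_SPLITS}, got {split!r}"
            )
        self.root = root
        self.split = split
        self.classes = list(classes)
        # Derived mapping that every classification dataset would expose.
        self.class_to_idx = {c: i for i, c in enumerate(self.classes)}
```

Datasets such as MNIST, CIFAR, and SVHN could plausibly subclass this and translate split into their on-disk layout; detection and segmentation datasets would need a different abstraction.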