Patrick-Wen closed this issue 6 months ago
Hi @Patrick-Wen
I see your point: PyTorch's documentation calls it a "generic data loader", which can be confusing. It is indeed possible to read it as a data loader in the loose sense, since it just retrieves data organized in a certain way (one folder per class), but it is not a `DataLoader` (as in the class definition).
If we look at the classes, `ImageFolder` inherits from `DatasetFolder`, which in turn inherits from `VisionDataset`. These classes implement the typical dataset methods: `__getitem__`, `__init__`, and `__len__`.
PyTorch's own "Datasets & DataLoaders" tutorial states: "A custom `Dataset` class must implement three functions: `__init__`, `__len__`, and `__getitem__`."
An actual `DataLoader` does not implement `__getitem__`; it is an iterable, so it implements `__iter__` instead. The documentation states that "Data loader combines a dataset and a sampler, and provides an iterable over the given dataset."
So, while `ImageFolder` only "loads data" (and that's why I think they called it a generic data loader), it is not a true `DataLoader` (which you use to create mini-batches) because it is implemented as a `Dataset` (which you use to retrieve individual data points).
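To make the distinction concrete, here is a small sketch that uses `TensorDataset` as a stand-in for `ImageFolder` so that no image files are needed. The toy tensors and the batch size of 4 are arbitrary choices for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten toy data points with two features each, plus integer labels.
features = torch.arange(20, dtype=torch.float32).reshape(10, 2)
labels = torch.arange(10)

# A Dataset retrieves individual data points by index.
dataset = TensorDataset(features, labels)
x0, y0 = dataset[0]          # __getitem__
n_points = len(dataset)      # __len__

# A DataLoader wraps a Dataset and yields mini-batches when iterated.
loader = DataLoader(dataset, batch_size=4, shuffle=False)
batch_sizes = [len(y) for _, y in loader]   # __iter__

# The loader cannot be indexed like a dataset.
loader_has_getitem = hasattr(loader, "__getitem__")

print(n_points, batch_sizes, loader_has_getitem)
```

An `ImageFolder` instance would take the place of `dataset` here: it is indexed one sample at a time and is passed to a `DataLoader` to get batches.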
I hope this helps!
Best, Daniel
Here is a description of ImageFolder on page 419 of the book:
In contrast to the above description of ImageFolder as a "generic dataset", the PyTorch documentation describes ImageFolder as "A generic data loader where the images are arranged in this way by default ...". To me, ImageFolder seems to work like TensorDataset, but specifically for image data. I am not sure whether to call it a generic dataset or a generic data loader.
For the moment, everything I know about PyTorch comes from this book, and I am still learning. Any clarification is appreciated. Thank you.