tmbdev-archive / pytorch-imagenet-wds

What is the purpose of the library, if you have to download the whole dataset? #2

Closed: ghost closed this issue 3 years ago

ghost commented 3 years ago

First of all, what is a "shard"? Why do I need to "shard" a dataset? What is this about, and where is the documentation for it?

Second, how can a library called "WebDataset" require you to download the whole dataset in order to "shard" it, and only then use it?