Closed by hyunW3 5 months ago
Hello @hyunW3,
The full dataset must be downloaded manually; see the section Download Full Dataset With Google Storage Transfer. Note that the TSV files contain the image URLs, so Google Storage Transfer is not a strict requirement. Caution: the full download is 18TB.
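Because the TSV files list the image URLs, the images can in principle be fetched with a plain script. The following is only a minimal sketch: it assumes each data line is tab-separated with the URL in the first column, and that any non-URL line (such as a format header) can be skipped. The function names `iter_image_urls` and `download_image` are hypothetical and not part of this repository.

```python
import os
import urllib.request
from urllib.parse import urlparse


def iter_image_urls(tsv_lines):
    """Yield the URL found in the first tab-separated column of each line.

    Assumption: the URL is the first field; lines whose first field is
    not an http(s) URL (headers, blank lines) are skipped.
    """
    for line in tsv_lines:
        first_field = line.rstrip("\n").split("\t")[0].strip()
        if first_field.startswith(("http://", "https://")):
            yield first_field


def download_image(url, out_dir):
    """Download one image into out_dir unless it already exists."""
    os.makedirs(out_dir, exist_ok=True)
    target = os.path.join(out_dir, os.path.basename(urlparse(url).path))
    if not os.path.exists(target):
        urllib.request.urlretrieve(url, target)
    return target
```

A typical use would be to open one TSV file, pass the file object to `iter_image_urls`, and call `download_image` for each URL. At 18TB, restricting the loop to a subset of the URLs is probably the practical choice.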
From there, please familiarize yourself with the Open Images V6 data loader:
"Since Open Images puts all data into a single folder, it is expected that the user has already created a text file _list_trainfiles.txt with all the images of the split prior to instantiating this class (otherwise an os.walk
takes forever)."
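One possible way to create such a list file is sketched below. It is not the repository's own tooling: the function name `write_file_list` is hypothetical, and the sketch assumes the loader wants one bare file name per line. The exact file name and line format that the data loader expects should be checked against its source before use.

```python
import os

# Assumption: these extensions cover the images in the split folder.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def write_file_list(image_dir, list_path, extensions=IMAGE_EXTENSIONS):
    """Write one image file name per line; return how many were written."""
    # os.scandir reads a single directory level, so the listing is done
    # once here instead of on every instantiation of the data loader.
    with os.scandir(image_dir) as entries:
        names = sorted(
            entry.name
            for entry in entries
            if entry.is_file() and entry.name.lower().endswith(extensions)
        )
    with open(list_path, "w", encoding="utf-8") as handle:
        for name in names:
            handle.write(name + "\n")
    return len(names)
```

The function would be called once per split, with `image_dir` pointing at the folder that holds the images and `list_path` set to the file name the loader looks for.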
Note that fiftyone only provides a subset (1.7M):
"Open Images V6 is a dataset of ~9 million images, roughly 2 million of which are annotated and available via this zoo dataset.", see ref. A similar subset is supported in our tutorial; simply replace
--dataset_name="clic" \
with
--dataset_name="open_images_v4" \
Hope this helps, Nikolai
Thank you for your kind and rapid reply! However, 18TB is too large for me... I'm looking forward to your pre-trained model :)
Thank you for the great work!
I would like to know how to download Open Images V6 and how to obtain the "list_train_files.txt" file for the Open Images V6 dataset.
I'm currently working on the training phase, following notebooks/PerceptualCompression.ipynb. Since I'm not familiar with the Open Images V6 dataset, I used the FiftyOne package to download it, following the official Open Images webpage.
However, the training phase returns an error with
However, with the FiftyOne download, there is no list_train_files.txt. The image below shows the dataset hierarchy.
Thanks for your help.