Open uha225 opened 1 week ago
Hi, the validation images need to be placed in labeled subdirectories to match the structure used for training data. However, you don't need to organize the validation images manually. There are many publicly available scripts online that can automate this process for you.
You should change the --data_dir and --new_dir arguments of process_dataset.py to point to the paths of your validation images and the desired output directory for the processed images.
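For reference, here is a minimal sketch of what such a reorganization script could look like. It is not the script from this repository. It assumes a hypothetical plain-text mapping file with one `<image filename> <wnid>` pair per line (for example `ILSVRC2012_val_00000001.JPEG n01751748`); the function name `organize_val_images` and the file layout are illustrative assumptions, and you would need to build the mapping file from the ground-truth labels you have.

```python
import shutil
from pathlib import Path


def organize_val_images(val_dir, mapping_file, out_dir, copy=True):
    """Place each validation image under out_dir/<wnid>/ according to mapping_file.

    mapping_file uses an assumed format: one "<image filename> <wnid>" pair per line.
    Returns the number of images that were placed.
    """
    val_dir, out_dir = Path(val_dir), Path(out_dir)
    placed = 0
    with open(mapping_file) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            filename, wnid = parts
            src = val_dir / filename
            if not src.is_file():
                continue  # listed in the mapping but not present on disk
            class_dir = out_dir / wnid
            class_dir.mkdir(parents=True, exist_ok=True)
            if copy:
                shutil.copy2(src, class_dir / filename)  # keep the flat originals
            else:
                shutil.move(str(src), str(class_dir / filename))
            placed += 1
    return placed
```

Calling `organize_val_images("val", "val_labels.txt", "val_by_class")` (all three paths are placeholders) would produce one `n...` directory per class under `val_by_class`, which mirrors the training layout. Copying is the default so that the extracted images are left untouched if the mapping turns out to be wrong.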
When the ImageNet validation images are downloaded and extracted, they are not organized into labeled directories the way the training data is (i.e., the per-class directories starting with ‘n’ are absent). In addition, process_dataset.py, as called in pd.sh, applies its preprocessing operations only to the training images, leaving the validation images unprocessed.
Questions and Clarification Needed:
Should validation images be placed in labeled subdirectories manually, similar to the training structure?
Is there an expected modification to process_dataset.py to preprocess validation images as well?
Steps to Reproduce:
Download the validation data from ImageNet (https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar).
Extract the archive and observe that no label directories are created in val.
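The extraction step and the check for label directories can be sketched in Python as follows. This is only an illustration of the observation above; the function name `extract_and_list_class_dirs` and the paths in the usage note are assumptions, not part of the repository.

```python
import tarfile
from pathlib import Path


def extract_and_list_class_dirs(tar_path, dest):
    """Extract tar_path into dest and return the names of top-level 'n*' directories."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path) as tar:
        tar.extractall(dest)
    # Training data has one directory per class whose name starts with 'n'.
    return sorted(
        p.name for p in dest.iterdir() if p.is_dir() and p.name.startswith("n")
    )
```

For the validation archive described above, `extract_and_list_class_dirs("ILSVRC2012_img_val.tar", "val")` would be expected to return an empty list, since the images are extracted flat into `val`.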