Open dakshjotwani opened 5 years ago
@dakshjotwani I think we should keep the original data that we have downloaded from the website.
Any cropping or other options should be done on the fly on the dataset, and can be selected via arguments in the constructor.
Thoughts?
Yeah I agree. That would be more consistent with the format of the other datasets.
I was also preprocessing the dataset to make it more manageable in size (8 GB cropped vs 36 GB original). Maybe we could have a preprocess helper function for users that might want to use it.
I shall send a PR for VGGFace2 soon.
> I was also preprocessing the dataset to make it more manageable in size (8 GB cropped vs 36 GB original). Maybe we could have a preprocess helper function for users that might want to use it.
Let's just leave the unprocessed file as is, I think it's fine.
@fmassa how would you like to go about this?
Currently, I preprocess VGGFace2 by cropping out the face from each image using the provided bounding box csv. After doing that, I load it as an ImageFolder dataset.
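For reference, the cropping step described above could look roughly like the sketch below. It is a minimal illustration, not the actual preprocessing script: the column names (`NAME_ID`, `X`, `Y`, `W`, `H`) and the sample row are assumptions about the bounding box CSV layout, and `load_bounding_boxes` and `to_crop_box` are hypothetical helper names.

```python
import csv
import io


def load_bounding_boxes(csv_file):
    """Map each image id to an (x, y, w, h) tuple read from a bounding-box CSV.

    Assumes a header row with NAME_ID, X, Y, W, H columns (an assumption here).
    """
    boxes = {}
    for row in csv.DictReader(csv_file):
        boxes[row["NAME_ID"]] = tuple(int(float(row[k])) for k in ("X", "Y", "W", "H"))
    return boxes


def to_crop_box(bbox, image_size):
    """Convert (x, y, w, h) to (left, upper, right, lower), clamped to the image."""
    x, y, w, h = bbox
    width, height = image_size
    left, upper = max(0, x), max(0, y)
    right, lower = min(width, x + w), min(height, y + h)
    return left, upper, right, lower


# Made-up sample row; a real run would open the CSV shipped with the dataset.
sample = io.StringIO("NAME_ID,X,Y,W,H\nn000001/0001_01,-10,20,100,120\n")
boxes = load_bounding_boxes(sample)
crop_box = to_crop_box(boxes["n000001/0001_01"], image_size=(80, 200))
```

The resulting 4-tuple uses the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so each cropped face could then be saved into a per-identity folder and loaded with `ImageFolder`.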
VGGFace2 also has a bunch of other CSVs that annotate gender, age group, and facial landmarks.
I propose we allow users to specify the targets they want (such as ID, gender, age, bounding box, etc.) and have a `crop_bb=False` argument that they can set to `True` if they only want the face. What do you think?
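To make the proposal concrete, here is a rough sketch of what such a constructor might look like. Everything in it is hypothetical: the class name `VGGFace2Sketch`, the `samples` and `loader` parameters, and the target names are placeholders for discussion, not an existing torchvision API. The sketch is plain Python so it runs without any dependencies; a real version would presumably subclass `torch.utils.data.Dataset` and load images with Pillow.

```python
class VGGFace2Sketch:
    """Hypothetical map-style dataset: targets are chosen by name, cropping is optional."""

    VALID_TARGETS = ("id", "gender", "age", "bbox")  # assumed target names

    def __init__(self, samples, loader, targets=("id",), crop_bb=False):
        unknown = set(targets) - set(self.VALID_TARGETS)
        if unknown:
            raise ValueError(f"unknown targets: {sorted(unknown)}")
        self.samples = samples        # list of dicts: "path" plus annotation keys
        self.loader = loader          # callable: path -> image object with .crop(box)
        self.targets = tuple(targets)
        self.crop_bb = crop_bb

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        sample = self.samples[index]
        image = self.loader(sample["path"])
        if self.crop_bb:
            # Crop on the fly; the files on disk stay untouched.
            x, y, w, h = sample["bbox"]
            image = image.crop((x, y, x + w, y + h))
        target = tuple(sample[name] for name in self.targets)
        return image, (target[0] if len(target) == 1 else target)


class FakeImage:
    """Stand-in for a real image so the sketch runs without Pillow."""

    def __init__(self, size):
        self.size = size

    def crop(self, box):
        left, upper, right, lower = box
        return FakeImage((right - left, lower - upper))


# Made-up record for illustration only.
samples = [{"path": "n000001/0001_01.jpg", "id": 0, "bbox": (10, 20, 100, 120)}]
dataset = VGGFace2Sketch(samples, loader=lambda path: FakeImage((300, 400)),
                         targets=("id", "bbox"), crop_bb=True)
image, target = dataset[0]
```

With `crop_bb=False` the original image is returned unchanged, which keeps the downloaded data as the single source of truth while still letting users request only the face.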