RT_BENE Dataset left/right folder

Tobias-Fischer / rt_gene

RT-GENE: Real-Time Eye Gaze and Blink Estimation in Natural Environments

http://www.imperial.ac.uk/personal-robotics

RT_BENE Dataset left/right folder #55

Closed ahmed-alhindawi closed 4 years ago

ahmed-alhindawi commented 4 years ago

Hi all, just wondering about the dataset used for the RT_BENE training. The README and the paper stipulate that the "sXYZ_noglasses" public dataset is used rather than the "sXYZ_glasses" section. However, the "sXYZ_noglasses" dataset doesn't have left/right eye patches/folders; it only has the face images used for the GAN training of RT_GENE.

Does that mean left/right patches were extracted from the face? If so, which landmark extractor was used?

Tobias-Fischer commented 4 years ago

Hi @ngageorange, Good point. I forgot that I extracted the left/right images using the RT-GENE pipeline.

@Twarz: Please let me know where the dataset is located nowadays (icubunicorn / beginator?) and I can zip the eye patch images so they can be easily used.

ahmed-alhindawi commented 4 years ago

Sorry to be a pain again; I've extracted the left/right patches using the RT-GENE pipeline. As expected, there are some images where the eye patches can't be extracted due to occlusions/hidden views.

The issue is that these images are somehow still labelled in the CSV files, so when "train_blink_model.py" is called, there are many "Error can't read pair..." errors.
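One way to avoid these read errors is to drop label rows whose eye patches are missing before training. The following is a minimal sketch, not part of the RT-GENE code base. It assumes a hypothetical layout in which the first CSV column holds the left image file name (e.g. `left_000001_rgb.png`) and the right image has the same name with `left` replaced by `right`; the function name `filter_existing_pairs` is also made up for illustration. Adjust both to match your extracted data.

```python
import csv
from pathlib import Path


def filter_existing_pairs(csv_path, left_dir, right_dir, out_path):
    """Copy label rows to out_path only if both eye patches exist on disk.

    Assumption (hypothetical layout): column 0 is the left image file name,
    and the right image uses the same name with "left" replaced by "right".
    Returns a (kept, dropped) tuple of row counts.
    """
    kept, dropped = 0, 0
    with open(csv_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if not row:
                continue  # skip blank lines
            left_name = row[0].strip()
            right_name = left_name.replace("left", "right", 1)
            left_ok = (Path(left_dir) / left_name).is_file()
            right_ok = (Path(right_dir) / right_name).is_file()
            if left_ok and right_ok:
                writer.writerow(row)
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

Running this once per subject and pointing the training script at the filtered CSV should leave only rows for which both patches can be read.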

The training itself continues on 137,616 samples. Are those numbers correct, or should more samples be present? The paper says "we labelled in total 243,714 images", so I'm confused.

KevinCortacero commented 4 years ago

Hey @ngageorange, the 243K images are distributed across 17 subjects. If you sum all the CSV labels (left eyes) and multiply by 2 (to count right eyes), you get the 243K images. During training we use 3 folds; each fold is trained with 2 groups of subjects and validated on another group:

fold 1 is trained with {grp0, grp1}, validated on {grp3}
fold 2 is trained with {grp0, grp2}, validated on {grp3}
fold 3 is trained with {grp1, grp2}, validated on {grp3}

So if you sum {grp0, grp1, grp2, grp3} and multiply by 2 (1 sample = 2 images), you will get almost 243K. The remaining difference is explained by:

s006 is discarded from the training/validation = 7710*2 images
uncertain labels are not taken into account during training
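The bookkeeping above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the totals come from this thread and the paper quote, while the helper names and the value `0.5` used to mark an uncertain label are assumptions, not confirmed details of the dataset.

```python
TOTAL_LABELLED_IMAGES = 243_714  # "we labelled in total 243,714 images"
S006_SAMPLES = 7_710             # s006 left/right pairs, discarded from training
IMAGES_PER_SAMPLE = 2            # 1 sample = 1 left + 1 right eye image


def images_after_excluding(total_images, excluded_samples,
                           images_per_sample=IMAGES_PER_SAMPLE):
    """Images left once the excluded subject's samples are removed."""
    return total_images - excluded_samples * images_per_sample


def count_usable_samples(labels, uncertain_value=0.5):
    """Count samples whose label is not the (assumed) uncertain marker."""
    return sum(1 for label in labels if label != uncertain_value)


remaining = images_after_excluding(TOTAL_LABELLED_IMAGES, S006_SAMPLES)
print(remaining)  # 228294 images before uncertain labels are dropped
```

The number of samples actually used in training is lower still, because uncertain labels are skipped; that count depends on the label files and is not given in this thread.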

I hope this helps. Also, @ngageorange and @Tobias-Fischer, I have sent you the path to the sorted images by email.

Tobias-Fischer commented 4 years ago

I suggest resolving this issue by creating a tar file from the images in the path that @Twarz sent us and uploading it somewhere (probably Zenodo). I'll do that on Monday.

Tobias-Fischer commented 4 years ago

Fixed in https://github.com/Tobias-Fischer/rt_gene/commit/eea454680c6a73bcd3225e70a1c1887c33b6738e and https://github.com/Tobias-Fischer/rt_gene/commit/f935f4162cb3af6de7d52a0d352c829c86c4cc9f

@ngageorange Let us know if you have any issues.