Error in preprocess_data script or the dataset has been changed?

pkdogcom commented 6 years ago

I have downloaded the dataset and was trying to preprocess the data using the script. However, at least two errors occur in the script, which makes me wonder whether the dataset has been changed:

In the script the root folder of the evaluation subset is set as
```
evaldir = os.path.join(args.dataset, 'Evaluation Subset')
```
while the dataset I downloaded actually has two sub-folders under it, "annotation for face image" and "sample list for eye image", each of which contains annotations in a different format. I had to change the path definition to point at "sample list for eye image" (the other sub-folder doesn't work):
```
evaldir = os.path.join(args.dataset, 'Evaluation Subset', 'sample list for eye image')
```
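A minimal sketch of a fallback that would handle both folder layouts, assuming the only difference is the extra "sample list for eye image" level. `resolve_evaldir` is a hypothetical helper, not something in the repository's script:

```python
import os


def resolve_evaldir(dataset_root):
    """Pick the evaluation folder for either dataset layout (hypothetical helper)."""
    base = os.path.join(dataset_root, 'Evaluation Subset')
    # Layout seen in the newer download: sample lists live one level deeper.
    nested = os.path.join(base, 'sample list for eye image')
    # Fall back to the path the script originally used if the sub-folder is absent.
    return nested if os.path.isdir(nested) else base
```

With this, `evaldir = resolve_evaldir(args.dataset)` would replace the hard-coded path.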
More importantly, the normalized eye-patch data in the 'Data/Normalized' folder doesn't seem to match the original images in 'Data/original'. For example, 'Data/original/p02/day02' contains 800 images while 'Data/Normalized/p02/day02.mat' contains only 420. Since the evaluation set for 'p02/day02' uses image '0726.jpg', which is missing from the normalized .mat data, the preprocess_data script fails at
```
index = np.where(filenames[day] == row.filename)[0][0]
```
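The failure comes from indexing `[0][0]` into an empty `np.where` result, which raises an `IndexError`. A minimal sketch of a guarded lookup that reports a missing file instead of crashing; `find_index` and the sample filenames are illustrative assumptions, not part of the script:

```python
import numpy as np


def find_index(filenames, target):
    """Return the position of target in filenames, or None if it is absent."""
    matches = np.where(np.asarray(filenames) == target)[0]
    # An empty match array means the file is not in the normalized data.
    return int(matches[0]) if matches.size > 0 else None


# Made-up filename list standing in for filenames[day].
names = np.array(['0001.jpg', '0002.jpg', '0005.jpg'])
if find_index(names, '0726.jpg') is None:
    print('0726.jpg is missing from the normalized data')
```

Returning `None` lets the caller decide whether to skip the sample or stop with a clearer error message.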

Are the errors due to a difference between the version of the dataset this repo was written against and the version that is available now? Do I have to regenerate the 'Normalized' data to get it to work?

hysts commented 6 years ago

Thanks for letting me know. That seems to be the case. I downloaded the dataset again and found that it's not the same as before. Apparently, the paper was updated in November 2017 (https://arxiv.org/abs/1711.09017), and it says extended annotations were added to the dataset. As for the broken files in the dataset, please contact the authors of the paper (I'm not one of them, just in case).

pkdogcom commented 6 years ago

I've double-checked the dataset and found two kinds of problems. First, the original images, the annotations, and the normalized images are mismatched: they can contain different numbers of examples for the same subject on the same day. Second, some values in the annotation files are corrupted. I've emailed the authors, and hopefully they will provide fixes soon. I'll let you know once I have any updates.

hysts commented 6 years ago

Got it. Thanks!

pkdogcom commented 6 years ago

The author just uploaded a corrected version of the dataset. I've re-run the preprocessing and training, and everything looks great!

hysts commented 6 years ago

Great! Glad to hear that.

hysts / pytorch_mpiigaze

Error in preprocess_data script or the dataset has been changed? #1