Tobias-Fischer / rt_gene

RT-GENE: Real-Time Eye Gaze and Blink Estimation in Natural Environments
http://www.imperial.ac.uk/personal-robotics

RT_BENE CSV duplicates introducing sampling bias #56

Closed ahmed-alhindawi closed 4 years ago

ahmed-alhindawi commented 4 years ago

Hi all, I've noticed that the CSV files used for training contain duplicates; one example is rows 13983 and 13984 of s000_blink_labels.csv.

While RTBeneDataset.py removes indeterminately labeled samples (i.e. any label of 0.5 is discarded), the following row is labeled as 0.0 and is thus used for training.

I'm not sure how this affects results, but I would suggest removing both the 0.5 labels and the 0.0 labels that follow them altogether?

I'll sanitise the data myself and create a pull request, but I thought I would discuss it here first...
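To gauge how widespread the problem is, a quick check along these lines could list the suspect rows. This is only a sketch: it assumes each CSV row ends with the label and that all preceding fields identify the sample, which may not match the real layout of the RT-BENE label files. The function name `find_shadowed_rows` and the sample data are made up for illustration.

```python
import csv
import io


def find_shadowed_rows(rows):
    """Return indices of rows that repeat the sample of a preceding 0.5 row.

    Assumption: the last field of each row is the label and the remaining
    fields identify the sample (hypothetical layout).
    """
    flagged = []
    for i in range(1, len(rows)):
        prev, cur = rows[i - 1], rows[i]
        same_sample = cur[:-1] == prev[:-1]
        # Flag a row when the previous row is "uncertain" (0.5) for the
        # same sample and this row carries a different label.
        if same_sample and float(prev[-1]) == 0.5 and float(cur[-1]) != 0.5:
            flagged.append(i)
    return flagged


# Tiny made-up sample standing in for a real label file.
sample = "img_0001.png,1.0\nimg_0002.png,0.5\nimg_0002.png,0.0\nimg_0003.png,0.0\n"
rows = list(csv.reader(io.StringIO(sample)))
print(find_shadowed_rows(rows))
```

The returned indices are zero-based positions in the list of rows, so they would need an offset to line up with spreadsheet row numbers.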

Tobias-Fischer commented 4 years ago

@Twarz: It seems something went wrong when creating the CSV files. Could you please check and re-create them properly, without duplicates? It looks like every single "uncertain" sample (label 0.5) is followed by another line with label 0.0.

@ngageorange: Regarding the pull request: 1) How the hell are people (including you) able to write R code? So cryptic :face_with_thermometer: 2) It removes all the uncertain labels, but someone might want to use them. So let's wait for @Twarz to provide a proper fix (which seems to amount to just removing every line that follows a 0.5 line).
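For reference, the clean-up being discussed could be sketched in Python roughly as follows. This is not the fix that was eventually pushed; it assumes the label is the last field of each row and that the spurious line is always a 0.0 row directly after a 0.5 row, as observed above. The function name `clean_labels` and its `keep_uncertain` flag are hypothetical.

```python
def clean_labels(rows, keep_uncertain=True):
    """Drop each 0.0 row that directly follows a 0.5 row.

    Assumptions (not confirmed by the dataset authors): the label is the
    last field of a row, and the duplicate always has label 0.0.
    With keep_uncertain=False the 0.5 rows are dropped as well.
    """
    cleaned = []
    prev_label = None
    for row in rows:
        label = float(row[-1])
        # A 0.0 row right after a 0.5 row is treated as the spurious duplicate.
        is_duplicate = prev_label == 0.5 and label == 0.0
        prev_label = label
        if is_duplicate:
            continue
        if label == 0.5 and not keep_uncertain:
            continue
        cleaned.append(row)
    return cleaned


# Made-up rows standing in for one label file.
example = [["a.png", "1.0"], ["b.png", "0.5"], ["b.png", "0.0"], ["c.png", "0.0"]]
print(clean_labels(example))
print(clean_labels(example, keep_uncertain=False))
```

Keeping the 0.5 rows by default leaves the decision about uncertain samples to the data loader, which already discards them at training time.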

ahmed-alhindawi commented 4 years ago

R is phenomenal, only lesser humans can't read it 😝

Yes, it does remove all uncertain labels and then the 0.0 following each one; the presumption is that the duplicated row is either labeled 0.5 (and thus will be discarded) or should have been 0.5, since the annotators disagreed, and thus should be discarded anyway.

I can happily change the R code to keep the 0.5 rows.

Tobias-Fischer commented 4 years ago

Up to @Twarz whether he wants to fix it.

KevinCortacero commented 4 years ago

Hey @ngageorange, @Tobias-Fischer, the fix is in progress and should be pushed soon :)

Tobias-Fischer commented 4 years ago

@ngageorange: Can we remove the obsolete branch?