Tobias-Fischer / rt_gene

RT-GENE: Real-Time Eye Gaze and Blink Estimation in Natural Environments
http://www.imperial.ac.uk/personal-robotics

Removed duplicated rows in dataset that are causing sampling bias #57

Closed ahmed-alhindawi closed 4 years ago

ahmed-alhindawi commented 4 years ago

As per #56. The cleaning was done using the following R code (sorry, it was easier for me!):

library(tidyverse)

files <- list.files(path="/home/ahmed/catkin_ws/src/rt_gene/rt_bene_dataset", pattern="*.csv", full.names=TRUE, recursive=FALSE)

lapply(files, function(x) {
    df <- read.csv(x, header=FALSE)
    df_clean <- df %>%
        group_by(V1) %>%
        summarise(V2_clean = mean(V2)) %>%
        filter(V2_clean == 0.0 | V2_clean == 1.0)
    write.table(df_clean, file=x, sep=",", col.names=FALSE, row.names=FALSE)
})
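
For readers who work in Python, the logic of the R snippet above can be sketched with the standard library. This is an illustrative equivalent, not the code that was run for this PR. The function names `clean_label_file` and `clean_dataset` are hypothetical. It assumes each CSV has no header, the sample identifier is in the first column, and a label in the range [0, 1] is in the second.

```python
import csv
from collections import defaultdict
from pathlib import Path


def clean_label_file(csv_path):
    """Rewrite one header-less CSV with one row per consistently labelled sample."""
    # Collect every label seen for each sample identifier (first column).
    labels_by_sample = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            labels_by_sample[row[0]].append(float(row[1]))

    cleaned = []
    for sample, labels in sorted(labels_by_sample.items()):
        mean_label = sum(labels) / len(labels)
        # Assuming labels lie in [0, 1], a mean of exactly 0 or 1 means all
        # duplicates agreed; any other mean marks the sample as ambiguous.
        if mean_label in (0.0, 1.0):
            cleaned.append((sample, mean_label))

    # Overwrite the file in place, as the R snippet does.
    with open(csv_path, "w", newline="") as f:
        csv.writer(f).writerows(cleaned)
    return cleaned


def clean_dataset(directory):
    """Apply clean_label_file to every .csv directly inside `directory`."""
    for csv_path in sorted(Path(directory).glob("*.csv")):
        clean_label_file(csv_path)
```

Grouping by identifier and keeping only means of exactly 0 or 1 removes the duplicated rows and also drops any sample whose duplicates carry conflicting labels. The output formatting will not match R's `write.table` byte for byte (labels are written as `1.0` rather than `1`, for instance).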

I would suggest that the networks be re-trained on this cleaned dataset.

@Twarz could you have a look please?

Tobias-Fischer commented 4 years ago

See #56 - let's continue the discussion there. I'll close this PR in the meantime. Also note that only https://github.com/Tobias-Fischer/rt_gene/pull/57/commits/8e68ec72ea8c8b46c3c22c7e034b5ee4c14ceadb seems to be relevant for this PR.

KevinCortacero commented 4 years ago

Hey @ngageorange, now that the files are fixed, I'm launching the new training on beginator :)