dmirman / gazer

Functions for reading and pre-processing eye tracking data.

How to find the peak pupil size and a possible bug in remove missing data? #29

Closed rongruchen closed 3 months ago

rongruchen commented 8 months ago

Hi Jason,

Thanks for developing the great package :) I ran into two issues and I would be grateful if you could help me with that.

For my experiment, I want to get the peak pupil size in each trial and remove those greater or less than 3SD of the personal mean. I am just wondering if there is a function that exists in the package that could help me find the peak pupil size (and to winsorize, ideally) in a trial directly.

Also, I have another question about the package not excluding the participant that has the percentage of missing trials over the missing threshold I specified. For example, in the below screenshot, I set the missing threshold as 0.2. However, subject 4 has 23.6% of missing trials but was not taken out, which was a bit confusing.

Many thanks for your help in advance!

Best, Rongru

jgeller112 commented 8 months ago

Hi Rongru,

For my experiment, I want to get the peak pupil size in each trial and remove those greater or less than 3SD of the personal mean. I am just wondering if there is a function that exists in the package that could help me find the peak pupil size (and to winsorize, ideally) in a trial directly.

There is no function for this, but it is pretty simple to do:

library(dplyr)

data %>%
  group_by(subject, trial) %>%
  # get max pupil size by subject and trial, ignoring missing (NA) samples
  summarize(max_pupil = max(pupil, na.rm = TRUE))
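The 3 SD exclusion (or winsorizing) can then be done on those per-trial peaks. The following is only a sketch, not a gazer function: the toy `data` frame, the column names `subject`, `trial`, and `pupil`, and the new columns `lower`, `upper`, `outlier`, and `max_pupil_wins` are all assumptions made for illustration. Winsorizing is taken here to mean clamping values to the ±3 SD bounds, which is one common reading of the term.

```r
library(dplyr)

# Hypothetical toy data: one subject, 20 trials, 3 samples per trial.
# Every trial peaks at 5 except trial 20, which has one extreme sample.
pupil <- rep(c(4, 5, NA), times = 20)
pupil[59] <- 50  # second sample of trial 20
data <- tibble(subject = "s1", trial = rep(1:20, each = 3), pupil = pupil)

# Peak pupil size per subject and trial, ignoring missing samples.
# (A trial with only NA samples would yield -Inf; none exist in this toy data.)
peaks <- data %>%
  group_by(subject, trial) %>%
  summarise(max_pupil = max(pupil, na.rm = TRUE), .groups = "drop")

# Bounds are computed per subject, from that subject's own peaks.
flagged <- peaks %>%
  group_by(subject) %>%
  mutate(
    lower = mean(max_pupil) - 3 * sd(max_pupil),
    upper = mean(max_pupil) + 3 * sd(max_pupil),
    outlier = max_pupil < lower | max_pupil > upper,
    # Alternative to removal: clamp peaks to the +/- 3 SD bounds.
    max_pupil_wins = pmin(pmax(max_pupil, lower), upper)
  ) %>%
  ungroup()

# Option 1: drop the outlying trials.
kept <- filter(flagged, !outlier)
```

Use `kept` if outlying trials should be removed, or `flagged$max_pupil_wins` if they should be retained with clamped values. Note that with few trials per subject a single extreme peak inflates the SD, so it may never exceed the 3 SD bound.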

Also, I have another question about the package not excluding the participant that has the percentage of missing trials over the missing threshold I specified. For example, in the below screenshot, I set the missing threshold as 0.2. However, subject 4 has 23.6% of missing trials but was not taken out, which was a bit confusing.

I think there is a misunderstanding here of how the function works. If a trial has > .2 missing data, the trial is thrown out. In your case, this resulted in 23% of total trials being removed. The subject-level criterion is calculated across all of a subject's trials: if the proportion of missing data for a subject is greater than the threshold, that subject is thrown out. Make sense?

rongruchen commented 8 months ago

Hi Jason,

Many thanks for your reply! Now I know how to calculate the maximum pupil size with R :)

For the missing data criteria, I wonder if you mean that the specified missing-data threshold applies at both the trial level and the subject level. So if you set the missing threshold to 0.2, then a trial with more than 20% missing data will be removed, and a participant with more than 20% missing trials will also be thrown out. Am I correct about that?

Actually, the 23.6% of missing trials in my screenshot applies to only one subject, subject 4 (because I set a filter with the condition subject == "4"), rather than to all subjects (for which the percentage of trials taken out was 4.2%).

So if my understanding of how the function works is correct, then subject 4 should be thrown out?

Thank you for your time and patience!

Best, Rongru

jgeller112 commented 8 months ago

Hi,

Here is how both are calculated within the function:

  countsbysubject <- datafile %>%
    dplyr::group_by(subject) %>%
    dplyr::summarise(missing = sum(is.na(!!sym(pupil))),
                     samples = sum(!is.na(!!sym(pupil))),
                     total = length(pupil)) %>%
    dplyr::mutate(averageMissingSub = missing / total)

  countsbytrial <- datafile %>%
    dplyr::group_by(subject, trial) %>%
    dplyr::summarise(missing = sum(is.na(!!sym(pupil))),
                     samples = sum(!is.na(!!sym(pupil))),
                     total = length(pupil)) %>%
    dplyr::mutate(averageMissingTrial = missing / total)

I have applied it to data I have and it works as expected.
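To see how a subject can lose more than 20% of its trials and still be retained, the same two summaries can be run on toy data. This is a sketch under assumptions, not gazer's own code: the `toy` data frame, the `threshold` variable, and the `prop_missing`, `drop_trial`, and `drop_subject` columns are hypothetical, the column is referred to directly as `pupil`, `dplyr::n()` is used for the row count, and the strict `>` comparison is assumed from the description earlier in this thread.

```r
library(dplyr)

threshold <- 0.2  # hypothetical missing-data threshold

# Hypothetical subject with 10 trials of 10 samples each.
toy <- tibble(
  subject = "4",
  trial = rep(1:10, each = 10),
  sample = rep(1:10, times = 10),
  pupil = 5
)
# Trials 1-3 each lose 3 of their 10 samples (30% missing within the trial).
toy$pupil[toy$trial <= 3 & toy$sample <= 3] <- NA

# Trial level: proportion of missing samples within each trial.
by_trial <- toy %>%
  group_by(subject, trial) %>%
  summarise(missing = sum(is.na(pupil)), total = n(), .groups = "drop") %>%
  mutate(prop_missing = missing / total,
         drop_trial = prop_missing > threshold)

# Subject level: proportion of missing samples across all of the subject's trials.
by_subject <- toy %>%
  group_by(subject) %>%
  summarise(missing = sum(is.na(pupil)), total = n(), .groups = "drop") %>%
  mutate(prop_missing = missing / total,
         drop_subject = prop_missing > threshold)

# Share of this subject's trials that exceed the trial-level threshold.
prop_trials_dropped <- mean(by_trial$drop_trial)
```

In this toy case 3 of 10 trials exceed the trial-level threshold, yet only 9 of 100 samples are missing overall, so the subject-level proportion stays below 0.2 and the subject is kept. The proportion of removed trials and the proportion of missing samples are different quantities, and only the latter is compared against the threshold at the subject level in the summaries above.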