jwdink / eyetrackingR

This package is designed to make dealing with eye-tracking data easier. It addresses tasks along the pipeline from raw data to analysis and visualization.
http://eyetrackingr.com

`analyse_time_clusters` incorrectly computing cluster p-values #66

Open respatte opened 6 years ago

respatte commented 6 years ago

Hi everyone,

I recently tried running a bootstrapped cluster-based permutation analysis on my data, using eyetrackingR's great functions, but had no luck finding any significant results. At first I thought this was simply because my data didn't provide enough evidence, but I then came across a similar analysis that found significant differences where, visually, the difference seemed a lot less clear.

This led me to question the output of analyze_time_clusters, and there are a couple of issues with it. First, the summary of analyze_time_clusters gives the same value for the null distribution mean and for the 2.5% and 97.5% percentiles, which clearly conflicts with the plot for the same analysis result: [image]
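One quick way to confirm a collapsed summary like this is to recompute the same statistics directly from the resampled values. The following is a minimal base R sketch, assuming the null distribution is available as a plain numeric vector (in eyetrackingR this is presumably stored in the analysis object, but the exact element name is an assumption and is not used here). `summarise_null` is a hypothetical helper, and the "healthy" values are simulated purely for contrast.

```r
# Hypothetical helper: mean and 2.5% / 97.5% quantiles of a numeric vector
# of resampled statistics, plus a flag for a collapsed distribution.
summarise_null <- function(null_stats, tol = 1e-8) {
  qs <- quantile(null_stats, probs = c(0.025, 0.975), names = FALSE)
  list(
    mean = mean(null_stats),
    lower = qs[1],
    upper = qs[2],
    # TRUE when every resample produced (numerically) the same statistic
    degenerate = diff(range(null_stats)) < tol
  )
}

set.seed(1)
# Simulated values, for illustration only
healthy <- summarise_null(rnorm(1000, mean = 0, sd = 5))
# Constant vector using the value printed in the reported summary
collapsed <- summarise_null(rep(26.5412, 1000))

print(healthy$degenerate)
print(collapsed$degenerate)
```

If the flag is set for a real analysis, every resample produced the same statistic, which points at the resampling step rather than at the summary or the plot.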

Second, the Probability value in the summary of analyze_time_clusters is always 1, in every analysis I run, whichever the dataset. This is quite surprising, as the $p$-value should be $0$ if the code believes that the null distribution is basically a Dirac delta. So clearly there must be an issue; I'm just not sure where.
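The arithmetic behind this observation can be sketched in base R. The block below is a minimal illustration, not eyetrackingR's actual implementation: `cluster_p_value` is a hypothetical helper, and the two comparison rules (inclusive `>=` and strict `>`) are assumptions about how a permutation p-value might be defined. The null values reuse the largest cluster statistic from the summary reported in this post, on the assumption that the null distribution has collapsed onto it.

```r
# Hypothetical helper: share of null statistics at least as extreme
# (inclusive) or strictly more extreme (strict) than the observed one.
cluster_p_value <- function(observed, null_stats, strict = FALSE) {
  if (strict) {
    mean(abs(null_stats) > abs(observed))
  } else {
    mean(abs(null_stats) >= abs(observed))
  }
}

# Assumption: every resample reproduces the largest observed cluster statistic
degenerate_null <- rep(26.541206, 1000)

# Sum statistics of the three clusters in the reported summary
observed <- c(1.027932, 26.541206, -1.117454)

p_inclusive <- sapply(observed, cluster_p_value,
                      null_stats = degenerate_null)
p_strict <- sapply(observed, cluster_p_value,
                   null_stats = degenerate_null, strict = TRUE)

print(p_inclusive)
print(p_strict)
```

Under the inclusive rule, a null distribution that sits exactly on the largest observed statistic gives probability 1 for every cluster; only the strict rule gives 0 for that cluster. If the package uses something like the inclusive rule, the reported values of 1 would follow from the collapsed null distribution rather than from the p-value step itself.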

This issue is likely related to #63, where analyze_time_clusters failed to run with certain combinations of the within_subj and treatment_level parameters. My code is shown below; as with the other issue, you can find my full code and datasets in my project's repository (stats/InfantsAnalysis; see the other issue for full instructions).

## Build cluster data
library(dplyr)         # provides %>%
library(eyetrackingR)  # make_time_cluster_data(), analyze_time_clusters()

LT.time_cluster_tail <- LT.time_course_tail %>%
  split(.$FstLst) %>%
  lapply(make_time_cluster_data,
         predictor_column = "Condition",
         treatment_level = "No Label",
         aoi = "Tail",
         test = "lmer",
         threshold = 1,
         formula = ArcSin ~ Condition +
           (1 | Participant) +
           (1 | Stimulus))
## Run analysis
LT.time_cluster_tail.analysis <- LT.time_cluster_tail %>%
  lapply(analyze_time_clusters,
         formula = ArcSin ~ Condition +
           (1 | Participant) +
           (1 | Stimulus),
         within_subj = TRUE,
         parallel = TRUE)

> summary(LT.time_cluster_tail.analysis[[2]])
Test Type:   lmer
Predictor:   Condition
Formula:     ArcSin ~ Condition + (1 | Participant) + (1 | Stimulus)
Null Distribution   ======
 Mean:       26.5412
 2.5%:       26.5412
97.5%:       26.5412
Summary of Clusters ======
  Cluster Direction SumStatistic StartTime EndTime Probability
1       1  Positive     1.027932      2800    2850           1
2       2  Positive    26.541206      3100    4000           1
3       3  Negative    -1.117454      2000    2050           1

respatte commented 6 years ago

I just realised that I could use the dataset provided with the package to make a better minimal working example, so I tried that for this issue. Funnily enough, doing so showed me where exactly the problem seems to be:

I hope this helps to identify and fix the issue. Here is the full code that I ran to test all this: eyetrackingR_issue