NDCLab / pepper-pipeline

Python Easy Pre-Processing EEG Reproducible Pipeline
GNU Affero General Public License v3.0

Condition diff & signal to noise metric #325

Open F-said opened 2 years ago

F-said commented 2 years ago

Describe the feature to be tested: Overall data reliability.

Which tests need to be made for the feature

@georgebuzzell @SMoralesPhD @trollerrenfr

SMoralesPhD commented 2 years ago

Apologies for the delay. Attached is an R script that brings in trial-level ERP data, organizes it, and computes condition differences, reliability, and SME. For now, it uses the ERN as an example, but it can be easily adapted to other ERPs. Given that we are not interested in doing reliability by increasing numbers of trials, I think using the recently published splithalf package would be the best option. It is way faster than my clunky functions. Please let me know if you have any questions.


---
title: "PEPPER Reliability"
author: "Santi"
date: "2/14/2021"
output:
  html_document:
    toc: true
    fig_height: 8.5
    fig_width: 12
    css: custom 2.css
editor_options:
  chunk_output_type: console
---

README: This script brings in trial-level ERP data, organizes it, and computes reliability and SME. For now, it uses the ERN as an example, but it can be easily adapted to other ERPs. It was created by Santiago Morales (smoralespam@gmail.com). For SM: this script is based on TOTS_ERN_Reliability.Rmd

Setup

list.of.packages <- c("psych", "zoo", "reshape2", "car", "ggplot2", "R.matlab", "tidyr","dplyr","foreach", "doParallel","effsize")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
# Loading packages
lapply(list.of.packages, require, character.only = TRUE)

# Setting my plotting options
my_opts <- list(theme_classic() + theme(axis.text=element_text(size=14), axis.title=element_text(size=15,face="bold"), legend.title=element_text(size=14,face="bold"), legend.text=element_text(size=13), strip.text.x = element_text(size = 14, face="bold")))
############################################################

# Options for parallel
registerDoParallel(4)  # use multicore, set to the number of our cores
opts <- list(chunkSize=2)
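
As an aside, the worker count is hardcoded to 4 here and again inside each function below; a minimal sketch (not part of the original script) that derives it from the machine instead:

# Hypothetical alternative: size the worker pool from the available cores
n_cores <- max(1, parallel::detectCores() - 1)  # leave one core free
registerDoParallel(n_cores)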

Reliability

ERN

# Creating a list of the matlab files 
tbt_data_path <- "/Volumes/hdshare/Dropboxes/moraless/TOTS/TOTS_12yr_Flanker/bin_200_0/"

matlab_list <- list.files(path = tbt_data_path, pattern = ".mat", ignore.case = T)
matlab_list <- matlab_list[grepl("FCz.mat", matlab_list)]

event_list <- list.files(path = tbt_data_path, pattern = ".csv", ignore.case = T)
event_list <- event_list[!grepl("Trials_per_conditionLong_contextFIXD_ALL", event_list)]

# Looping through each participant, reading in their data, formatting, and creating one dataset. 
# Note, for now, there is no need to worry about channels/clusters because it is only FCz
df <- data.frame()
for (i in 1:length(matlab_list)) { 
    mat_file_name <- matlab_list[i]
    chan <- sapply(strsplit(as.character(mat_file_name),"_"), `[`, 2)
    chan <- gsub(".mat", "", chan)
    id <- paste(sapply(strsplit(as.character(mat_file_name),"_"), `[`, 1), sep = "_")

    event_file_name <- event_list[grep(id, event_list)]
    print(paste0("Now in ", mat_file_name, "; ", event_file_name))

    matlabFile <- readMat(paste0(tbt_data_path,mat_file_name)) # Reading in matlab file
    eventFile <- read.csv(paste0(tbt_data_path,event_file_name), header = T, na.strings = "NaN") # Reading in event file
    df_temp <- as.data.frame(t(unlist(matlabFile$NewEpochs)))
    names(df_temp) <- paste0("v", seq(-500,996, by = 4)) # After renaming maybe I can just use gsub to create a time variable 
    df_temp$trial <- seq(1,nrow(df_temp))
    df_temp <- cbind(eventFile, df_temp)
    df_temp$chan <- chan
    df <- bind_rows(df, df_temp)
}

# Melting the data for exploratory plotting and analyses
# Deleting some variables we do not need
df$latency <- NULL
df$urevent <- NULL
df$duration <- NULL
df$epoch <- NULL

df <- df[, !names(df) %in% c("type", "latency", "value", "duration", "codes", "init_index", 
"init_time", "urevent", "StimType", "PrevBlockFeed", "Block", "Congruency", "prevCongruency", "nextCongruency", "Direction", "prevDirection", "nextDirection", "Responded", "PrevResponded", "NextResponded", "prevAccuracy", "nextAccuracy", 
"RT", "prevRT", "nextRT", "ITI", "prevITI", "nextITI", "Matched", "Bad", "prevBad", "nextBad", "epoch","TrialNum_idx")]

vars <- c("Accuracy", "WhichStudy",  "id", "TrialNum", "trial","chan")
dfm <- melt(df, id=vars) # You should not get a warning here
dfm$time <- as.numeric(as.character(gsub("v", "", dfm$variable))) # You should not get a warning here 

# Limiting the time that I keep 
dfm <- dfm[dfm$time > -100 & dfm$time < 400,]

# Relabelling condition 
dfm$Condition <- ifelse(dfm$Accuracy=="0" & dfm$WhichStudy=="S", "S_Error", ifelse(dfm$Accuracy=="1" & dfm$WhichStudy=="S", "S_Correct", 
                        ifelse(dfm$Accuracy=="0" & dfm$WhichStudy=="NS", "NS_Error", ifelse(dfm$Accuracy=="1" & dfm$WhichStudy=="NS", "NS_Correct","ERROR!"))))
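
The nested ifelse above works, but an equivalent labelling with dplyr::case_when reads more directly; a sketch with the same logic (assuming dplyr is loaded, as in Setup):

# Equivalent condition labels via case_when (same logic as the nested ifelse)
dfm$Condition <- dplyr::case_when(
  dfm$Accuracy == "0" & dfm$WhichStudy == "S"  ~ "S_Error",
  dfm$Accuracy == "1" & dfm$WhichStudy == "S"  ~ "S_Correct",
  dfm$Accuracy == "0" & dfm$WhichStudy == "NS" ~ "NS_Error",
  dfm$Accuracy == "1" & dfm$WhichStudy == "NS" ~ "NS_Correct",
  TRUE ~ "ERROR!"  # fall-through flag, as in the original
)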

Creating plots

# Plotting
dfm.ag <- aggregate(cbind(value) ~ Condition + time + chan, dfm, function(x) mean(x, na.rm=T)) # Creating an averaged dataset

ggplot(dfm.ag[dfm.ag$chan=="FCz",], aes(x =time,y=value)) + xlab("Time") +  geom_rect(aes(xmin=0, xmax=100, ymin=-Inf, ymax=Inf), fill='gray85', alpha=0.01) + 
  geom_line(aes(colour=Condition), size=1) + ggtitle("FCz") + my_opts + theme(legend.position="bottom") + labs(x = "Time", y = expression(Amplitude~~to~~Response~~mu~V))

ggplot(dfm.ag[dfm.ag$chan=="FCz" & grepl("^NS_", dfm.ag$Condition),], aes(x =time,y=value)) + xlab("Time") +  geom_rect(aes(xmin=0, xmax=100, ymin=-Inf, ymax=Inf), fill='gray85', alpha=0.01) + 
  geom_line(aes(colour=Condition), size=1) + my_opts + theme(legend.position="bottom") + labs(x = "Time", y = expression(Amplitude~~to~~Response~~mu~V)) + scale_color_manual("",breaks = c("NS_Error", "NS_Correct"), values=c("red", "blue"), labels = c("Error", "Correct"))

# Plotting with difference score
dfm.ag.dif <- pivot_wider(data = dfm.ag, id_cols = c(time), names_from = c(Condition), values_from = value) %>% 
  mutate(NS_Diff = NS_Error - NS_Correct,
         S_Diff = S_Error - S_Correct) %>%
  melt(., id.vars = c("time"))

ggplot(dfm.ag.dif[grepl("^NS_", dfm.ag.dif$variable),], aes(x =time,y=value)) + xlab("Time") +  geom_rect(aes(xmin=0, xmax=100, ymin=-Inf, ymax=Inf), fill='gray85', alpha=0.01) + geom_line(aes(colour=variable), size=1) + my_opts + theme(legend.position="bottom") + labs(x = "Time Relative to Response (ms)", y = expression(Amplitude~~mu~V)) +
  scale_color_manual("",breaks = c("NS_Error", "NS_Correct", "NS_Diff"), values=c("red", "blue", "black"), labels = c("Error", "Correct","Difference")) 

Extracting

Selecting out the time and channel/cluster of interest

ERN <- dfm[dfm$chan=="FCz" & dfm$time >= 0 & dfm$time <= 100,]
ERN.ag <- aggregate(cbind(value) ~ id + Condition, ERN, function(x) mean(x, na.rm=T)) # For checking if things match to George's
ERN <- aggregate(cbind(value) ~ id + Condition + TrialNum + trial, ERN, function(x) mean(x, na.rm=T))
ERN$Condition <- car::recode(ERN$Condition, " 'Error'='Error_ERN';'Correct'='Correct_ERN'")

dfm.ag <- ERN[ERN$Condition=="NS_Correct" | ERN$Condition=="NS_Error",]

Reliability

These are my clunky functions, but they are pretty intuitive. They compute reliability for increasing numbers of trials as well as overall. The actual functions are run in the next chunk.

# Using parallel computing and only doing reliability, not effect sizes. 
splithalf_trials_parallel <- function(data, n_from, n_to, n_by, n_subsamples) { 
  # E.g., df_r <- splithalf_trials_parallel(dfm.ag, 2, 100, 5, 10)

  registerDoParallel(4)  # use multicore, set to the number of our cores
  opts <- list(chunkSize=2)
  results_df <- foreach(s = seq(n_from, n_to, n_by), .combine='rbind', .options.nws=opts) %:% 
  foreach(i = 1:n_subsamples, .combine='rbind') %dopar% { # Indicating how many subsamples
    # df_temp <- dfm.ag[dfm.ag$id == i, ]  
    seed = sample(1:10000000, 1)
    set.seed(seed)   ## set the seed to make your partition reproducible

    # Loading packages
    list.of.packages <- c("psych", "zoo", "reshape2", "car","taRifx", "ggplot2", "tidyr","dplyr","foreach", "doParallel")       
    lapply(list.of.packages, require, character.only = TRUE) 

    # Before subsampling, I need to delete participants that do not have enough trials
    n_trials <- data %>%
      group_by(cond, id) %>%
      tally() 

    ids.out <- n_trials[n_trials$n <= s,] %>% dplyr::select(cond, id) %>% unite(cond_id, cond, id)

    # # This is similar to the loop above however, I need to create a subsampled dataset first
    df_temp_s <- data %>%
      unite(cond_id, cond, id, remove = F) %>%
      dplyr::filter(!cond_id %in% ids.out$cond_id) %>% 
      group_by(cond, id) %>% 
      sample_n(s)

    # Now I can do the split half on the subsampled dataset
    df_temp <- df_temp_s %>% 
      sample_frac(.50) %>%      # Creating a random half
      mutate(bin = 1) %>%       # Indicating that this is the first half 
      dplyr::select(id, trial,cond, bin) %>%  # Keeping only vars of interest
      right_join(df_temp_s, by = c("id", "trial", "cond")) %>%  # Bringing in the other half of the trials
      mutate(bin = if_else(is.na(bin), 2, 1)) %>%  # Creating the index for the second half of the trials
      group_by(cond, id, bin) %>%    # Grouping by vars of interest
      dplyr::summarise(Mean_amp = mean(value, na.rm = T)) %>% # Getting the mean amplitude by vars of interest
      unite(cond_bin, cond, bin)  %>% # Creating the variable name
      spread(cond_bin, Mean_amp) # Going to wide using our new variable

    # Checking the reliability
    r_temp <- corr.test(df_temp[,-c(1)])$r
    n_temp <- corr.test(df_temp[,-c(1)])$n
    r_temp <- (2*r_temp)/(1 + abs(r_temp)) # SB formula

    conditions_temp <- unique(gsub("(.*)_.*","\\1",row.names(r_temp))) # removing everything after the last "_" and only keeping unique conditions  

    r_temp_df <- data.frame()
    for (j in seq(2,sqrt(length(r_temp)), 2) ) { # Setting up a loop over the number of conditions
      r_temp_df_j <- r_temp[j,j-1]               # getting the index for that condition it is the row number and the column of j - 1
      r_temp_df_j <- data.frame(variable = conditions_temp[j/2], rcoeff = r_temp_df_j) # Getting the name of the condition and it has to be divided by two - This should match the conditions and they are sorted alphabetically! 
      r_temp_df_j$n <- ifelse(length(n_temp) == 1, n_temp, ifelse(length(n_temp) > 1, n_temp[j,j-1], "Error!")) # getting the n for that condition it is the same for all or the row number and the column of j - 1
      r_temp_df_j$seed = seed # Setting the seed 
      r_temp_df_j$n_trials = s # Setting the number that was subsampled by

      r_temp_df <- bind_rows(r_temp_df, r_temp_df_j)

    }

    return(r_temp_df)
  }
}
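
For reference, the Spearman–Brown line inside this function steps the half-test correlation up to the full test length; a tiny worked example with a hypothetical r:

# Worked Spearman–Brown example (hypothetical value, not pipeline output)
r <- 0.5                        # split-half correlation between the two halves
r_sb <- (2 * r) / (1 + abs(r))  # 0.667: estimated full-length reliability
# abs() in the denominator (as above) keeps a negative r bounded in [-1, 1];
# the classic formula is 2r / (1 + r), which blows up as r approaches -1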

# Creating function with difference score
splithalf_trials_diff_parallel <- function(data, n_from, n_to, n_by, n_subsamples) { 
  # E.g., df_r <- splithalf_trials_diff_parallel(dfm.ag, 2, 100, 5, 10)

  registerDoParallel(4)  # use multicore, set to the number of our cores
  opts <- list(chunkSize=2)
  results_df <- foreach(s = seq(n_from, n_to, n_by), .combine='rbind', .options.nws=opts) %:% 
  foreach(i = 1:n_subsamples, .combine='rbind') %dopar% { # Indicating how many subsamples
    # df_temp <- dfm.ag[dfm.ag$id == i, ]  
    seed = sample(1:10000000, 1)
    set.seed(seed)   ## set the seed to make your partition reproducible

    # Loading packages
    list.of.packages <- c("psych", "zoo", "reshape2", "car","taRifx", "ggplot2", "tidyr","dplyr","foreach", "doParallel")       
    lapply(list.of.packages, require, character.only = TRUE) 

    # Before subsampling, I need to delete participants that do not have enough trials
    n_trials <- data %>%
      group_by(cond, id) %>%
      tally() 

    ids.out <- n_trials[n_trials$n <= s,] %>% dplyr::select(cond, id) %>% unite(cond_id, cond, id)

    # # This is similar to the loop above however, I need to create a subsampled dataset first
    df_temp_s <- data %>%
      unite(cond_id, cond, id, remove = F) %>%
      dplyr::filter(!cond_id %in% ids.out$cond_id) %>% 
      group_by(cond, id) %>% 
      sample_n(s)

    # Now I can do the split half on the subsampled dataset
    df_temp <- df_temp_s %>% 
      sample_frac(.50) %>%      # Creating a random half
      mutate(bin = 1) %>%       # Indicating that this is the first half 
      dplyr::select(id, trial,cond, bin) %>%  # Keeping only vars of interest
      right_join(df_temp_s, by = c("id", "trial", "cond")) %>%  # Bringing in the other half of the trials
      mutate(bin = if_else(is.na(bin), 2, 1)) %>%  # Creating the index for the second half of the trials
      group_by(cond, id, bin) %>%    # Grouping by vars of interest
      dplyr::summarise(Mean_amp = mean(value, na.rm = T)) %>% # Getting the mean amplitude by vars of interest
      unite(cond_bin, cond, bin)  %>% # Creating the variable name
      spread(cond_bin, Mean_amp) # Going to wide using our new variable

    # Creating difference scores automatically across conditions because it is hard to do by hand
    dfm_temp <- melt(df_temp, id.vars = "id")
    dfm_temp$half <- sapply(strsplit(as.character(dfm_temp$variable),"_"), `[`, 3) 
    dfm_temp$Condition <- paste0(sapply(strsplit(as.character(dfm_temp$variable),"_"), `[`, 1), "_", sapply(strsplit(as.character(dfm_temp$variable),"_"), `[`, 2))
    df_dif_temp <- pivot_wider(data = dfm_temp, id_cols = c(id, half), names_from = c(Condition), values_from = value) %>% 
      mutate(Diff = NS_Error - NS_Correct) %>%
      melt(., id.vars = c("id", "half")) %>% 
      pivot_wider(id_cols = c(id), names_from = c(variable, half), values_from = value)

    # Checking the reliability
    r_temp <- corr.test(df_dif_temp[,-c(1)])$r
    n_temp <- corr.test(df_dif_temp[,-c(1)])$n
    r_temp <- (2*r_temp)/(1 + abs(r_temp)) # SB formula

    conditions_temp <- unique(gsub("(.*)_.*","\\1",row.names(r_temp))) # removing everything after the last "_" and only keeping unique conditions  

    r_temp_df <- data.frame()
    for (j in seq(2,sqrt(length(r_temp)), 2) ) { # Setting up a loop over the number of conditions
      r_temp_df_j <- r_temp[j,j-1]               # getting the index for that condition it is the row number and the column of j - 1
      r_temp_df_j <- data.frame(variable = conditions_temp[j/2], rcoeff = r_temp_df_j) # Getting the name of the condition and it has to be divided by two - This should match the conditions and they are sorted alphabetically! 
      r_temp_df_j$n <- ifelse(length(n_temp) == 1, n_temp, ifelse(length(n_temp) > 1, n_temp[j,j-1], "Error!")) # getting the n for that condition it is the same for all or the row number and the column of j - 1
      r_temp_df_j$seed = seed # Setting the seed 
      r_temp_df_j$n_trials = s # Setting the number that was subsampled by

      r_temp_df <- bind_rows(r_temp_df, r_temp_df_j)

    }

    return(r_temp_df)
  }
}

# Creating function with difference score
splithalf_overall_diff_parallel <- function(data, n_subsamples) { 
  # E.g., df_r <- splithalf_overall_diff_parallel(dfm.ag, 10)

  registerDoParallel(4)  # use multicore, set to the number of our cores
  opts <- list(chunkSize=2)
  results_df <- foreach(i = 1:n_subsamples, .combine='rbind') %dopar% { # Indicating how many subsamples
    # df_temp <- dfm.ag[dfm.ag$id == i, ]  
    seed = sample(1:10000000, 1)
    set.seed(seed)   ## set the seed to make your partition reproducible

    # Loading packages
    list.of.packages <- c("psych", "zoo", "reshape2", "car","taRifx", "ggplot2", "tidyr","dplyr","foreach", "doParallel")       
    lapply(list.of.packages, require, character.only = TRUE) 

    # Now I can do the split half on the subsampled dataset
    df_temp <- data %>% 
      group_by(cond, id) %>%
      sample_frac(.50) %>%      # Creating a random half
      mutate(bin = 1) %>%       # Indicating that this is the first half 
      dplyr::select(id, trial,cond, bin) %>%  # Keeping only vars of interest
      right_join(data, by = c("id", "trial", "cond")) %>%  # Bringing in the other half of the trials
      mutate(bin = if_else(is.na(bin), 2, 1)) %>%  # Creating the index for the second half of the trials
      group_by(cond, id, bin) %>%    # Grouping by vars of interest
      dplyr::summarise(Mean_amp = mean(value, na.rm = T)) %>% # Getting the mean amplitude by vars of interest
      unite(cond_bin, cond, bin)  %>% # Creating the variable name
      spread(cond_bin, Mean_amp) # Going to wide using our new variable

    # Creating difference scores automatically across conditions because it is hard to do by hand
    dfm_temp <- melt(df_temp, id.vars = "id")
    dfm_temp$half <- sapply(strsplit(as.character(dfm_temp$variable),"_"), `[`, 3) 
    dfm_temp$Condition <- paste0(sapply(strsplit(as.character(dfm_temp$variable),"_"), `[`, 1), "_", sapply(strsplit(as.character(dfm_temp$variable),"_"), `[`, 2))
    df_dif_temp <- pivot_wider(data = dfm_temp, id_cols = c(id, half), names_from = c(Condition), values_from = value) %>% 
      mutate(Diff = NS_Error - NS_Correct) %>%
      melt(., id.vars = c("id", "half")) %>% 
      pivot_wider(id_cols = c(id), names_from = c(variable, half), values_from = value)

    # Checking the reliability
    r_temp <- corr.test(df_dif_temp[,-c(1)])$r
    n_temp <- corr.test(df_dif_temp[,-c(1)])$n
    r_temp <- (2*r_temp)/(1 + abs(r_temp)) # SB formula

    conditions_temp <- unique(gsub("(.*)_.*","\\1",row.names(r_temp))) # removing everything after the last "_" and only keeping unique conditions  

    r_temp_df <- data.frame()
    for (j in seq(2,sqrt(length(r_temp)), 2) ) { # Setting up a loop over the number of conditions
      r_temp_df_j <- r_temp[j,j-1]               # getting the index for that condition it is the row number and the column of j - 1
      r_temp_df_j <- data.frame(variable = conditions_temp[j/2], rcoeff = r_temp_df_j) # Getting the name of the condition and it has to be divided by two - This should match the conditions and they are sorted alphabetically! 
      r_temp_df_j$n <- ifelse(length(n_temp) == 1, n_temp, ifelse(length(n_temp) > 1, n_temp[j,j-1], "Error!")) # getting the n for that condition it is the same for all or the row number and the column of j - 1
      r_temp_df_j$seed = seed # Setting the seed 

      r_temp_df <- bind_rows(r_temp_df, r_temp_df_j)

    }

    return(r_temp_df)
  }
}
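
A design note: the three functions above repeat the same split-half core verbatim; that core could be factored into one helper, sketched here under a hypothetical name and assuming the same cond/id/trial/value columns:

# Hypothetical helper: one random split of trial-level data into halves,
# returning wide per-participant condition-by-half means (logic copied from above)
split_half_once <- function(data) {
  data %>%
    group_by(cond, id) %>%
    sample_frac(0.50) %>%                        # random first half
    mutate(bin = 1) %>%
    dplyr::select(id, trial, cond, bin) %>%
    right_join(data, by = c("id", "trial", "cond")) %>%
    mutate(bin = if_else(is.na(bin), 2, 1)) %>%  # remaining trials = second half
    group_by(cond, id, bin) %>%
    dplyr::summarise(Mean_amp = mean(value, na.rm = TRUE)) %>%
    unite(cond_bin, cond, bin) %>%
    spread(cond_bin, Mean_amp)                   # one column per condition_half
}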

Computing Reliability

dfm.ag$cond <- as.factor(dfm.ag$Condition) # This variable needs to be a factor
system.time({ df_r_overall_rel <- splithalf_overall_diff_parallel(data = dfm.ag, n_subsamples = 3000) }) # Deleting three participants that did not meet good reliability
describeBy(df_r_overall_rel$rcoeff, df_r_overall_rel$variable)

# Here is another way to do it - more efficiently using the splithalf package - it is like 15x-20x faster than my approach 
library(splithalf)
splithalf(data = dfm.ag,
                 outcome = "RT",
                 score = "average",
                 conditionlist = c("NS_Correct", "NS_Error"),
                 halftype = "random",
                 permutations = 5000,
                 var.RT = "value",
                 var.condition = "cond",
                 var.participant = "id",
                 average = "mean")

Computing SME

##### ##### ##### ##### ##### ##### ##### ##### ##### ##### ##### ##### #####
# Computing the Standardized measurement error (SME) based on Luck et al 2021, Psychophysiology
df_SME <- dfm.ag %>%
  group_by(Condition, id) %>%
  dplyr::summarise(SME = sd(value) / sqrt(length(value))) # SD across trials / sqrt(n trials)

yanbin-niu commented 2 years ago

@SMoralesPhD Hey Santi, thanks for posting the R code. I am curious: 1) in terms of computing condition differences and reliability, do you have any reading materials that explain the basic ideas/theories and the algorithms behind your code? 2) when computing SME, does your code do the same thing as ERPLab does? Since I read the paper you posted earlier, it seems they integrated that into the ERPLab plugin.

SMoralesPhD commented 2 years ago

Hello @yanbin-niu, I think probably our paper under review would be a good place to start. The other one would be readings related to the splithalf package. Here are links to these readings. Please let me know if you have any questions. I am happy to meet and discuss. Thanks!

Readings on condition differences and split-half reliability: https://doi.org/10.31234/osf.io/ag9s7

Splithalf readings: https://journals.sagepub.com/doi/full/10.1177/2515245919879695 https://joss.theoj.org/papers/10.21105/joss.03041.pdf

yanbin-niu commented 2 years ago

@SMoralesPhD Thank you!! Those readings are really helpful!

DMRoberts commented 2 years ago

Thanks @SMoralesPhD:

1) From your code and the 2021 Luck et al. paper, it seems that what they call the 'standardized measurement error' (SME) is the standard error of measurement (SEM)?

2) In the splithalf package, I see that there is an option to specify whether the outcome variable represents response time or accuracy. Do you happen to know why this is specified / how the computation differs between the two? The splithalf documentation states the following, but it isn't clear where the process diverges:

What type of data do you have?
Are you interested in response times, or accuracy rates?

Knowing this, you can set outcome = "RT", or outcome = "accuracy"

DMRoberts commented 2 years ago

Nothing final, but here is a gist of a preliminary attempt in Python (for only a single condition for now):

https://gist.github.com/DMRoberts/886a1f87f382b8f7ee25a3f2098367fc

yanbin-niu commented 2 years ago

@SMoralesPhD

Here is the SME computation in the pipeline: https://github.com/NDCLab/pepper-pipeline/blob/bdcc16a2172601c3e83226616cb235b92ce6e5b7/scripts/postprocess/postprocess.py#L72

I am not sure I understand SME correctly. I think it is a list of values for each participant under each condition. It would be great if @SMoralesPhD could help confirm I am on the right track. Thank you!!

SMoralesPhD commented 2 years ago

This is great @yanbin-niu ! I am just looking at your script, so I am not testing it (e.g., not 100% sure if you are pulling the right dimensions, etc). However, the general approach looks right to me! You are correct that SME is basically a set of values per participant for each condition (e.g., for two conditions, we would have two values per participant). Each of those values is just the SD across trials for a given person and condition divided by the square root of the number of trials. Please let me know if that is still not clear. I am happy to walk through it together with data to make sure it is right. Thanks!
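
To make that concrete, here is a minimal sketch of the per-cell SME with simulated numbers (not pipeline data):

# SME for one participant x condition cell, per Luck et al. (2021):
# SD of the single-trial scores divided by sqrt(number of trials)
sme <- function(trial_scores) sd(trial_scores) / sqrt(length(trial_scores))

set.seed(1)
trials <- rnorm(30, mean = -4, sd = 6)  # made-up ERN-like single-trial amplitudes (uV)
sme(trials)  # roughly 6 / sqrt(30), i.e. about 1.1 uV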

yanbin-niu commented 2 years ago

Thank you @SMoralesPhD for helping confirm this! I think @F-said has already run some data on the splithalf reliability and SME and will be working on saving the results into a csv file. I think he will be presenting some results soon !

yanbin-niu commented 2 years ago

Hi @SMoralesPhD, we got some very, very preliminary results for reliability and SME -- do you have any ideas or comments? Thank you!!

Reliability

[image: split-half reliability results]

SME

[image: SME results]

Note, the unit is V rather than uV.

The two tasks we ran were both the Surround Suppression task, just in two blocks. There are 6 participants for block 1 and 4 participants for block 2.

Data were epoched from -1 s to 2 s, stimulus-locked. Baseline: -200 to 0 ms. ERP window: 100 to 150 ms, based on E75 (Oz).

@georgebuzzell @DMRoberts @F-said

SMoralesPhD commented 2 years ago

Hello @yanbin-niu,

I think this is looking good. About the reliability estimates, it is a bit surprising that you are getting a negative correlation, but I do not think we should be concerned because it is only 4 participants. Let's wait until we add more participants. We usually do not run reliability with fewer than 6 participants, and even that is pushing it.
About the SME, it is a bit confusing because the values are in Volts. However, I think this is also looking good. The values I have seen in papers and in some of our data are around 1 uV, but I have seen them as high as 8 for individual participants.

In sum, looks good to me! That correlation will hopefully turn positive as you add more participants. Great to see this moving along! Please let me know if you have any other questions. Thanks!

yanbin-niu commented 2 years ago

@SMoralesPhD Thanks for the quick response! It is great to hear that the values for the first run were not too bad. We will add more people and see how it works! Also, about the unit: that is because MNE uses V. I will change the SME output to uV, given that uV is more commonly used.

  1. @F-said Would you please add more participants, pool blocks 1 & 2 together, or both? That way the next run can have 20+ or 30+ participants (the more the better! 😆). Thank you!

  2. I will fix the unit and other formatting (so we can see which SME comes from whom).