Closed kristabh closed 1 year ago
@kristabh : I added some code (please see 04.exploratory_analysis.Rmd) to explore how the ICC increases with different exclusion criteria. Please check and see what you think. In general, even when we restrict our sample to infants who have no missing data, the ICC is only 0.20. That is really small, but it increases when we use stricter exclusion criteria. I think the results make sense!
@angelinetsui I can't seem to find the code you're referring to in 04.exploratory_analysis.Rmd . Did you push to Github?
@kristabh , Hi Krista, I pushed, but the code is now in a pull request because I am not an admin in this repo, so I can't merge changes. I am adding Melanie here to see if she can push the changes @melsod
I believe I have now given Angeline admin control. There appear to be merge conflicts on this pull request.
I don't know why, but I was not able to resolve the conflicts. This is really weird :-(
Can you resolve it, Melanie @melsod ?
I was able to resolve and merge @angelinetsui @melsod
Thanks very much @kristabh
Thanks Krista!!!!
Just pushed more code with a power analysis based on the reliabilities. The highest reliability we get is with the strictest inclusion criterion (.22), but sample size drops a lot (from n = 505 to n = 158). Higher reliability and smaller sample size pull power in opposite directions. Once I ran the power analysis, it seems that we get the most power with the loosest inclusion criterion, because of the gains in sample size. With 80% power, we can detect true correlations as low as .369 (assuming the CDI has a reliability of .86) or .351 (if the CDI has a reliability of .95; this is based on the literature). But the observed correlations will be really small because unreliability attenuates them. So at these true correlations, the observed correlation would be only .12, but this would be statistically significant.
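To make the trade-off concrete, here is a minimal base-R sketch of the two pieces involved: the smallest observed correlation detectable at a given power, and Spearman's correction for attenuation. It is not the code that was pushed. The function names are hypothetical, and the Fisher z approximation is an assumption, since the thread does not say which power method the analysis uses. The only inputs taken from the thread are n = 505, n = 158, the ICC of .22 for the strictest criterion, and the CDI reliability of .86.

```r
# Smallest observed correlation detectable with the given power (two-tailed),
# using the Fisher z approximation. ASSUMPTION: the pushed power analysis
# may use a different method (e.g. an exact test).
min_detectable_r <- function(n, power = 0.80, alpha = 0.05) {
  z <- (qnorm(1 - alpha / 2) + qnorm(power)) / sqrt(n - 3)
  tanh(z)
}

# Spearman's attenuation formula: observed r = true r * sqrt(rel_x * rel_y)
attenuate <- function(true_r, rel_x, rel_y) true_r * sqrt(rel_x * rel_y)

# Inverse: the true correlation implied by an observed one
disattenuate <- function(obs_r, rel_x, rel_y) obs_r / sqrt(rel_x * rel_y)

# Loosest criterion (n = 505): smallest detectable observed correlation
obs_loose <- min_detectable_r(505)

# Strictest criterion (n = 158, ICC = .22), with CDI reliability .86
obs_strict <- min_detectable_r(158)
true_strict <- disattenuate(obs_strict, rel_x = 0.22, rel_y = 0.86)

print(round(c(obs_loose = obs_loose,
              obs_strict = obs_strict,
              true_strict = true_strict), 3))
```

Under these assumptions, `obs_loose` rounds to the .12 reported above, while the strictest criterion would need a true correlation of roughly .51, which is why the larger sample wins despite its lower reliability. The reliability for the loosest criterion is not stated in this thread, so it would have to be filled in from the analysis before computing the matching true correlation.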
@angelinetsui please double check if you could. @melsod you might find these results interesting.
@kristabh I am pondering how best/if to incorporate this into the write-up (finally trying to pull all the threads together as Luis has finished his exploratory analysis with the non-English data). Do you think it would make sense to include this as an actual analysis in the Results? Or just refer to it in the Discussion? I think we were leaning toward the latter last time we talked about this.
I just looked at the manuscript, and I would lean towards including it in the discussion rather than in the results. The other results are pre-registered and motivated in the intro, whereas this one is post hoc and is more of an explanation for the observed results. I would want to include some values, though, so if you felt uncomfortable including numbers in the discussion then it might need to go in the results.
@kristabh I don't think we ever really addressed this directly in the Discussion? We reference your 2022 paper in the Discussion, and the sensitivity analysis is reported in the supps. Just wanted to make sure you didn't want to address this more directly in the Discussion. I think it's OK as it is, but there's room for more if you want to add a sentence or two more directly referencing this.
Just read over the discussion and I think that the paragraphs on pp. 29-30 do an excellent job! I don't have anything further to add.
Some code we might want to use to calculate ICC (from a different project)
library(tidyverse) # read_csv, %>%, select, mutate, filter, pivot_wider
library(psych)     # ICC

# Download the public MB1 data and archive a local copy
mb1_git <- read_csv("https://github.com/manybabies/mb1-analysis-public/raw/master/processed_data/03_data_diff_main.csv") %>%
  select(lab, subid, diff, stimulus_num, method, age_group) %>%
  mutate(subid = paste(lab, subid, sep = "_"))
write.csv(mb1_git, "mb1.csv")

# Read in archived MB1 data
mb1 <- read_csv("mb1.csv")

# Must have data in wide dataframe (one row per participant, observations across columns), rather than long dataframe
mb1_wide <- mb1 %>%
  filter(!is.na(diff)) %>% # Removes NA values that were causing problems
  pivot_wider(id_cols = c(lab, subid, method, age_group), names_from = stimulus_num, values_from = diff)

# Must remove all factor columns (lab, subid, method, age_group), leaving just matrix of values needed to compute ICC
mb1_icc <- mb1_wide %>% select(-lab, -subid, -method, -age_group)

# Compute ICC values.
ICC(mb1_icc, missing = FALSE, lmer = TRUE)
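For intuition about what the ICC call above is estimating, here is a small base-R sketch of the one-way, single-measure ICC (often written ICC(1)) on a made-up wide matrix. The toy values and the helper name `icc1` are hypothetical. The thread does not say which ICC type was used for the reported reliabilities, and this balanced-data formula only handles complete cases.

```r
# Toy wide matrix: rows = infants, columns = stimulus pairs (made-up values)
toy <- matrix(c(1, 2, 3,
                2, 3, 4,
                5, 6, 7,
                6, 7, 8), nrow = 4, byrow = TRUE)

# One-way random-effects, single-measure ICC from ANOVA mean squares.
# ASSUMPTION: balanced data, so rows with any NA are dropped first.
icc1 <- function(x) {
  x <- x[complete.cases(x), , drop = FALSE]
  n <- nrow(x) # participants
  k <- ncol(x) # observations per participant
  row_means <- rowMeans(x)
  msb <- k * sum((row_means - mean(x))^2) / (n - 1) # between-participant mean square
  msw <- sum((x - row_means)^2) / (n * (k - 1))     # within-participant mean square
  (msb - msw) / (msb + (k - 1) * msw)
}

print(icc1(toy))
```

The ICC is high in this toy example because the rows differ far more from each other than the values within a row do. The low values discussed in this thread correspond to the opposite situation, where within-infant variability across stimulus pairs dominates.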