manybabies / mb1-cdi-followup

ManyBabies 1 Longitudinal CDI Followup
MIT License

Code for calculating ICC #10

Closed kristabh closed 1 year ago

kristabh commented 3 years ago

Some code we might want to use to calculate ICC (from a different project)

library(psych)
library(tidyverse)  # for read_csv() and the dplyr verbs used below

mb1_git <- read_csv("https://github.com/manybabies/mb1-analysis-public/raw/master/processed_data/03_data_diff_main.csv") %>%
  select(lab, subid, diff, stimulus_num, method, age_group) %>%
  mutate(subid = paste(lab, subid, sep = "_"))

write.csv(mb1_git, "mb1.csv")

Read in archived MB1 data

mb1 <- read_csv("mb1.csv")

Data must be in a wide data frame (one row per participant, observations across columns) rather than a long one

mb1_wide <- mb1 %>%
  filter(!is.na(diff)) %>% # removes NA values that were causing problems
  pivot_wider(id_cols = c(lab, subid, method, age_group), names_from = stimulus_num, values_from = diff)

Must remove all factor columns (lab, subid, method, age_group), leaving just the matrix of values needed to compute the ICC

mb1_icc <- mb1_wide %>% select(-lab, -subid, -method, -age_group)


Compute ICC values.

ICC(mb1_icc, missing = FALSE, lmer = TRUE)
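For reference, `psych::ICC()` returns six estimates in its `$results` table (single-rater ICC1–ICC3 and their average-measures versions ICC1k–ICC3k), and the choice of variant matters for interpretation. A minimal sketch on simulated data (the simulated matrix and the choice of ICC2k here are illustrative, not the project's actual data or decision):

```r
library(psych)

# Simulated stand-in for mb1_icc: rows = infants, columns = trial pairs;
# adding a per-row effect induces a positive ICC
set.seed(1)
fake_icc <- matrix(rnorm(50 * 8), nrow = 50) + rnorm(50)

icc_out <- ICC(fake_icc, missing = FALSE, lmer = TRUE)

# $results holds the type, ICC estimate, F test, and confidence bounds
icc_out$results

# e.g. extract the average-measures, random-raters estimate (ICC2k)
subset(icc_out$results, type == "ICC2k")$ICC
```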

angelinetsui commented 3 years ago

@kristabh : I added some code (please see 04.exploratory_analysis.Rmd) to explore how the ICC increases with different exclusion criteria. Please check and see what you think. In general, even if we restrict our sample to infants who have no missing data, the ICC is only 0.20. Really small, but it increases when we use stricter exclusion criteria. I think the results make sense!
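For reference, one way such a sweep over exclusion criteria could be written, as a sketch on simulated data (the real computation lives in 04.exploratory_analysis.Rmd; the thresholds, column layout, and simulated values here are all illustrative):

```r
library(dplyr)
library(purrr)
library(tibble)
library(psych)

# Simulated stand-in for the wide data: 60 infants x 8 trial pairs,
# with some trials missing at random
set.seed(1)
vals <- matrix(rnorm(60 * 8), nrow = 60) + rnorm(60)
vals[sample(length(vals), 100)] <- NA
colnames(vals) <- paste0("trial_", 1:8)
fake_wide <- as_tibble(vals)

# Recompute the ICC after restricting to infants with at least
# `min_trials` non-missing trial pairs
icc_by_criterion <- map_dfr(c(2, 4, 6, 8), function(min_trials) {
  dat <- fake_wide %>% filter(rowSums(!is.na(.)) >= min_trials)
  icc <- ICC(dat, missing = FALSE, lmer = TRUE)
  tibble(min_trials = min_trials,
         n = nrow(dat),
         icc2k = subset(icc$results, type == "ICC2k")$ICC)
})
icc_by_criterion  # stricter criteria -> smaller n
```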

kristabh commented 3 years ago

@angelinetsui I can't seem to find the code you're referring to in 04.exploratory_analysis.Rmd . Did you push to Github?

angelinetsui commented 3 years ago

@kristabh , Hi Krista, I pushed, but the code is now in a pull request since I am not an admin in this repo... I can't merge changes. I am adding Melanie here to see if she can push the changes @melsod

melsod commented 3 years ago

I believe I have now given Angeline admin control. There appear to be merge conflicts on this pull request.

angelinetsui commented 3 years ago

I don't know why, but I was not able to resolve the conflicts. This is really weird :-(

angelinetsui commented 3 years ago

Can you resolve it, Melanie @melsod ?

kristabh commented 3 years ago

I was able to resolve and merge @angelinetsui @melsod

angelinetsui commented 3 years ago

Thanks very much @kristabh

melsod commented 3 years ago

Thanks Krista!!!!


kristabh commented 3 years ago

Just pushed more code with a power analysis based on the reliabilities. The highest reliability we get is with the strictest inclusion criterion (.22), but the sample size drops a lot (from n = 505 to n = 158), and these have opposing effects on power. Once I ran the power analysis, it turned out that we get the most power with the loosest inclusion criterion, because of the gains in sample size. With 80% power, we can detect true correlations as low as .369 (assuming the CDI has a reliability of .86) or .351 (if the CDI has a reliability of .95 - this is based on the literature). But the observed correlations will be much smaller due to attenuation from unreliability: at these true correlations, the observed correlation would be only about .12, though it would still be statistically significant.
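The attenuation arithmetic can be sketched in a few lines (the specific reliability pair plugged in here is illustrative and won't exactly reproduce the .12 above; the `pwr::pwr.r.test()` call is one standard way to do the power side, not necessarily the exact analysis in the repo):

```r
library(pwr)

# Attenuation: observed correlation = true correlation scaled by the
# square root of the product of the two measures' reliabilities
rel_mb1 <- 0.20   # IDS-preference reliability (loosest criterion)
rel_cdi <- 0.86   # CDI reliability (literature value quoted above)
r_true  <- 0.369

r_observed <- r_true * sqrt(rel_mb1 * rel_cdi)
r_observed  # small even when the true correlation is moderate

# Smallest correlation detectable with 80% power at n = 505
# (solves for r, roughly 0.12 at this sample size)
pwr.r.test(n = 505, power = 0.80, sig.level = 0.05)
```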

@angelinetsui please double check if you could. @melsod you might find these results interesting.

melsod commented 2 years ago

@kristabh I am pondering how best/if to incorporate this into the write-up (finally trying to pull all the threads together as Luis has finished his exploratory analysis with the non-English data). Do you think it would make sense to include this as an actual analysis in the Results? Or just refer to it in the Discussion? I think we were leaning toward the latter last time we talked about this.

kristabh commented 2 years ago

I just looked at the manuscript, and I would lean towards including it in the discussion rather than in the results: the other results are pre-registered and motivated in the intro, whereas this one is post hoc and is more of an explanation for the observed results. I would want to include some values, though, so if you felt uncomfortable including numbers in the discussion then it might need to go in the results.

melsod commented 1 year ago

@kristabh I don't think we ever really addressed this directly in the Discussion? We reference your 2022 paper in the Discussion, and the sensitivity analysis is reported in the supps. Just wanted to make sure you didn't want to address this more directly in the Discussion. I think it's OK as it is, but there's room for a sentence or two more directly referencing this if you want to add them.

kristabh commented 1 year ago

Just read over the discussion and I think that the paragraphs on pp. 29-30 do an excellent job! I don't have anything further to add.