MUCollective / multiverse

R package for creating explorable multiverse analysis
https://mucollective.github.io/multiverse/
GNU General Public License v3.0

durante data #111

Closed ntaback closed 2 years ago

ntaback commented 2 years ago
one_universe <- durante %>%
  mutate( ComputedCycleLength = StartDateofLastPeriod - StartDateofPeriodBeforeLast, 
          NextMenstrualOnset = StartDateofLastPeriod + ComputedCycleLength,
          CycleDay = 28 - (NextMenstrualOnset - DateTesting),
          Relationship = factor(ifelse(Relationship==1 | Relationship==2, "Single", "Relationship")),
          Fertility = factor(ifelse(CycleDay >= 7 & CycleDay <= 14, "high", 
                                     ifelse(CycleDay >= 17 & CycleDay <= 25, "low", NA))),
          RelComp = round((Rel1 + Rel2 + Rel3)/3, 2))

one_universe %>% group_by(Fertility) %>% count()

This gives me n = 130 in the high-fertility group and n = 171 in the low-fertility group. But according to Durante et al. it should be n = 131 and n = 172, so of course I can't reproduce the same F-value, etc.

abhsarma commented 2 years ago

This seems to be the result of missing data in the dataset used by Steegen et al. (lines 126-131 here):

as described in the Supplemental Material, for two participants, we did not manage to recover the value of Cycle Day. When Cycle Day is determined based on nmo1, we adopt the Cycle Day value from the original data file to ensure that the results of our single data set analysis are identical to the single data set analysis in Durante et al. (2013)

I had missed this in the vignettes. The fix is to add the line CycleDay = ifelse(WorkerID == 15, 11, ifelse(WorkerID == 16, 18, CycleDay)) inside the mutate() call, after the definition of CycleDay.
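A minimal sketch of where that override sits relative to the Fertility definition. It uses a small hypothetical data frame (toy) rather than the real durante data, and it assumes the two unrecoverable CycleDay values show up as NA; only the WorkerID values (15, 16) and the replacement values (11, 18) come from the line quoted above.

```r
library(dplyr)

# Hypothetical toy data, NOT the real durante dataset.
# Workers 15 and 16 stand in for the two participants whose Cycle Day
# could not be recovered (assumed here to be NA).
toy <- tibble(
  WorkerID = c(14, 15, 16, 17),
  CycleDay = c(9, NA, NA, 20)
)

patched <- toy %>%
  mutate(
    # Override the two unrecoverable values; must come before Fertility
    CycleDay = ifelse(WorkerID == 15, 11,
                      ifelse(WorkerID == 16, 18, CycleDay)),
    # Same fertility windows as in the original post
    Fertility = factor(ifelse(CycleDay >= 7 & CycleDay <= 14, "high",
                              ifelse(CycleDay >= 17 & CycleDay <= 25, "low", NA)))
  )

# Count participants per fertility group
patched %>% count(Fertility)
```

Because 11 falls in the high-fertility window (7 to 14) and 18 falls in the low-fertility window (17 to 25), the override adds one participant to each group, which is consistent with the counts moving from 130 and 171 to 131 and 172.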

I'll update the vignettes and documentation. Thanks for pointing this out!