ASKurz / Experimental-design-and-the-GLMM


posttest-only control group designs #3

Open ASKurz opened 2 years ago

ASKurz commented 2 years ago

Please leave suggestions for studies using a posttest-only control group design. Variants are welcome too, such as designs comparing two active conditions with a non-treatment control, or designs substituting treatment-as-usual (TAU) for a non-treatment control.

ASKurz commented 2 years ago

Consider Bond and Ellis (2013; https://doi.org/10.1111/ssm.12021), The effects of metacognitive reflective assessment on fifth and sixth graders' mathematics achievement. Here's the abstract:

The purpose of this experimental study was to investigate the effects of metacognitive reflective assessment instruction on student achievement in mathematics. The study compared the performance of 141 students who practiced reflective assessment strategies with students who did not. A posttest-only control group design was employed, and results were analyzed by conducting one-way analysis of variance (ANOVA) and nonparametric procedures. On both a posttest and a retention test, students who practiced reflective strategies performed significantly higher than students who did not use the strategies. A within-subjects ANOVA was also conducted six weeks following the intervention to assess how the factor of time affected retention levels. No significant difference was found between the posttest and retention test results for the experimental groups or the control group.

The authors have a 3-group design and two posttest assessment periods. Their primary outcome measure (test scores) gives a nice opportunity to walk through various strategies with the binomial likelihood.
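If the test scores can be expressed as the number of items each student answered correctly, one way to set this up is an aggregated binomial model with brms. Here's a minimal sketch, assuming a hypothetical long-format data frame `d`; all column names are stand-ins, not the authors' variables.

```r
library(brms)

# Hypothetical long data `d`, one row per student per assessment:
#   score   - number of test items answered correctly
#   n_items - total number of test items
#   group   - experimental condition (3 levels)
#   time    - posttest vs. retention test
#   id      - student identifier

fit1 <- brm(
  data = d,
  family = binomial,
  # cell-means coding for the group-by-time combinations,
  # with a random intercept to handle the repeated assessments
  score | trials(n_items) ~ 0 + group:time + (1 | id),
  cores = 4, seed = 1
)
```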

ASKurz commented 2 years ago

Consider Wagenmakers et al. (2016; https://doi.org/10.1177/1745691616674458), Registered replication report: Strack, Martin, & Stepper (1988). Here's the abstract:

According to the facial feedback hypothesis, people’s affective responses can be influenced by their own facial expression (e.g., smiling, pouting), even when their expression did not result from their emotional experiences. For example, Strack, Martin, and Stepper (1988) instructed participants to rate the funniness of cartoons using a pen that they held in their mouth. In line with the facial feedback hypothesis, when participants held the pen with their teeth (inducing a “smile”), they rated the cartoons as funnier than when they held the pen with their lips (inducing a “pout”). This seminal study of the facial feedback hypothesis has not been replicated directly. This Registered Replication Report describes the results of 17 independent direct replications of Study 1 from Strack et al. (1988), all of which followed the same vetted protocol. A meta-analysis of these studies examined the difference in funniness ratings between the “smile” and “pout” conditions. The original Strack et al. (1988) study reported a rating difference of 0.82 units on a 10-point Likert scale. Our meta-analysis revealed a rating difference of 0.03 units with a 95% confidence interval ranging from −0.11 to 0.16. (emphasis in the original)

The authors used a 2-group ("pout" and "smile") design across 17 labs. Each lab randomized about 150 persons into one of the two groups, for a total sample size of about 2,000 (depending on the exclusion criteria). The dependent variable is ratings of four cartoons on a 0-9 Likert-type scale. Given the nesting and the approximate distributions of the data, these data make a nice opportunity for MELSMs using either the Gaussian or cumulative-probit likelihood (ideally both). You could fit a large 3-level model to analyze the data from all 17 labs, or perhaps just analyze the data from one of the labs in this chapter and save the full 17-lab analysis for a later chapter on large-scale multisite replication designs.

As a bonus, Wagenmakers and colleagues made their data, code, and other materials available on the OSF at https://osf.io/pkd65/.
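To make the MELSM idea concrete, here's a minimal brms-style sketch for the data from a single lab, fit with the Gaussian likelihood. The data frame `d` and its column names are hypothetical stand-ins for however one wrangles the OSF files.

```r
library(brms)

# Hypothetical long data `d` from one lab, one row per participant per cartoon:
#   rating    - funniness rating (0-9)
#   condition - "pout" vs. "smile"
#   id        - participant identifier
#   cartoon   - cartoon identifier

# Gaussian MELSM: submodels for both the mean and the residual SD
fit2 <- brm(
  data = d,
  family = gaussian,
  bf(rating ~ 0 + condition + (1 | id) + (1 | cartoon),
     sigma  ~ 0 + condition + (1 | id) + (1 | cartoon)),
  cores = 4, seed = 2
)
```

The cumulative-probit analogue would swap in `family = cumulative(probit)` and model the `disc` parameter in place of `sigma`, and the full 17-lab version would add `(1 | lab)` terms to make the model 3-level.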

ASKurz commented 2 years ago

Consider Sarafoglou et al. (2023; https://doi.org/10.1177/25152459221128319), Comparing analysis blinding with preregistration in the Many-Analysts Religion Project. Here's the abstract:

In psychology, preregistration is the most widely used method to ensure the confirmatory status of analyses. However, the method has disadvantages: Not only is it perceived as effortful and time-consuming, but reasonable deviations from the analysis plan demote the status of the study to exploratory. An alternative to preregistration is analysis blinding, in which researchers develop their analysis on an altered version of the data. In this experimental study, we compare the reported efficiency and convenience of the two methods in the context of the Many-Analysts Religion Project. In this project, 120 teams answered the same research questions on the same data set, either preregistering their analysis (n = 61) or using analysis blinding (n = 59). Our results provide strong evidence (Bayes factor [BF] = 71.40) for the hypothesis that analysis blinding leads to fewer deviations from the analysis plan, and if teams deviated, they did so on fewer aspects. Contrary to our hypothesis, we found strong evidence (BF = 13.19) that both methods required approximately the same amount of time. Finally, we found no and moderate evidence on whether analysis blinding was perceived as less effortful and frustrating, respectively. We conclude that analysis blinding does not mean less work, but researchers can still benefit from the method because they can plan more appropriate analyses from which they deviate less frequently.

The authors have a 2-group design (n = 61 and n = 59). Their preregistration lives at https://osf.io/2cdht, and the files and data for their paper live on the OSF at https://osf.io/gkxqy/files/osfstorage. Their Hypothesis 4 compares the number of analytic deviations from each research team's planned analysis, and that number is a nice zero-inflated count. Sarafoglou and colleagues proposed a zero-inflated Poisson model for those data, and that approach works well. This is a very rare example of open zero-inflated count data in psychology.
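For a sense of what that might look like in brms, here's a minimal zero-inflated Poisson sketch with condition effects on both the count and zero-inflation parts. The data frame `d` and its column names are hypothetical stand-ins for the OSF files.

```r
library(brms)

# Hypothetical data `d`, one row per analysis team:
#   deviations - number of deviations from the planned analysis
#   group      - preregistration vs. analysis blinding

fit3 <- brm(
  data = d,
  family = zero_inflated_poisson,
  # allow the groups to differ on both the Poisson mean (log link)
  # and the zero-inflation probability (logit link)
  bf(deviations ~ 0 + group,
     zi ~ 0 + group),
  cores = 4, seed = 3
)
```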