Watts-College / cpp-524-fall-2021

https://watts-college.github.io/cpp-524-fall-2021/

Lab 1 Question 7 #5

Open lecy opened 2 years ago

lecy commented 2 years ago

Hello professor,

I have a question regarding Question 7 of Lab 1. I would like to know if the answer we are expected to provide should be only related to how study participants were assigned to groups T0 through T4 or about the entire process of group assignment.

If we look only at how study participants were assigned to groups T0 through T4, it is an RCT. However, if we consider the entire process, it is not.

Please find below my detailed answer to the question:

"Based on the process followed for the assignment of study participants, we can say that this is not a pure RCT. The children that were selected for the treatment were chosen on the basis of the following criteria: the lowest weight and height for their age, the highest number of clinical signs of malnutrition, and the lowest per capita income. After choosing the treatment group, a 2 square kilometer area was subdivided into 20 sectors, with 13 to 19 students occupying each sector. The 20 sectors were ranked based on the children’s height and weight for age and the per capita family income. Thereafter, the first 5 sectors were randomly assigned to one of the five groups and the same procedure was applied to the 3 sets that were left. The fact that the children were selected and assigned to the groups does not make the process a pure RCT."

lecy commented 2 years ago

It's a good question and I do not think it is described especially well in the chapter.

But my interpretation is that they are using the weight and height criteria to identify the study population, and then describing how they assign all participants to groups.

The language is confusing because all of the 20 sectors are assigned to treatment groups, which makes it sound like only those with low height, weight, and income are in the treatment group and thus the control group must be kids with better nutrition and higher income, right?

Not exactly, because of the study design. All study participants are in both the treatment and control groups. They just receive different levels of the treatment (number of periods in the treatment group).

Also, you are correct that kids are not randomly assigned to groups in the study. Blocks (the 20 sectors within the 2 square kilometer area), however, are randomly assigned. It is done incrementally by tranches (the five lowest-ranked blocks are assigned first, then the next five, and so on), but that is to ensure balance across the groups since 20 is a small number of blocks.
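To make the tranche idea concrete, here is a minimal sketch in R of how such an assignment could be carried out. It is my reading of the description, not the study's actual procedure: the sector ids (S1 to S20), the assumption that sectors are already sorted by rank, and the seed are all made up for illustration.

```r
# Sketch of tranche-based random assignment (an illustration, not the study's code).
# Assumes 20 sectors already ranked from worst-off (S1) to best-off (S20).
set.seed( 123 )   # hypothetical seed, only so the sketch is reproducible

sector  <- paste0( "S", 1:20 )               # hypothetical sector ids, in rank order
groups  <- c( "T0", "T1", "T2", "T3", "T4" )
tranche <- rep( 1:4, each=5 )                # ranks 1-5, 6-10, 11-15, 16-20

# within each tranche, shuffle the five group labels across the five sectors
study.group <- character( 20 )
for( t in 1:4 )
{
  study.group[ tranche == t ] <- sample( groups, 5 )
}

assignment <- data.frame( sector, tranche, study.group )

# every group ends up with exactly one sector from each tranche
table( assignment$tranche, assignment$study.group )
```

Because each group draws one sector from every tranche, no group can end up with only the worst-off or only the best-off sectors, which simple random assignment of 20 blocks cannot guarantee.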

Is it perfect? No. Is it an RCT? Yes.

You will find that most RCTs and field experiments balance trade-offs between what is perfect and what is feasible. The hard part is deciding for yourself when the design introduces problems that are significant enough that you would distrust the results.

This study utilizes a reasonable approach given the sorts of implementation challenges they face.

lecy commented 2 years ago

Here's a simple experiment to see the reasoning for the use of tranches in the assignment process.

Here Y is the outcome (think of it as a standardized test score percentile in the pre-treatment period). Kids are grouped into 20 different blocks (geographic units), and the blocks are then assigned to study groups. The study used 5 groups of 4 blocks each; to keep things simple, the code below uses 4 groups of 5 blocks each.

Next week you will learn about happy randomization - the test of whether random assignment has achieved balanced groups. The larger your sample size, the more likely any random assignment will achieve group balance. When you have small groups, however, you need to check this assumption.

In this case, when we do not use tranches, look at how unbalanced the groups can be:

# group traits prior to the study (they should all be equivalent)
study.group   ave.y
group1          30
group2          52
group3          60
group4          60
> library( dplyr )
> y <- 1:100
> block <- rep( LETTERS[1:20], each=5 )
> 
> # randomize order of blocks
> x <- sample( LETTERS[1:20], 20 )
> 
> study.group <- NULL
> study.group[ block %in% x[1:5] ]   <- "group1"
> study.group[ block %in% x[6:10] ]  <- "group2"
> study.group[ block %in% x[11:15] ] <- "group3"
> study.group[ block %in% x[16:20] ] <- "group4"
> 
> d <- data.frame( block, y, study.group )
> head( d, 20 )
   block  y study.group
1      A  1      group3
2      A  2      group3
3      A  3      group3
4      A  4      group3
5      A  5      group3
6      B  6      group2
7      B  7      group2
8      B  8      group2
9      B  9      group2
10     B 10      group2
11     C 11      group1
12     C 12      group1
13     C 13      group1
14     C 14      group1
15     C 15      group1
16     D 16      group1
17     D 17      group1
18     D 18      group1
19     D 19      group1
20     D 20      group1
> 
> d %>% group_by( study.group ) %>% summarize( ave.y=mean(y) )
# A tibble: 4 x 2
  study.group   ave.y
  <chr>       <dbl>
1 group1         30
2 group2         52
3 group3         60
4 group4         60
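
For comparison, here is a rough sketch of the same toy experiment with and without tranches, repeated many times. The function names (assign_simple, assign_tranche, spread), the seed, and the choice of 1000 repetitions are my own for illustration. It assumes the blocks A through T are already ordered from lowest to highest Y, which is true of the toy data above, so each consecutive set of 4 blocks forms a tranche.

```r
# Toy data from the example above: 100 kids, 20 blocks of 5 kids, y = 1..100
y      <- 1:100
block  <- rep( LETTERS[1:20], each=5 )
groups <- paste0( "group", 1:4 )

# simple assignment: shuffle all 20 blocks, then cut into 4 groups of 5 blocks
assign_simple <- function()
{
  g <- rep( groups, each=5 )
  names( g ) <- sample( LETTERS[1:20], 20 )
  unname( g[ block ] )          # look up each kid's group by block letter
}

# tranche assignment: 5 tranches of 4 consecutive blocks,
# with the 4 group labels shuffled within each tranche
assign_tranche <- function()
{
  g <- unlist( lapply( 1:5, function(i) sample( groups, 4 ) ) )
  names( g ) <- LETTERS[1:20]
  unname( g[ block ] )
}

# imbalance measure: gap between the highest and lowest group mean of y
spread <- function( study.group )
{
  diff( range( tapply( y, study.group, mean ) ) )
}

set.seed( 524 )   # hypothetical seed, only for reproducibility
simple.spread  <- replicate( 1000, spread( assign_simple() ) )
tranche.spread <- replicate( 1000, spread( assign_tranche() ) )

summary( simple.spread )
summary( tranche.spread )
```

In this toy setup the gap between group means under tranche assignment can never exceed 15 points, because every group gets one block from each tranche (the best case is a group mean of 58, the worst 43). Simple assignment has no such ceiling, which is how you end up with a group averaging 30 next to groups averaging 60.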