DS4PS / cpp-525-sum-2021

Course shell for CPP 525 Advanced Regression Analysis
http://ds4ps.org/cpp-525-sum-2021/

Lab 05-Q1d #8

Open malmufre opened 3 years ago

malmufre commented 3 years ago

Hello, I am having an issue interpreting Question 1d:

"Compare the mean 8th grade math GPA for students that scored above and below 60 on the 7th grade standardized exam. Is there a statistically significant difference? Note that this data all comes from the pre-treatment period. (5 + 5 points)"

I have answered it as the following:

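# Scatterplot of 7th grade exam score against 8th grade GPA
plot( data$math_7, data$gpa_8 )
# Treatment1 = 1 for students who scored below 60 on the 7th grade exam
data$Treatment1 <- ifelse( data$math_7 < 60, 1, 0 )
# Mean 8th grade GPA for each group
mean( data$gpa_8[ data$Treatment1==1 ] )
mean( data$gpa_8[ data$Treatment1==0 ] )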

I am unsure whether I am approaching this correctly, as the results are:

[1] 1.460703
[1] 2.86946

I assumed that I would be getting a higher number for the treatment group. Thank you!
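
The question also asks whether the difference is statistically significant, which the two means alone do not show. One possible way to check is a two-sample t-test, sketched here. The variable names `math_7`, `gpa_8`, and `Treatment1` come from the post above, but the data frame is simulated and every value in it is an assumption made only so the example runs on its own. It is not the lab dataset.

```r
# Simulated stand-in for the lab data (all values are made up for illustration)
set.seed(525)
n <- 500
math_7 <- round(runif(n, 20, 100))
# Higher exam scores go with higher GPA; GPA is kept within 0-4
gpa_8 <- pmin(4, pmax(0, 0.5 + 0.03 * math_7 + rnorm(n, sd = 0.5)))
data <- data.frame(math_7, gpa_8)

# 1 = scored below 60 on the 7th grade exam, 0 = scored 60 or above
data$Treatment1 <- ifelse(data$math_7 < 60, 1, 0)

# Mean 8th grade GPA by group
group_means <- tapply(data$gpa_8, data$Treatment1, mean)
print(group_means)

# Welch two-sample t-test for the difference in group means
tt <- t.test(gpa_8 ~ Treatment1, data = data)
print(tt)
```

The p-value in `tt$p.value` indicates whether the gap between the two groups is larger than chance alone would plausibly produce.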

lecy commented 3 years ago

Sorry, ignore that.

"Note that this data all comes from the pre-treatment period."

What are your expectations about which group has higher scores in the pre-treatment period?

malmufre commented 3 years ago

The expectation is that the group with higher scores on the standardized math exam at the end of 7th grade will also have a higher GPA in 8th grade. That is expected in the pre-treatment period. I think that is why gpa_8 for the treatment group is lower than gpa_8 for the control group. Is that correct?

lecy commented 3 years ago

The treatment group is often lower in both the pre- and post-treatment periods. Students are in the treatment group because their performance is below average.

This is why the RDD uses a special counterfactual. Otherwise, the comparison of the two groups in the post-treatment period would be misleading.

The important stat is not the final score of kids in the treatment group (their performance will still be worse than the control group's). It is how much they improved.
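
To illustrate the point above, here is a minimal sketch of a regression discontinuity on simulated data. Only `math_7` and the cutoff of 60 come from the thread. The outcome name `gpa_post`, the variables `treated` and `score_c`, the sample size, and the assumed true effect of 0.4 are all hypothetical choices for illustration, and the model shown is one common RDD specification, not necessarily the one the lab expects.

```r
# Simulated data: every number here is an assumption, not the lab dataset
set.seed(525)
n <- 1000
math_7  <- runif(n, 20, 100)
treated <- ifelse(math_7 < 60, 1, 0)   # below the cutoff -> receives treatment
score_c <- math_7 - 60                 # exam score centered at the cutoff
true_effect <- 0.4                     # hypothetical treatment effect
gpa_post <- 0.5 + 0.03 * math_7 + true_effect * treated + rnorm(n, sd = 0.4)
sim <- data.frame(math_7, score_c, treated, gpa_post)

# Naive comparison of group means: the treated group still looks worse
raw_gap <- mean(sim$gpa_post[sim$treated == 1]) -
           mean(sim$gpa_post[sim$treated == 0])
print(raw_gap)

# RDD: separate slopes on each side of the cutoff; the coefficient on
# 'treated' estimates the jump in the outcome at the cutoff
rdd <- lm(gpa_post ~ score_c * treated, data = sim)
jump <- coef(rdd)["treated"]
print(summary(rdd)$coefficients)
```

In this simulation the raw gap is negative even though the treatment helps, while the coefficient on `treated` recovers a positive jump at the cutoff. That is the sense in which comparing the two group means directly would be misleading.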