Watts-College / cpp-524-fall-2021

https://watts-college.github.io/cpp-524-fall-2021/

week-2 material questions #8

Open IemanHamid opened 2 years ago

IemanHamid commented 2 years ago

image

If we take one category at a time, e.g. "entrepreneurial", the group that took a loan and the group that didn't are not identical: there were reasons for them to choose to take a loan. Put another way, if we randomly gave loans to half of the entrepreneurial group, the loan would be effective, and the same goes for the other category, "not entrepreneurial". Can we say that the loan has an impact if we look at it this way?

danafuller commented 2 years ago

@lemanHamid I think the whole point of this example was to flush out the bias and show what impact it has. It seems to me that what you suggest introduces a hypothetical that is materially different, in that it potentially removes the bias. I don't know if that helped you any....

lecy commented 2 years ago

The example was meant to show that when selection is present (people choose whether they belong to the treatment or control group rather than being assigned by the evaluator), we can't interpret the results at face value. For example, the treatment group here (takes a loan) ends up with higher income than the control group (does not take a loan). We are tempted to conclude that the program was effective as a result:

image

However, what the tables are showing is that the results are driven by sorting or selection, not by the loans changing conditions materially.

image

If we convert the table to a pre-post diagram the lack of impact will be a little more evident:

image

And if the program was effective:

image

This is actually a very good example for Lab 2 because it shows the importance of the balance criteria when using certain estimators. The calculations above are an example of the POST-TEST ONLY estimator (T2-C2). This estimator requires that the groups be balanced, or statistically equivalent, prior to the intervention, which is not the case here.

If we use the DIFF-IN-DIFF estimator ( [T2-T1] - [C2-C1] ) then it doesn't matter if the groups are not identical in the pre-treatment period. As long as the two groups would have followed the same trend without the program, it will still capture the program impact correctly. Note that we need pre-treatment data for this estimator, which is often the challenge. It's more robust but also more data-intensive.
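To make the contrast between the two estimators concrete, here is a minimal Python sketch. The group means are hypothetical numbers chosen for illustration (they are not the values from the tables in the images), and the function names are my own. The only thing taken from the discussion is the two formulas, T2-C2 and [T2-T1] - [C2-C1].

```python
def post_test_only(t2, c2):
    """POST-TEST ONLY estimator: T2 - C2."""
    return t2 - c2


def diff_in_diff(t1, t2, c1, c2):
    """DIFF-IN-DIFF estimator: [T2 - T1] - [C2 - C1]."""
    return (t2 - t1) - (c2 - c1)


# Hypothetical mean incomes (assumed for illustration only).
# Loan takers start out with higher income because of self-selection.
no_effect = {"t1": 150, "t2": 150, "c1": 100, "c2": 100}    # loan changes nothing
real_effect = {"t1": 150, "t2": 170, "c1": 100, "c2": 100}  # loan adds 20

for label, m in [("no effect", no_effect), ("real effect", real_effect)]:
    print(
        label,
        "| post-test only:", post_test_only(m["t2"], m["c2"]),
        "| diff-in-diff:", diff_in_diff(**m),
    )
```

In the "no effect" scenario the post-test only estimator reports 50 even though the loan did nothing, because it picks up the pre-existing gap between the groups. Diff-in-diff reports 0 there and 20 in the "real effect" scenario, since the pre-treatment gap is subtracted out.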

> If we take one category at a time, e.g. if we randomly give a loan to half of the entrepreneurial group and the other half gets no loan, can we say that the loan has an impact if we look at it this way?

Yes, what you are describing is randomization. We randomize because it breaks the correlation between traits of people and the study group they belong to. In other words, it resolves the selection problem.

Can we say the loan has an impact in that case? It obviously depends on what the data would show after observing outcomes in randomized treatment and control groups. The story here should be the same: if the loan is not improving income, then we would have balanced treatment and control groups, which would allow us to more definitively demonstrate the lack of impact.

More importantly, the main point is developing intuition for when you can interpret differences in the study groups as program impact or a program "effect", versus when differences arise because of lack of study group equivalence and thus they capture selection and not impact.

lecy commented 2 years ago

> If we take one category at a time, e.g. if we randomly give a loan to half of the entrepreneurial group and the other half gets no loan, can we say that the loan has an impact if we look at it this way?

The other important insight is that "entrepreneurial capacity" is a latent construct like IQ or athleticism that is best measured during performance of tasks. Most programs would have no idea which clients would have high capacity beforehand so it would be hard to assign people to groups on that trait (as opposed to using something like age or gender).

That's the real advantage of randomization: if people are assigned randomly then, in expectation, we end up with an equal proportion of entrepreneurial individuals in the treatment and control groups, even when we can't measure or observe that trait explicitly.
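A small simulation can illustrate this. The sketch below is only a toy model under assumed parameters: 30% of people are entrepreneurial, and under self-selection entrepreneurial people take a loan with probability 0.8 versus 0.2 for everyone else. None of these numbers come from the course example. It compares how unevenly the hidden trait is spread across study groups under self-selection versus coin-flip assignment.

```python
import random


def share(flags):
    """Proportion of True values in a list of booleans."""
    return sum(flags) / len(flags)


def balance_gap(n=10_000, p_trait=0.3, seed=524):
    """Return (gap under self-selection, gap under random assignment).

    The gap is the share of entrepreneurial people in the treatment
    group minus the share in the control group.
    """
    rng = random.Random(seed)
    # Unobserved trait: True means "entrepreneurial" (assumed 30% of people).
    trait = [rng.random() < p_trait for _ in range(n)]

    # Self-selection: the trait drives the choice to take a loan (assumed odds).
    takes_loan = [rng.random() < (0.8 if e else 0.2) for e in trait]
    # Random assignment: a coin flip that ignores the trait entirely.
    assigned = [rng.random() < 0.5 for _ in trait]

    def gap(in_treatment):
        treated = [e for e, t in zip(trait, in_treatment) if t]
        control = [e for e, t in zip(trait, in_treatment) if not t]
        return share(treated) - share(control)

    return gap(takes_loan), gap(assigned)


selection_gap, random_gap = balance_gap()
print(f"self-selection gap:    {selection_gap:.3f}")
print(f"random assignment gap: {random_gap:.3f}")
```

Under these assumptions the self-selected groups differ sharply in their share of entrepreneurial people, while the randomized groups differ only by sampling noise, even though the assignment rule never looks at the trait.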

If we are using something like matching to construct the groups we can guarantee that the groups are balanced on observable traits. But we can't guarantee the groups are balanced on unobservables like entrepreneurial ability.
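The same kind of toy simulation can show the limit of matching. In the sketch below everything is an assumption made for illustration: age group stands in for an observable trait, entrepreneurial ability is hidden, and the loan-taking probabilities are invented. It does exact one-to-one matching on age group only, then checks balance on both traits.

```python
import random
from collections import Counter, defaultdict

rng = random.Random(524)

# Build a hypothetical population: (age_group, entrepreneurial, takes_loan).
people = []
for _ in range(20_000):
    age_group = rng.choice(["18-34", "35-54", "55+"])  # observable
    entrepreneurial = rng.random() < 0.3               # unobservable
    p_loan = 0.2
    if entrepreneurial:
        p_loan += 0.5   # hidden trait strongly drives selection (assumed)
    if age_group == "18-34":
        p_loan += 0.1   # observable trait matters a little (assumed)
    people.append((age_group, entrepreneurial, rng.random() < p_loan))

# Group the hidden trait by age group and by loan status.
by_age = defaultdict(lambda: {True: [], False: []})
for age, trait, loan in people:
    by_age[age][loan].append(trait)

# Exact matching on age: keep equal numbers of loan takers and
# non-takers within each age group.
matched_t, matched_c = [], []
for age, groups in by_age.items():
    k = min(len(groups[True]), len(groups[False]))
    matched_t += [(age, trait) for trait in groups[True][:k]]
    matched_c += [(age, trait) for trait in groups[False][:k]]

age_balanced = (Counter(a for a, _ in matched_t)
                == Counter(a for a, _ in matched_c))
trait_gap = (sum(e for _, e in matched_t) / len(matched_t)
             - sum(e for _, e in matched_c) / len(matched_c))

print("balanced on age group:", age_balanced)
print(f"gap in entrepreneurial share: {trait_gap:.3f}")
```

The matched groups have identical age distributions by construction, yet under these assumptions the loan takers are still far more entrepreneurial than the matched non-takers, because matching only sees the variable it was given.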