DS4PS / cpp-528-fall-2020

Course shell for CPP 528 Foundations of Data Science III - Project Management
http://ds4ps.org/cpp-528-fall-2020/

Lab 5 - Control Variables from Lab 4 #38

Open ecking opened 3 years ago

ecking commented 3 years ago

Hello,

I'm working on the second part of Lab 5, where we build the diff-in-diff model, and I'm not entirely sure how to add my variables from Lab 4. Could someone explain this a bit?

cenuno commented 3 years ago

Hi Elyse,

This instruction is meant for you to re-use your control - or independent - variables from Lab 04 when creating your diff-in-diff model for Lab 05.

I believe in your Lab 04, you had the following control variables:

which were used to explain median home value growth (your dependent variable).

You would reuse those three control variables for Lab 05 by also reusing the logic you used to create them for your model from Lab 04.

Since you’re adding federal data, your final model should also contain federal program data to allow you to estimate the impact of both the NMTC and the LIHTC programs.

Respectfully,

Cristian

— Cristian E. Nuno


From: ecking notifications@github.com Sent: Sunday, November 15, 2020 1:49:04 PM To: DS4PS/cpp-528-fall-2020 cpp-528-fall-2020@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: [DS4PS/cpp-528-fall-2020] Lab 5 - Control Variables from Lab 4 (#38)


ecking commented 3 years ago

Hello Professor,

I think my confusion is, I'm not 100% sure how I should go about adding my control variables to the regression model.

Do I add values like x1 = p.unempl, x2 = p.hs?

cenuno commented 3 years ago

Hi Elyse,

Ah, I see. Let me go back to the sample code provided in the tutorial, except this time let me add one control variable that changes between the two time periods: % of college-educated population per tract.

Note: this variable does not come by default. I am assuming it exists somewhere upstream, since you used it in your Lab 04 model. Be mindful that my example variables p_col_00 and p_col_10 may not match the names you used and, as a consequence, this sample code would return errors if you copied and pasted it as-is.

# store DV in both time periods
y1 <- log( d$mhv.00 )
y2 <- log( d$mhv.10 )

# store IV in both time periods
# note: you should not run this literally as I am not sure how you are naming your variables
p_col_00 <- d$p_col_00
p_col_10 <- d$p_col_10

treat <- as.numeric( d$num.nmtc > 0 )

# add DV and IVs into one data frame per time period
d1 <- data.frame( y=y1, treat=treat, post=0, p_col=p_col_00 )
d2 <- data.frame( y=y2, treat=treat, post=1, p_col=p_col_10 )

# stack the time periods together into one data frame
d3 <- rbind( d1, d2 )

# create a diff in diff model
m <- lm( y ~ p_col + treat + post + treat*post, data=d3 )

# view regression results
summary( m )

As you suggested, you would add each control variable in the same way that the treat and post variables are added to the d1 and d2 data frames. Here, I assume you have two separate variables in your base data frame d, one per time period, that capture the % of college-educated population per tract.

Once these control variables are added to your model data frame, you run lm() in the same way you did for Lab 04.
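To illustrate how this extends to several control variables at once, here is a minimal sketch that runs on simulated data. Everything in the simulated data frame d is an assumption made for illustration: the column names p.unemp.00, p.unemp.10, p.hs.00 and p.hs.10 are hypothetical stand-ins for however you named your Lab 04 controls in each time period, and the values are random, so the coefficients mean nothing.

```r
# Simulated stand-in for the tract-level data frame `d`.
# All column names and values here are hypothetical.
set.seed(528)
n <- 200
d <- data.frame(
  mhv.00     = runif(n, 80000, 300000),  # median home value, 2000
  mhv.10     = runif(n, 90000, 350000),  # median home value, 2010
  num.nmtc   = rpois(n, 0.3),            # count of NMTC projects per tract
  p.unemp.00 = runif(n, 2, 15),          # % unemployed, 2000
  p.unemp.10 = runif(n, 3, 20),          # % unemployed, 2010
  p.hs.00    = runif(n, 20, 60),         # % high school graduates, 2000
  p.hs.10    = runif(n, 20, 60)          # % high school graduates, 2010
)

# store DV in both time periods
y1 <- log(d$mhv.00)
y2 <- log(d$mhv.10)

# treatment indicator: tract received at least one NMTC project
treat <- as.numeric(d$num.nmtc > 0)

# one data frame per time period; each control gets the same
# column name in both so the two frames can be stacked
d1 <- data.frame(y = y1, treat = treat, post = 0,
                 p_unemp = d$p.unemp.00, p_hs = d$p.hs.00)
d2 <- data.frame(y = y2, treat = treat, post = 1,
                 p_unemp = d$p.unemp.10, p_hs = d$p.hs.10)

# stack the time periods together into one data frame
d3 <- rbind(d1, d2)

# diff-in-diff model with both controls
m <- lm(y ~ p_unemp + p_hs + treat + post + treat*post, data = d3)
summary(m)
```

The key detail is that each control uses one shared column name across d1 and d2 (here p_unemp and p_hs), filled from the 2000 values in d1 and the 2010 values in d2. The treat:post coefficient in the summary remains the diff-in-diff estimate.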

— Cristian E. Nuno


From: ecking notifications@github.com Sent: Sunday, November 15, 2020 2:59:14 PM To: DS4PS/cpp-528-fall-2020 cpp-528-fall-2020@noreply.github.com Cc: Cristian Ernesto Nuno cenuno@syr.edu; Comment comment@noreply.github.com Subject: Re: [DS4PS/cpp-528-fall-2020] Lab 5 - Control Variables from Lab 4 (#38)
