DS4PS / cpp-523-fall-2019

Course shell for CPP 523 Foundations of Program Evaluation I for Fall 2019.
http://ds4ps.org/cpp-523-fall-2019/

Week 04 #12

Open sunaynagoel opened 5 years ago

sunaynagoel commented 5 years ago

Would you be able to open the unit overview and lab for week 04, please?

sunaynagoel commented 5 years ago

Pages 12-13 of the Omitted Variable PDF say "Bias is difference between Truth and Naive Slopes". Shouldn't it be the other way around? That is what the calculation on page 14 shows:

"b1 =  Direct Effect + Indirect Effect 
β1 = Direct Effect 
bias = b1 – β1 = Indirect Effect "

Unless I am using the terms interchangeably.

sunaynagoel commented 5 years ago

LAB 04- Part I, Question 1

The code provided in the template and instructions for #1 does not seem to return a table for me. It does not give an error either; it just returns a lot of text in paragraph form.

m.full <- lm( SWB ~ ND + PSS, data=dat )
m.naive <- lm( SWB ~ ND, data=dat )

stargazer( m.naive, m.full, 
           type = "html", digits=2,
           dep.var.caption = "DV: Subjective Well-Being",
           omit.stat = c("rsq","f","ser"),
           notes.label = "Standard errors in parentheses")

Is anyone else facing a similar problem?

lecy commented 5 years ago

This is the one tricky thing about stargazer(). The function takes an R object as input, and returns HTML code that will create a pretty table once the file is knit. If you are working iteratively in the RMD document, however, and you want to run the chunk to see the output you need to change the type="html" argument to type="text". Otherwise you need to knit the document for the HTML code to make sense.

With small RMD files like this it works fine to just knit it when you need to view updates. If you are loading large files or doing complex functions that take time I usually change the argument to "text" so I can see the output in real-time, and change it back to "html" when I'm ready to create my final version.

It seems like a flaw in the design of stargazer(), but not sure if there is an easy fix and the package works great otherwise!

stargazer( m.naive, m.full, 
           type = "text", digits=2,
           # type = "html", digits=2,
           dep.var.caption = "DV: Subjective Well-Being",
           omit.stat = c("rsq","f","ser"),
           notes.label = "Standard errors in parentheses")
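
One way to avoid toggling the argument by hand, offered as a sketch rather than as part of the lab template: let knitr report whether the document is currently being knit to HTML. This assumes a knitr version that provides `knitr::is_html_output()`, and that the chunk uses the `results='asis'` option so the HTML is passed through when knitting. The helper name `table_type` and the `mtcars` stand-in models are illustrative only.

```r
# Hypothetical helper (not part of the lab template): choose the stargazer
# output type based on whether the document is being knit to HTML.
table_type <- function() {
  knitting_html <- requireNamespace("knitr", quietly = TRUE) &&
    knitr::is_html_output()
  if (knitting_html) "html" else "text"
}

# Stand-in models on a built-in dataset; in the lab these would be
# m.naive and m.full fit on dat.
m1 <- lm( mpg ~ wt, data=mtcars )
m2 <- lm( mpg ~ wt + hp, data=mtcars )

# Only print the table if stargazer is installed.
if (requireNamespace("stargazer", quietly = TRUE)) {
  stargazer::stargazer( m1, m2,
                        type = table_type(), digits=2,
                        omit.stat = c("rsq","f","ser") )
}
```

Run interactively, the chunk prints a text table; knit to HTML, the same chunk emits HTML.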

lecy commented 5 years ago

@sunaynagoel this confuses me every time as well. Another way to think about it is this:

b1 = total effect
B1 = direct effect
bias = indirect effect
direct effect = total effect - indirect effect
B1 = b1 - bias
b1 = B1 + bias

lecy commented 5 years ago

Unless you mean:

"Bias is the difference between Truth and Naive Slopes" should be, "Bias is the difference between the Naive Slope and the True Slope" ?

In that case, you are correct, that is a more precise statement from an order of operations standpoint.

I just meant conceptually it is the difference of the slope estimated from a "naive" (incomplete) model and the theoretic "full" model that includes all competing hypotheses.
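
A minimal sketch of that comparison, using simulated data rather than the lab dataset. The variable names `x1`, `x2`, `y` and all coefficient values are assumptions chosen for illustration: `x1` plays the role of the policy variable and `x2` the omitted competing variable.

```r
set.seed(523)
n  <- 500
x1 <- rnorm(n)                          # policy variable
x2 <- 0.6 * x1 + rnorm(n)               # competing variable, correlated with x1
y  <- 1 + 2 * x1 + 1.5 * x2 + rnorm(n)  # true direct effect of x1 is 2

m.naive <- lm( y ~ x1 )        # omits x2
m.full  <- lm( y ~ x1 + x2 )   # includes x2

b1 <- unname(coef(m.naive)["x1"])  # naive slope (total effect)
B1 <- unname(coef(m.full)["x1"])   # full-model slope (direct effect)

bias <- b1 - B1                    # naive slope minus true slope
bias
```

Because `x2` is positively related to both `x1` and `y` in this simulation, the naive slope overstates the direct effect and the bias is positive.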

sunaynagoel commented 5 years ago

SWB = B0 + B1 ND + B2 PSS + E1

If this is the equation for the full regression, is there a way to know E1 from the regression table? I can still answer the lab without knowing it, but I am just curious.

lecy commented 5 years ago

Nope, you have to calculate that.
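
A sketch of one way to do that calculation in R, assuming simulated stand-in data (the values below are not the lab data). Note that what comes out of a fitted model are residuals, which are estimates of E1 rather than the true errors.

```r
set.seed(523)
n   <- 200
ND  <- rnorm(n)
PSS <- 0.5 * ND + rnorm(n)
SWB <- 10 + 1 * ND + 2 * PSS + rnorm(n)
dat <- data.frame( SWB, ND, PSS )   # simulated stand-in for the lab data

m.full <- lm( SWB ~ ND + PSS, data=dat )

# One residual per observation: observed SWB minus predicted SWB
e.hat   <- residuals( m.full )
by.hand <- dat$SWB - fitted( m.full )

all.equal( unname(e.hat), unname(by.hand) )  # the two calculations agree

# The regression table can only summarize the residuals in one number,
# the residual standard error, which omit.stat = "ser" hides.
summary( m.full )$sigma
```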

sunaynagoel commented 5 years ago

Question #2 , part 1

Part 1:

"What happened to the slope of CSE? What happened to the standard error of CSE?

How would our assessment of CSE change after we control for baseline self-esteem? For example, if we are a psychologist working with students should we worry if we observe a case where a person has a high need for approval from others? Will it impact their happiness? Or should we focus on other things?"

I can see from the table that the slope changes but the standard error does not. Does that make CSE a control variable correlated with the policy variable? If so, CSE makes the model accurate but imprecise. But when I draw the Venn diagram, my math says it changes both the slope and the standard error. I am confused.

lecy commented 5 years ago

CSE (the need for approval) is the policy variable in this question. RSE is the control variable.

I added a hint to the question to make it clear:

How would our assessment of CSE change after we control for baseline self-esteem? For example, if we are a psychologist working with students should we worry if we observe a case where a person has a high need for approval from others? Will it impact their happiness? Or should we focus on other things?

Hint:

Self-esteem and the need for approval are operating as competing variables. In other words, when we estimate the naive model in (1) then estimate the model with both variables in (3) we see a big difference in the results.

Since significance guides our ability to make concrete policy recommendations, we typically include competing hypotheses because their presence can make the result disappear (like SES does with Class Size in the lecture notes). This usually results in one of two scenarios:

  1. Our policy slope is significant then becomes insignificant (we lose confidence)
  2. Our policy slope is insignificant then becomes significant (we gain confidence)

This is an odd case, though, because we are “confident” about our slope on Model 1 (it’s highly statistically significant), and also confident in Model 3 (still highly significant). But what has changed about the slope? And how would that change our recommendations?

sunaynagoel commented 5 years ago

Thanks for all the help. I think I got it somewhat. This is the hardest LAB I have ever worked on.

lecy commented 5 years ago

This is the hardest lab of the semester. I'm hoping it's hard and not frustrating.

You don't go to grad school to be told what you already know!

sunaynagoel commented 5 years ago

It was not frustrating, thanks to all the help you provided. It all came together, but it was slow, and the hardest part was combining all the concepts we have learned so far in a meaningful way. Thanks again.

lecy commented 5 years ago

"combining all the concepts ... in a meaningful way" ↑ ↑ ↑ That's the difference between learning the formulas and developing the intuition. Glad to hear it!

jmacost5 commented 5 years ago

I am not understanding how to do part 2 in the first question. Is it asking me to put in a code from the previous part to answer the next part?

m.auxiliary <- lm(  SWB ~ ND + PSS, data=dat )

stargazer( m.auxiliary,  
           type = "html", digits=2,
           dep.var.caption = "DV: PSS",
           omit.stat = c("rsq","f","ser"),
           notes.label = "Standard errors in parentheses")

a1 <- 
B2 <- 0.32 
bias <- 
bias

lecy commented 5 years ago

The questions ask you to calculate bias in two ways. First, by comparing slopes in the naive versus the full models (this gives you the intuition of what bias represents).

Then by computing the bias through the indirect effect estimate in the path diagram. This is meant to offer intuition about how the model is splitting the total effect into the direct and the indirect component (which will be important in a couple of semesters).

image

castower commented 5 years ago

Hello all,

I haven't seen anyone else post this issue, so maybe I'm doing something wrong, but it seems that an image is missing from my RMD file for Lab 04.

The following code:

![](figures/path-diagram.png)

is producing an error that reads '(No image at path)' because I do not have a path-diagram.png image. Is there a place to download this file or am I overlooking a step?

Thanks! Courtney

jmacost5 commented 5 years ago

Don’t worry, it is not just you. I just thought I was messing something up.

lecy commented 5 years ago

It is the same diagram as Part 2 in the instructions. You don't need to include it in your solutions, but if you want to add it you can use the web URL for the diagram:

https://ds4ps.org/cpp-523-fall-2019/labs/figures/path-diagram.png

jmacost5 commented 5 years ago

The questions ask you to calculate bias in two ways. First, by comparing slopes in the naive versus the full models (this gives you the intuition of what bias represents).

Then by computing the bias through the indirect effect estimate in the path diagram. This is meant to offer intuition about how the model is splitting the total effect into the direct and the indirect component (which will be important in a couple of semesters).

image

I guess I am not understanding what to code for m.auxiliary <- lm( SWB ~ ND, data=dat ). I also do not understand if I am reading the graph right:

a1 <- 0.1024
B2 <- 0.32
bias <- B2*a1

castower commented 5 years ago

Thank you @lecy

lecy commented 5 years ago

lm( SWB ~ ND, data=dat ) is the code for the regression:

SWB = b0 + b1(ND) + e

Your auxiliary regression is incorrect. You want to calculate the indirect path from ND to SWB through PSS. You do this in two steps, one to calculate a1, and the second using B2 from the full model.

a1 represents the impact ND has on PSS: ND --> PSS

So to get a1 you need to estimate: PSS = a0 + a1(ND) + e

Then (a1)(B2) gives you the indirect impact achieved through the path ND --> PSS --> SWB.
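
The two steps can be sketched as follows. The data here are simulated, so the numbers will not match the lab. Only the variable names ND, PSS, SWB and the model names come from the thread; the coefficient values used to generate the data are assumptions.

```r
set.seed(523)
n   <- 500
ND  <- rnorm(n)
PSS <- 0.6 * ND + rnorm(n)
SWB <- 1 + 2 * ND + 1.5 * PSS + rnorm(n)
dat <- data.frame( SWB, ND, PSS )   # simulated stand-in for the lab data

m.naive     <- lm( SWB ~ ND, data=dat )        # SWB = b0 + b1(ND) + e
m.full      <- lm( SWB ~ ND + PSS, data=dat )  # SWB = B0 + B1(ND) + B2(PSS) + e
m.auxiliary <- lm( PSS ~ ND, data=dat )        # PSS = a0 + a1(ND) + e

a1 <- unname( coef(m.auxiliary)["ND"] )  # step 1: ND --> PSS
B2 <- unname( coef(m.full)["PSS"] )      # step 2: PSS --> SWB, from the full model

bias.indirect <- a1 * B2                 # indirect path ND --> PSS --> SWB
bias.slopes   <- unname( coef(m.naive)["ND"] - coef(m.full)["ND"] )  # b1 - B1

all.equal( bias.indirect, bias.slopes )  # both routes give the same bias
```

With OLS the two calculations agree up to floating-point error, which is a useful check that the auxiliary regression was specified correctly.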

JasonSills commented 5 years ago

I'm seeing an error when I knit the file: image

lecy commented 5 years ago

What's your code in the chunk that contains line 58? I'm guessing an incomplete line, maybe a missing parenthesis?

JasonSills commented 5 years ago

m.full <- lm( SWB ~ ND + PSS, data=dat )

JasonSills commented 5 years ago

It's the second line below. It's the model for the second regression output in question 1.


m.full <- lm( SWB ~ ND + PSS, data=dat )
m.naive <- lm( SWB ~ NDm.full <- lm( SWB ~ ND + PSS, data=dat )
m.naive <- lm( SWB ~ ND, data=dat ), data=dat )

stargazer( m.naive, m.full, 
           type = "html", digits=2,
           dep.var.caption = "DV: Subjective Well-Being",
           omit.stat = c("rsq","f","ser"),
           notes.label = "Standard errors in parentheses")

lecy commented 5 years ago

Yep, that's the culprit. Looks like a copy-paste error. You got it from here?

lecy commented 5 years ago

The 3rd line also has an extra data=dat argument.

castower commented 5 years ago

To clarify, to calculate a1, we use the table that lm( PSS ~ ND , data=dat ) generates in the same way that we used lm( SWB ~ ND, data=dat ) to calculate b1, correct?

Am I following the right path of reasoning in my understanding that a1 is the variable that the auxiliary regression calculates to estimate the distance between X1 and X2, in the same way that b1 is the variable that the naive regression calculates to estimate the distance between x1 and Y? The bias is the amount that each of these "estimates" are wrong from the "true" full regression, correct?

cjbecerr commented 5 years ago

For Q1 Part 3, it says b1 / B1 as a measure of magnitude of bias. Is this supposed to be the same as what you have in the reading where you show the size of the bias = (b1 - B1) / B1? Or, how does this differ?

castower commented 5 years ago

Is anybody else having trouble with their knitted file not properly producing the tables? Both my code and the table looks really odd in the knitted file: Screen Shot 2019-09-17 at 11 31 24 AM

castower commented 5 years ago

nevermind, I figured it out. I forgot to change the text back to html

JasonSills commented 5 years ago

Prof. Lecy,

I figured out the copy/paste issue, but now I'm running into an error with "dat" not found in the same line. The regression is running, so I'm not sure what the issue is. image

JasonSills commented 5 years ago

Nevermind - I was able to get this to work.

castower commented 5 years ago

For Q1 Part 3, it says b1 / B1 as a measure of magnitude of bias. Is this supposed to be the same as what you have in the reading where you show the size of the bias = (b1 - B1) / B1? Or, how does this differ?

I'm wondering the same thing. Should this be (b1-B1)/B1 or are magnitude and size two different measures?

lecy commented 5 years ago

@castower and @cjbecerr this is an astute question.

Looking back at the equation you are correct that bias / B1 makes more sense. That would give the magnitude of the bias in proportional terms. I'll update the instructions, please use that metric.

Note that bias / B1 is the same as (b1-B1)/B1.
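
A quick numeric check of how the two measures relate, using hypothetical slopes (these values are made up for illustration and are not from the lab):

```r
b1 <- 0.50        # hypothetical naive slope
B1 <- 0.40        # hypothetical full-model slope

bias     <- b1 - B1
rel.bias <- bias / B1   # bias as a proportion of the true slope, roughly 0.25
ratio    <- b1 / B1     # the original b1 / B1 measure, roughly 1.25

all.equal( rel.bias, (b1 - B1) / B1 )  # same quantity written two ways
all.equal( ratio, 1 + rel.bias )       # b1 / B1 is the proportional bias plus one
```

So b1 / B1 and bias / B1 carry the same information, but bias / B1 reads directly as "the naive slope is off by this share of the true slope".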

JasonSills commented 5 years ago

@lecy

I'm having a bit more trouble with the knitting. I'm receiving an error and can't find much on the web. image

lecy commented 5 years ago

It's looking for the path diagram figure in Part 2.

You can delete the image (you don't need it for your solutions) or replace "figures/path-diagram.png" with this URL: https://ds4ps.org/cpp-523-fall-2019/labs/figures/path-diagram.png

![](https://ds4ps.org/cpp-523-fall-2019/labs/figures/path-diagram.png)
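
If you would rather keep one RMD file that knits whether or not the local figure exists, a small fallback is possible. This is only a sketch: `figure_source` is a hypothetical helper name, and it assumes `knitr::include_graphics()` is given a web URL only when knitting to HTML.

```r
# Hypothetical helper: use the local copy of a figure when it exists,
# otherwise fall back to the web URL.
figure_source <- function( local, url ) {
  if ( file.exists(local) ) local else url
}

src <- figure_source(
  local = "figures/path-diagram.png",
  url   = "https://ds4ps.org/cpp-523-fall-2019/labs/figures/path-diagram.png"
)

# Inside an R chunk of the RMD file, the figure could then be embedded with:
# knitr::include_graphics( src )
```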

jmacost5 commented 5 years ago

DV: PSS

|              | PSS      |
|--------------|----------|
| ND           | 1.81***  |
|              | (0.28)   |
| Constant     | 73.88*** |
|              | (1.69)   |
| Observations | 389      |
| Adjusted R2  | 0.10     |

Standard errors in parentheses. *p<0.1; **p<0.05; ***p<0.01

this is what I get when I put m.auxiliary <- lm( PSS ~ ND , data=dat )

jmacost5 commented 5 years ago

now i see a table??? it was literally text in R

jmacost5 commented 5 years ago

Is anybody else having trouble with their knitted file not properly producing the tables? Both my code and the table looks really odd in the knitted file: Screen Shot 2019-09-17 at 11 31 24 AM

nevermind, I figured it out. I forgot to change the text back to html

omg me 2 thanks for that I had an oh moment

cjbecerr commented 5 years ago

Are Q2 part 2 and Q1 part 4 meant to be very different? I'm finding that both answers are very similar but I'm not sure if I should be taking a different approach for them. The first I focus on why it loses significance and the second I focus on the taxonomy, but this also recognizes the loss of significance so I feel like I am almost repeating my answers a little. Any thoughts would be helpful.

lecy commented 5 years ago

Adding the controls has a similar impact in both, but one slope went from significant to non-significant and the other went from negative and significant to positive and significant. So the mechanism might be similar, but the end results are different.