ijyliu / ECMA-31330-Project

Econometrics and Machine Learning Group Project

Metrics of success for the estimator #17

Closed ijyliu closed 3 years ago

ijyliu commented 3 years ago

Bias and consistency, maybe variance

Any other ideas? Something involving confidence intervals?

ijyliu commented 3 years ago

Given that the first principal component is only unique up to a scalar multiple (including its sign), it's not clear to me that we can measure success based on coefficient values. If the approach in #25 doesn't work out, maybe we need to state things in terms of R^2 or predictive power?

paul-opheim commented 3 years ago

That makes sense to me. Reporting correlations would also be another option.

ijyliu commented 3 years ago

I feel like this means about half of the original idea about attenuation bias has to be thrown out. Now I'm struggling to see why anyone would really want to do this, because maximizing R^2 never really seems like a goal.

paul-opheim commented 3 years ago

Hmm, interesting. Wouldn't it also be meaningful to show how a one-standard-deviation change in the PCA value leads to an x-standard-deviation change in the dependent variable? I feel like results in papers are often reported like that, and that seems doable with this method.

ijyliu commented 3 years ago

What you're describing is the coefficients we already have.

ijyliu commented 3 years ago

At the moment we can't really compare our coefficient values to those from other methods, because of the scaling problem.

ijyliu commented 3 years ago

I have been standardizing the whole time

paul-opheim commented 3 years ago

What other methods are you referring to?

ijyliu commented 3 years ago

OLS with the mismeasured variable

(Also IV, if we come back to that.)

nicomarto commented 3 years ago

I really don't get the discussion

nicomarto commented 3 years ago

Hmm interesting. Wouldn't it also be meaningful to show how a standard deviation variation in the PCA-value leads to a x standard deviation change in the dependent variable? I feel like results in papers are often reported like that, and that seems doable with this method?

Yes, this is done in many papers reporting treatment effects

ijyliu commented 3 years ago

@nicomarto that's what we are doing. But because of the scaling problem (the first PC is only unique up to multiplication by a scalar), the coefficients often have negative or otherwise strange signs. See Paul's Jupyter notebook from yesterday.

Hence, we can't compare the mismeasured OLS coefficients to the PCR coefficients.

ijyliu commented 3 years ago

Unless we try #25 and it works

paul-opheim commented 3 years ago

Hmm, interesting. Being multiplied by a scalar seems like it would be fixed by normalizing to unit variance, unless the scalar is negative, which would flip the sign of the coefficient relative to what it should be. This latter issue seems like it would be relatively easy to account for. One possible solution would be to force the ordering of the PC to roughly match the ordering of the initial Euclidean distances of the mismeasured x-variables.
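A minimal sketch of that kind of normalization, not the project's actual code. The function name `first_pc_score`, the simulated data, and the sign rule are all assumptions for illustration. In particular, the sign rule used here (make the score correlate positively with the average of the standardized measurements) is a simpler stand-in for the ordering idea described above.

```python
import numpy as np

def first_pc_score(X):
    """First principal component score of X, with a fixed sign and unit variance."""
    # Standardize each mismeasured variable (mean 0, variance 1).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of vt are the principal directions; vt[0] is the first one.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    score = Z @ vt[0]  # sign is arbitrary at this point
    # Assumed sign convention: agree with the average standardized measurement.
    if np.dot(score, Z.mean(axis=1)) < 0:
        score = -score
    # Unit variance removes the remaining (positive) scale ambiguity.
    return score / score.std()

# Hypothetical data: one true variable observed through three noisy measurements.
rng = np.random.default_rng(0)
x_true = rng.normal(size=500)
X = x_true[:, None] + rng.normal(scale=0.5, size=(500, 3))

pc = first_pc_score(X)
corr_with_truth = np.corrcoef(pc, x_true)[0, 1]
print(f"std of score: {pc.std():.3f}, correlation with truth: {corr_with_truth:.3f}")
```

With the sign and variance pinned down, the regression coefficient on the score is at least well defined, although it is still expressed per standard deviation of the score rather than in the units of the true variable.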

nicomarto commented 3 years ago

Oh, it is about Paul's findings. Yes, we see coefficients with negative signs because of PCA. But given that we are dealing with attenuation bias, I am not worried, to be honest, since what we care about is absolute values. I would take the absolute value of the coefficients and then compare the magnitudes.

ijyliu commented 3 years ago

I don’t know how you justify that theoretically

nicomarto commented 3 years ago

Shouldn't be hard... With PCA we are extracting either the true variable or the negative of the true variable, so the coefficients we get are either beta or -beta, and since we care about attenuation bias, the absolute value is what matters.
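A quick numerical check of the sign argument, as a sketch on simulated data (the data-generating process, `beta = 2.0`, and the helper `slope` are assumptions, not taken from the project). It only shows that flipping the sign of the PC score flips the sign of the slope and leaves its magnitude unchanged.

```python
import numpy as np

def slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    x_c = x - x.mean()
    return np.dot(x_c, y - y.mean()) / np.dot(x_c, x_c)

# Hypothetical setup: y depends on x_true, which is observed with error three times.
rng = np.random.default_rng(1)
n, beta = 1000, 2.0
x_true = rng.normal(size=n)
y = beta * x_true + rng.normal(size=n)
X = x_true[:, None] + rng.normal(size=(n, 3))

# First PC score of the standardized measurements, scaled to unit variance.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
_, _, vt = np.linalg.svd(Z, full_matrices=False)
pc = Z @ vt[0]
pc = pc / pc.std()

b_pos = slope(pc, y)    # coefficient with whatever sign the SVD returned
b_neg = slope(-pc, y)   # coefficient with the opposite sign convention
print(f"slope: {b_pos:.3f}, slope after sign flip: {b_neg:.3f}")
```

Note that the magnitude is per standard deviation of the PC score, so taking absolute values handles the sign but not the scale; comparing against mismeasured OLS would still need both regressors on a common scale.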

ijyliu commented 3 years ago

I think the transformation might be better justified. See #25

ijyliu commented 3 years ago

In the empirical results, the R^2 was lower using PCA than when including all the mismeasured covariates (which makes sense, because PCA only summarizes those variables, and the retained components capture only a fraction of the original variance). So I don't think we can really say prediction is better.
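The same pattern can be reproduced on simulated data. This is a sketch under assumed settings (four noisy measurements, a coefficient of 1.5, and the helper `r_squared` are all hypothetical), not a rerun of the empirical results. Because the first PC is a linear combination of the covariates, its in-sample R^2 cannot exceed the R^2 from regressing on all of them.

```python
import numpy as np

def r_squared(features, y):
    """In-sample R^2 from an OLS regression of y on features plus an intercept."""
    A = np.column_stack([np.ones(len(y)), features])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# Hypothetical data: four noisy measurements of one true regressor.
rng = np.random.default_rng(2)
n = 1000
x_true = rng.normal(size=n)
y = 1.5 * x_true + rng.normal(size=n)
X = x_true[:, None] + rng.normal(size=(n, 4))

# First PC score of the standardized measurements.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
_, _, vt = np.linalg.svd(Z, full_matrices=False)
pc1 = Z @ vt[0]

r2_all = r_squared(X, y)    # all mismeasured covariates
r2_pc1 = r_squared(pc1, y)  # first principal component only
print(f"R^2 with all covariates: {r2_all:.3f}, with first PC only: {r2_pc1:.3f}")
```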

paul-opheim commented 3 years ago

Yeah that makes sense.

ijyliu commented 3 years ago

In the simulations, I think we discuss bias and variance; in the empirics, we discuss the size of the coefficients/bias.
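A rough sketch of how bias and variance could be tabulated in a Monte Carlo. The function names `simulate` and `ols_slope`, the data-generating process, and all parameter values are assumptions for illustration. Only OLS on a single mismeasured regressor is shown; any other estimator with the same `(x_obs, y)` signature could be passed in.

```python
import numpy as np

def ols_slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    x_c = x - x.mean()
    return np.dot(x_c, y - y.mean()) / np.dot(x_c, x_c)

def simulate(estimator, beta=1.0, n=500, reps=500, noise_sd=1.0, seed=0):
    """Monte Carlo bias and variance of an estimator under classical measurement error."""
    rng = np.random.default_rng(seed)
    estimates = np.empty(reps)
    for r in range(reps):
        x_true = rng.normal(size=n)                           # true regressor, variance 1
        y = beta * x_true + rng.normal(size=n)                # outcome
        x_obs = x_true + rng.normal(scale=noise_sd, size=n)   # mismeasured regressor
        estimates[r] = estimator(x_obs, y)
    return estimates.mean() - beta, estimates.var(ddof=1)

bias, variance = simulate(ols_slope)
print(f"bias: {bias:.3f}, variance: {variance:.5f}")
```

With the true regressor and the measurement error both at unit variance, the classical attenuation factor is 1 / (1 + 1) = 0.5, so the bias of mismeasured OLS should come out close to -0.5 when beta is 1. Rerunning with larger `n` gives an informal look at consistency: the variance shrinks while the attenuation bias stays put.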

I don't know if anyone wants to discuss consistency?

@nicomarto

paul-opheim commented 3 years ago

If we end up including an estimation subsection in the theory section, then this will go in there.