ijyliu closed 3 years ago
1) I ended up taking the exponential instead of the log because, as you say, it gives us a change in the order of magnitude but it doesn't have the complications of taking the log (what do we do when the number is negative?).
2) I also applied the transformation to the value that already includes the error because, like you say, that way we don't have to mess with the errors a bunch.
3) I interpreted the rescaling as being "and/or" with the exponential, so taking the exponential seemed good enough to me.
Why is there a -1 in the IV formula?
I guess maybe you can do the absolute value and then the log, but I agree that seems weird.
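A rough sketch of the trade-off being discussed, using made-up numbers (the values and helper names below are illustrative assumptions, not taken from the project code): the exponential is defined for every real input, the log is not, and taking the log of the absolute value throws away the sign.

```python
import math

# Hypothetical outcome values, including a negative number and zero.
values = [-2.0, 0.0, 0.5, 2.0]

def safe_log(v):
    """Return log(v), or None where the log is undefined (v <= 0)."""
    return math.log(v) if v > 0 else None

def abs_log(v):
    """Return log(|v|), or None at v == 0 where it is still undefined."""
    return math.log(abs(v)) if v != 0 else None

exp_values = [math.exp(v) for v in values]   # defined and positive everywhere
log_values = [safe_log(v) for v in values]   # None for -2.0 and 0.0
abs_log_values = [abs_log(v) for v in values]

# -2.0 and 2.0 map to the same number under log(|v|), so the sign is lost.
sign_lost = abs_log(-2.0) == abs_log(2.0)
```

The last line is probably why the absolute-value workaround feels odd: two observations with opposite signs become indistinguishable after the transformation.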
Where is there a -1?
second line
Also, potentially dumb question, but what are the ppts in the APEs part of the tables?
Ah. The -1 is so that the regression does not include an intercept.
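For anyone else reading: assuming the formula is written in the R-style syntax that patsy and the statsmodels formula interface use, `y ~ x - 1` drops the constant term while `y ~ x` keeps it. The sketch below reproduces the two fits by hand for a single regressor, with made-up data and hypothetical function names, to show how the slope changes when the intercept is removed.

```python
def ols_with_intercept(x, y):
    """Return (intercept, slope) from regressing y on a constant and x ('y ~ x')."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

def ols_no_intercept(x, y):
    """Return the slope from regressing y on x through the origin ('y ~ x - 1')."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Made-up data generated exactly as y = 2 + 3x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0 + 3.0 * a for a in x]

intercept, slope = ols_with_intercept(x, y)   # recovers 2 and 3
slope_origin = ols_no_intercept(x, y)         # 110 / 30, which overstates the slope
```

When the true relationship has a nonzero constant, forcing the line through the origin pushes that constant into the slope, so it matters which regressions carry the `- 1`.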
The ppts in the parentheses are the standard deviations of the absolute percentage error. The idea is that the main number is the mean and the number in the parentheses is the standard deviation (for both the coefficient and the APE). Perhaps I should say that somewhere?
Oh, I might need to fix some regressions then
I don't know that the standard deviation of the APE makes much sense conceptually? I mean, we have the standard deviation of the actual coefficient.
Ah, that's a fair point. I'll just report the MAPEs then.
Ok, the new automated code just does the MAPEs
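For reference, a minimal sketch of the MAPE as I understand it from this thread: the mean, across simulation runs, of the absolute percentage error of the estimated coefficient relative to the true coefficient. The function name and the numbers are hypothetical, and the automated code may compute it differently.

```python
from statistics import mean

def mape(estimates, true_value):
    """Mean absolute percentage error of estimates around a nonzero true value, in percent."""
    if true_value == 0:
        raise ValueError("percentage error is undefined when the true value is 0")
    return mean(abs(e - true_value) / abs(true_value) for e in estimates) * 100

# Hypothetical coefficient estimates from four simulation runs; true coefficient is 2.0.
estimates = [1.8, 2.1, 2.4, 1.9]
result = mape(estimates, 2.0)   # individual APEs are 10%, 5%, 20%, 5%
```

Here the individual absolute percentage errors average to about 10 percent, which is the single number that would go in the table.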
I need to go check those intercepts now
In simulations code:
In empirical code:
Did you remove the intercept from the first stage of the IV when you ran the n=3,000 simulations? It doesn't seem like it based on Run_Simulations.ipynb, but I also don't know if we should have an intercept in that regression?
No, I didn't remove it from the first stage in the simulations. I don't know if we should.
Remaining points in this issue are obsoleted by #87, as we will be using statsmodels IV.