frederickz opened 4 years ago
@zhuyuming96: Python has three primary packages for running linear regressions, aside from the linear algebra approach of computing inv(X'X)(X'Y) directly. You can find examples of how to use the first two methods in the Simple Linear Regression and Endogeneity sections of QuantEcon's linear regression notebook. For the third method, I have some examples in my classification (discrete choice) notebook from MACS 30150.
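For reference, the linear algebra approach mentioned above can be done in a few lines of NumPy. This is a minimal sketch with synthetic data (the variable names and simulated coefficients are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
# Design matrix with a constant column plus one regressor
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# OLS via the normal equations: beta_hat = inv(X'X) (X'y)
beta_hat = np.linalg.inv(X.T @ X) @ (X.T @ y)

# In practice, solving the linear system is more numerically
# stable than forming an explicit inverse
beta_hat_stable = np.linalg.solve(X.T @ X, X.T @ y)
```

The two estimates agree up to floating-point error; the packages below add standard errors, diagnostics, and summary tables on top of this computation.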
1. The `statsmodels.api` package is probably the most similar to Stata. See the Simple Linear Regression section of the QuantEcon notebook on "Linear Regression in Python".

```python
import statsmodels.api as sm

reg1 = sm.OLS(endog=df1['logpgp95'], exog=df1[['const', 'avexpr']], missing='drop')
results = reg1.fit()
print(results.summary())
```
2. Similar to `statsmodels.api` is the `linearmodels.iv` package for instrumental variables models. See [Endogeneity](https://python.quantecon.org/ols.html#Endogeneity) section of QuantEcon notebook on "Linear Regression in Python".
```python
from linearmodels.iv import IV2SLS

iv = IV2SLS(dependent=df4['logpgp95'],
            exog=df4['const'],
            endog=df4['avexpr'],
            instruments=df4['logem4']).fit(cov_type='unadjusted')
print(iv.summary)
```
3. `scikit-learn` also has regression commands, but it is oriented toward point estimates and prediction rather than inference: it is harder to get standard errors and do hypothesis testing. See the classification (discrete choice) notebook from Dr. Evans' MACS 30150 class.

```python
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import LogisticRegression

LinReg = LinearRegression()
LinReg.fit(X_train, y_train)
y_LinReg_pred = LinReg.predict(X_test)

LogReg = LogisticRegression()
LogReg.fit(X_train, y_train)
y_LogReg_pred = LogReg.predict(X_test)
```
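Since `scikit-learn` does not report standard errors, one option is to compute classical homoskedastic OLS standard errors by hand from the fitted residuals. A minimal sketch, using synthetic data in place of the notebook's `X_train`/`y_train` (the coefficients and sample size here are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n, k = 200, 2
X_train = rng.normal(size=(n, k))
y_train = 1.0 + X_train @ np.array([0.5, -0.3]) + rng.normal(scale=0.2, size=n)

LinReg = LinearRegression()
LinReg.fit(X_train, y_train)

# Classical OLS variance: Var(beta_hat) = s^2 * inv(X'X),
# where X includes a constant column and s^2 uses n - k - 1 dof
Xc = np.column_stack([np.ones(n), X_train])
resid = y_train - LinReg.predict(X_train)
s2 = resid @ resid / (n - k - 1)
se = np.sqrt(np.diag(s2 * np.linalg.inv(Xc.T @ Xc)))  # order: [const, x1, x2]
```

With `statsmodels` you get these standard errors (and t-tests) for free from `results.summary()`, which is why it is the better fit when inference, not just prediction, is the goal.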
Thank you so much, especially for all the resources you provided! These really help.
I am pretty interested in adapting a regression method to generate the initial guesses. Is sklearn a good package for that? Thanks.