statsmodels / statsmodels

Statsmodels: statistical modeling and econometrics in Python
http://www.statsmodels.org/devel/
BSD 3-Clause "New" or "Revised" License

Current status of the regularized linear regression? #7792

Open aleksejs-fomins opened 2 years ago

aleksejs-fomins commented 2 years ago

Dear StatsModels developers,

Can you please tell me whether it is currently possible to do regularized linear regression in statsmodels? According to the documentation it should be possible, but my naive attempts do not work. The following code raises a NotImplementedError when printing the results summary. If this functionality can be used, could you please suggest how, or direct me to an example notebook? Briefly scanning the existing example notebooks, I was unable to locate anything suitable.


import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
import statsmodels.stats.api as sms

url = 'http://vincentarelbundock.github.io/Rdatasets/csv/HistData/Guerry.csv'
dat = pd.read_csv(url)

# Fit regression model (using the natural log of one of the regressors)
resultsREG = smf.ols('Lottery ~ Literacy + np.log(Pop1831)', data=dat).fit_regularized(L1_wt=0.0, alpha=1000)

# Inspect the results
print(resultsREG.summary())

josef-pkt commented 2 years ago

params and predict should be available.

Standard errors, cov_params and bse are not available for the linear model elastic net fit_regularized, so summary is not available either.

For variable selection with Lasso, AFAIR, there is an option to return the unpenalized model refit on the reduced set of explanatory variables.

(For pure L2 regularization, we also have a Ridge class but I don't remember the status.)
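As a minimal sketch of the points above: the snippet below uses synthetic data (an assumption, so that it runs without downloading the Guerry dataset) and hypothetical column names y, x1, x2, x3. It reads params and predict from a penalized fit, then uses the refit=True argument of fit_regularized, which appears to be the option mentioned above for getting an unpenalized fit on the selected variables. The alpha values are arbitrary illustrations, not recommendations.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (assumption: replaces the Guerry dataset from the question)
rng = np.random.default_rng(0)
n = 200
dat = pd.DataFrame({
    "x1": rng.normal(size=n),
    "x2": rng.normal(size=n),
    "x3": rng.normal(size=n),  # regressor with no true effect on y
})
dat["y"] = 1.0 + 2.0 * dat["x1"] - 1.5 * dat["x2"] + rng.normal(scale=0.5, size=n)

model = smf.ols("y ~ x1 + x2 + x3", data=dat)

# Pure L2 penalty (L1_wt=0.0): params and predict are available,
# but summary(), bse and cov_params are not.
ridge_res = model.fit_regularized(alpha=0.1, L1_wt=0.0)
ridge_pred = ridge_res.predict(dat.head())
print(np.asarray(ridge_res.params))
print(np.asarray(ridge_pred))

# Pure L1 penalty (L1_wt=1.0) with refit=True: the variables with nonzero
# coefficients are refit without a penalty, so standard errors exist.
# Coefficients dropped by the lasso step are reported as 0.
lasso_refit = model.fit_regularized(alpha=0.1, L1_wt=1.0, refit=True)
print(np.asarray(lasso_refit.params))
print(np.asarray(lasso_refit.bse))
```

Note that standard errors from the refit model ignore the selection step, so they should be read with caution.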

aleksejs-fomins commented 2 years ago

Thank you for your reply. That answers my question.