scikit-learn-contrib / lightning

Large-scale linear classification, regression and ranking in Python
https://contrib.scikit-learn.org/lightning/

FistaRegressor does not converge for real data #151

Open arose13 opened 3 years ago

arose13 commented 3 years ago

I can get FistaRegressor to converge when the data is trivial. The code for the simulated data is below.

import numpy as np
from scipy import stats
from lightning.regression import FistaRegressor

N, P = 300, 30
m_true = np.zeros(P)
m_true[:4] = [2, -2, 2, 3]

noise = 3 * stats.norm().rvs(N)

data = 3*stats.norm().rvs((N, P))
target = data @ m_true + noise

fista = FistaRegressor(
    C=1/n,
    penalty='l1',
    alpha=lam,   # The same alpha LassoCV.alpha_ finds
    max_iter=1000,
    max_steps=1000,
)
fista.fit(data, target)

image (the X axis shows the individual coefficients and the Y axis shows the fitted coefficient magnitudes)

But if I use real data I cannot get it to converge at all.

from statsmodels.tools.tools import add_constant
from sklearn.datasets import load_boston

data, target = load_boston(return_X_y=True)
data = add_constant(data)

image (the X axis shows the individual coefficients and the Y axis shows the fitted coefficient magnitudes)

I assume it has something to do with feature scaling but I'm not sure.
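
One way to probe the feature-scaling hypothesis is to look at how column scales affect the conditioning of the Gram matrix, since the iteration count of proximal gradient methods such as FISTA generally grows as conditioning gets worse. The sketch below is only an illustration under stated assumptions: it uses a synthetic matrix with deliberately mismatched column scales (not the Boston data), and the helper `gram_condition_number` is a hypothetical name introduced here. Whether this is what actually happens in the case above is not established by it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a real dataset: 200 samples and 5 features whose
# scales differ by several orders of magnitude (an illustrative assumption).
n_samples = 200
scales = np.array([1e-2, 1.0, 10.0, 100.0, 1e3])
X = rng.standard_normal((n_samples, scales.size)) * scales


def gram_condition_number(X):
    """Ratio of the largest to the smallest eigenvalue of X^T X / n."""
    eigvals = np.linalg.eigvalsh(X.T @ X / X.shape[0])  # ascending order
    return eigvals[-1] / eigvals[0]


# Standardize each column to zero mean and unit variance.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

cond_raw = gram_condition_number(X)
cond_std = gram_condition_number(X_std)
print(f"condition number, raw features:          {cond_raw:.3g}")
print(f"condition number, standardized features: {cond_std:.3g}")
```

Note that standardizing the features also changes how strongly a given L1 penalty acts on each coefficient, so coefficients fitted on standardized data are not directly comparable to those fitted on raw data.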

mblondel commented 3 years ago

Could you indicate how you call FistaRegressor too? Thanks.

arose13 commented 3 years ago

I updated the code above to show how I called FistaRegressor.

mathurinm commented 2 years ago

Hello @arose13,

I think it may have to do with a wrong choice of n (undefined in your snippet, by the way) in your code.

The following gives me identical results for lightning and sklearn (see image):

from scipy import stats
import matplotlib.pyplot as plt
import numpy as np
from lightning.regression import FistaRegressor
from sklearn.linear_model import LassoCV

from statsmodels.tools.tools import add_constant
from sklearn.datasets import load_boston

fig, axarr = plt.subplots(1, 2)

for ax, dataset in zip(axarr, ["simu", "boston"]):
    if dataset == "simu":
        N, P = 300, 30
        m_true = np.zeros(P)
        m_true[:4] = [2, -2, 2, 3]

        noise = 3 * stats.norm().rvs(N)

        data = 3*stats.norm().rvs((N, P))
        target = data @ m_true + noise
    else:
        data, target = load_boston(return_X_y=True)
        data = add_constant(data)
        N = data.shape[0]

    clf_sklearn = LassoCV(fit_intercept=False, tol=1e-10).fit(data, target)

    fista = FistaRegressor(
        C=1/N,
        penalty='l1',
        alpha=clf_sklearn.alpha_,
        max_iter=1_000,
        max_steps=1_000,
    )
    fista.fit(data, target)

    ax.plot(clf_sklearn.coef_, label="sklearn")
    ax.plot(fista.coef_[0], label="Lightning fista", linestyle='--')
    ax.legend()
    ax.set_title(dataset)
plt.show(block=False)
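
The role of C=1/N in the script above can be made concrete with a small sketch. scikit-learn's Lasso minimizes 1/(2N) * ||y - Xw||^2 + alpha * ||w||_1. Assuming lightning's FistaRegressor minimizes C * (1/2) * ||y - Xw||^2 + alpha * ||w||_1 (consistent with the matching results reported above, but stated here as an assumption rather than quoted from lightning's documentation), the two objectives coincide only when C equals 1/N, with N the number of samples. The plain NumPy FISTA below is a hypothetical stand-in, not lightning's implementation; `soft_threshold`, `fista_lasso`, the fixed step size, and alpha = 2.0 are all illustrative choices.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def fista_lasso(X, y, C, alpha, n_iter=2000):
    """Minimize C * 0.5 * ||y - X w||^2 + alpha * ||w||_1 with FISTA.

    Fixed step size 1/L; an illustration only, not lightning's code.
    """
    # Lipschitz constant of the gradient of the smooth (squared loss) part.
    L = C * np.linalg.norm(X, ord=2) ** 2
    w = np.zeros(X.shape[1])
    z = w.copy()  # extrapolated point
    t = 1.0
    for _ in range(n_iter):
        grad = C * X.T @ (X @ z - y)
        w_next = soft_threshold(z - grad / L, alpha / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = w_next + ((t - 1.0) / t_next) * (w_next - w)
        w, t = w_next, t_next
    return w


# Same simulated setup as in the thread, with a seeded generator.
rng = np.random.default_rng(0)
N, P = 300, 30
w_true = np.zeros(P)
w_true[:4] = [2, -2, 2, 3]
X = 3 * rng.standard_normal((N, P))
y = X @ w_true + 3 * rng.standard_normal(N)

alpha = 2.0  # arbitrary penalty weight, chosen for illustration

w_scaled = fista_lasso(X, y, C=1 / N, alpha=alpha)  # sklearn-style scaling
w_unscaled = fista_lasso(X, y, C=1.0, alpha=alpha)  # loss weighted N times more

print("nonzeros with C=1/N:", np.count_nonzero(w_scaled))
print("nonzeros with C=1:  ", np.count_nonzero(w_unscaled))
```

With C=1 the loss term is weighted N times more heavily relative to the penalty, so the same alpha regularizes far less than it does in LassoCV. A mismatch of this kind between C and the sample count would be one plausible way for the fitted coefficients to differ from scikit-learn's even when the solver itself has converged.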