Override precompute check in LassoCV

kara-liu commented 4 months ago

Describe the workflow you want to enable

I am trying to use precompute=True for LassoCV. To save memory, I am passing in the inputs as float32. However, I get an error saying that the precomputed Gram matrix does not match the true Gram matrix, where the discrepancy is a small epsilon on the order of 1e-5 (see photo below).

Describe your proposed solution

It would be great to be able to override the Gram check and use whatever was precomputed.

Describe alternatives you've considered, if relevant

Additional context

No response

ogrisel commented 4 months ago

We could increase the relative tolerance rtol used for float32 in _check_precomputed_gram_matrix from 1e-4 to 1e-3.

We could also turn this exception into a warning instead.

Feel free to open a PR.
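To make the tolerance discussion concrete, here is a minimal sketch of a single-element Gram check with a configurable rtol. It is not scikit-learn's code: the helper name check_one_gram_element, the atol default of 1e-5, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np


def check_one_gram_element(X, gram, i, j, rtol, atol=1e-5):
    """Hypothetical helper: recompute Gram entry (i, j) from X and compare it
    with the user-supplied value. The atol default is an assumption."""
    expected = np.dot(X[:, i], X[:, j])
    actual = gram[i, j]
    if not np.isclose(expected, actual, rtol=rtol, atol=atol):
        raise ValueError(
            f"Gram element ({i},{j}): computed {expected} "
            f"but the supplied value was {actual}"
        )
    return float(expected), float(actual)


rng = np.random.default_rng(0)
X = rng.uniform(size=(1000, 4))  # synthetic data, illustration only
gram = X.T @ X

# A correctly computed Gram matrix passes even with a tight tolerance.
check_one_gram_element(X, gram, 1, 2, rtol=1e-7)

# A Gram matrix with a 1% error in one entry fails at rtol=1e-3 ...
bad_gram = gram.copy()
bad_gram[1, 2] *= 1.01
try:
    check_one_gram_element(X, bad_gram, 1, 2, rtol=1e-3)
    rejected = False
except ValueError:
    rejected = True

# ... but would be accepted by a much looser tolerance.
check_one_gram_element(X, bad_gram, 1, 2, rtol=1e-1)
```

The trade-off is visible here: a looser rtol lets float32 rounding through, but it also lets through a Gram matrix that is genuinely wrong by up to that relative amount.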

Jaimin020 commented 4 months ago

Hi @ogrisel, I think it's better to convert this exception into a warning. I will open a PR for this.

Tialo commented 2 months ago

@Jaimin020 Are you still interested in opening a PR?

sqali commented 1 month ago

Hi @ogrisel,

I am not sure whether this has been fixed yet. Can you please confirm so that I can go ahead?

ogrisel commented 1 month ago

Probably not. Please start by writing a minimal reproducer. If you can reproduce the problem, then you can adapt it as a non-regression test for your subsequent PR.

sqali commented 1 month ago

Thanks for the input, @ogrisel, will do!

sqali commented 1 month ago

Hi @ogrisel,

When fit_intercept=True, the model ignores the user-provided Gram matrix, so I am working with the fit_intercept=False case.
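As a rough illustration of why a Gram matrix computed on the raw inputs would not be usable once the data are centered to fit an intercept, the sketch below compares the two. It uses only NumPy on synthetic data (an assumption, not the data from this thread) and does not call scikit-learn, so it says nothing about what LassoCV does internally.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data with a clearly non-zero mean: 10 samples, 2 features.
X = rng.uniform(1.0, 2.0, size=(10, 2))
n_samples = X.shape[0]

# Gram matrix of the raw inputs, as a user might precompute it.
gram_raw = X.T @ X

# Gram matrix of the mean-centered inputs.
mean = X.mean(axis=0)
X_centered = X - mean
gram_centered = X_centered.T @ X_centered

# The two differ by n_samples * outer(mean, mean).
correction = n_samples * np.outer(mean, mean)
print(np.allclose(gram_centered, gram_raw - correction))
print(np.abs(gram_raw - gram_centered).max())
```

Because the column means here lie between 1 and 2, the gap between the two matrices is far larger than any rounding tolerance.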

Although I was unable to reproduce the exact error, I am highlighting a few scenarios that I went through. I have attached the Jupyter file for your reference.

Case 1:

Sample dataset generated with 10 data points and 2 features
Precomputed Gram matrix provided
fit_intercept=False

Error

ValueError: Gram matrix passed in via 'precompute' parameter did not pass validation when a single element was checked please check that it was computed properly. For element (1,1) we computed 4.250590801239014 but the user-supplied value was 4.762086868286133

Case 2:

Sample dataset generated with 10 data points and 2 features
Input data mean-centered
Precomputed Gram matrix provided
fit_intercept=False

Error

ValueError: Gram matrix passed in via 'precompute' parameter did not pass validation when a single element was checked - please check that it was computed properly. For element (1,1) we computed 0.628188967704773 but the user-supplied value was 0.6334664225578308.

Case 3:

Sample dataset generated with 10 data points and 2 features
Input data mean-centered and scaled
Precomputed Gram matrix provided
fit_intercept=False

Error

ValueError: Gram matrix passed in via 'precompute' parameter did not pass validation when a single element was checked - please check that it was computed properly. For element (1,1) we computed 9.9166898727417 but the user-supplied value was 10.0

lasso_cv_scenarios.pdf

I think increasing the rtol as you mentioned will handle the issue @kara-liu is facing, but I am a little skeptical about turning the exception into a warning: as you can see above, there are cases where the values are not within the tolerance and differ significantly. Kindly advise.
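To put numbers on the three cases above, this small sketch computes the relative difference between the computed and user-supplied values quoted in the error messages, and compares it with the two rtol values mentioned in this thread (1e-4 and 1e-3). Measuring the difference relative to the user-supplied value and ignoring any absolute tolerance are simplifying assumptions.

```python
# (computed, user-supplied) pairs copied from the error messages above.
cases = {
    "Case 1": (4.250590801239014, 4.762086868286133),
    "Case 2": (0.628188967704773, 0.6334664225578308),
    "Case 3": (9.9166898727417, 10.0),
}

rel_diffs = {}
for name, (computed, supplied) in cases.items():
    # Relative difference with respect to the user-supplied value (assumption).
    rel = abs(computed - supplied) / abs(supplied)
    rel_diffs[name] = rel
    print(
        f"{name}: relative difference = {rel:.3e}, "
        f"within 1e-4: {rel <= 1e-4}, within 1e-3: {rel <= 1e-3}"
    )
```

All three relative differences are well above 1e-3, so these scenarios would still be rejected after the proposed tolerance increase. They look like a different situation from the roughly 1e-5 discrepancy in the original report.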

sqali commented 4 weeks ago

Hi @ogrisel,

Kindly advise.

sqali commented 3 weeks ago

Hi @ogrisel,

Kindly review.

scikit-learn / scikit-learn