scikit-learn / scikit-learn

scikit-learn: machine learning in Python
https://scikit-learn.org
BSD 3-Clause "New" or "Revised" License
60.15k stars 25.41k forks source link

inconsistent attribute naming "cv_mse_path_" and "mse_path_" in LassoLarsCV and LassoCV respectively #6251

Closed DSLituiev closed 2 years ago

DSLituiev commented 8 years ago

There is inconsistent attribute names in cross-validation Lasso models:

cv_mse_path_ in LassoLarsCV code

mse_path_ in LassoCV code

as seen already in this tutorial

unification suggested

DSLituiev commented 8 years ago

Similarly:

LassoCV -> alphas_ seemingly corresponds to LassoLarsCV -> cv_alphas_ (judging from the above-mentioned tutorial and correspondence in shape to the cv_mse_path_)

agramfort commented 8 years ago

are the shape of these attributes the same for both classes?

I agree it should be made consistent.

marking as easy

IshankGulati commented 8 years ago

Can I take this issue?

agramfort commented 8 years ago

yeah give it a try

DSLituiev commented 8 years ago

also it were good to generalize MSE to and abstract score (e.g. R^2), like in this example

http://scikit-learn.org/stable/auto_examples/model_selection/plot_train_error_vs_test_error.html#example-model-selection-plot-train-error-vs-test-error-py

probably the best name would be test_score_path, and error must be fed to a generalized score function

agramfort commented 8 years ago

sorry I don't get what you suggest

DSLituiev commented 8 years ago

I suggest to account for possibility to include general performance score while chosing the name, not only MSE. On Jan 31, 2016 12:12 AM, "Alexandre Gramfort" notifications@github.com wrote:

sorry I don't get what you suggest

— Reply to this email directly or view it on GitHub https://github.com/scikit-learn/scikit-learn/issues/6251#issuecomment-177429737 .

agramfort commented 8 years ago

how would you do this in terms of API?

DSLituiev commented 8 years ago

I am thinking one can use RegressorMixin class as parent (like ElasticNetCV does) to make things standard. It defines score as R2, which is monotonic decreasing function of MSE R^2= 1 - MSE/ var(y) and is more interpretable metric for regression than raw MSE.

One will need to replace MSE to score defined in the parent RegressorMixin (which is sklearn.metrics.r2_score) and take argmax rather than argmin. In LarsCV and in LassoCV alike.

sorry that it diverges too deep from the original topic, but the longer one looks, the more inconsistencies become visible

DSLituiev commented 8 years ago

I realised that ElasticNetCV actually uses MSE itself, and has mse_path_ inherited from LinearModelCV. OK. I suggest just change everything to mse_path_ and probably add a property r2_path_ (maybe it makes sense to implement it as a @property method).

agramfort commented 8 years ago

I'd just change to mse_xxx

raghavrv commented 8 years ago

@IshankGulati If you are not planning to continue, could @gannyvenkat take this up?

IshankGulati commented 8 years ago

@rvraghav93 I was busy last week. I would like to give it a try again.

raghavrv commented 8 years ago

Okay please take your time! (Just wanted to know if you were still interested)

raghavrv commented 8 years ago

Whoops there was an open Pull Request. I totally missed it. Apologies :)

magicmathmandarin commented 5 years ago

I got an error: AttributeError: 'LassoLarsCV' object has no attribute 'cv_msepath' The version of sklearn I am using is '0.21.1'.

CentralLT commented 4 years ago

@raghavrv what happened with the pull request? is this issue solved?

bhatiak2050 commented 4 years ago

I used msepath instead of cv_msepath and it worked

cmarmo commented 2 years ago

I'm closing this one as it seems that it have been fixed in #6252. Feel free to reopen if I am wrong.