UBC-DSCI / introduction-to-datascience-python

Open Source Textbook for DSCI100: Introduction to Data Science in Python
https://python.datasciencebook.ca

Show `best_params_` for classification? #281

Closed ttimbers closed 9 months ago

ttimbers commented 9 months ago

In the Python version, the classification chapter doesn't save the object that we would ask for `best_params_` from, so unless students read the regression chapter and extrapolate from it, they will have a hard time. They will think they need to make a data frame of all the parameters and their scores, sort it, and grab the value.

In some sense, it is good to get the values from the data frame, since you should really look at more than just one number. But if you do agree with the best parameter suggested by the CV procedure, then it's nice to be able to get it programmatically.
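To make the two routes concrete, here is a minimal sketch, assuming a scikit-learn `GridSearchCV` over a K-nearest neighbors pipeline. The synthetic data, the grid values, and the names `tune_grid`, `results`, `k_from_table`, and `k_from_attr` are placeholders for illustration, not the book's actual code.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption; not the data set used in the book).
X, y = make_classification(n_samples=200, n_features=5, random_state=1)

pipe = make_pipeline(StandardScaler(), KNeighborsClassifier())
param_grid = {"kneighborsclassifier__n_neighbors": list(range(1, 30, 5))}

tune_grid = GridSearchCV(pipe, param_grid, cv=5)
tune_grid.fit(X, y)

# Route 1: build the full results table, sort by score, take the top row.
results = pd.DataFrame(tune_grid.cv_results_)
best_row = results.sort_values("mean_test_score", ascending=False).iloc[0]
k_from_table = best_row["param_kneighborsclassifier__n_neighbors"]

# Route 2: ask the fitted search object directly.
k_from_attr = tune_grid.best_params_["kneighborsclassifier__n_neighbors"]

print(k_from_table, k_from_attr)
```

Both routes should point to the same mean score, though if several values of K tie for the best score, the sorted table and `best_params_` may pick different rows. The table still has the advantage of showing how close the runner-up values are.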

ttimbers commented 9 months ago

This also might be a nice improvement for the R book for tidymodels, if a similar attribute exists after CV tuning.

trevorcampbell commented 9 months ago

This may be a duplicate of #205 -- I will check this and that previous issue later

trevorcampbell commented 9 months ago

yep duplicate, but i'll leave this open for now until there's a PR or otherwise closure

joelostblom commented 9 months ago

> for classification we don't save the object that we would ask for best_params_ from

We actually do save the object, since `.fit` is an in-place operation and we call it when getting the results data frame:

[image: code screenshot] (I added the last line just to show that `cancer_tune_grid` is modified in place)
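Since the screenshot may not render everywhere, here is a rough sketch of the same point. It uses synthetic data and a generic `tune_grid` object in place of `cancer_tune_grid`, so the data, grid values, and variable names are assumptions rather than the chapter's exact code.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption; not the book's data set).
X, y = make_classification(n_samples=200, n_features=5, random_state=1)

tune_grid = GridSearchCV(
    make_pipeline(StandardScaler(), KNeighborsClassifier()),
    {"kneighborsclassifier__n_neighbors": [1, 6, 11, 16]},
    cv=5,
)

# Before fitting, the search object has no best_params_ attribute.
fitted_before = hasattr(tune_grid, "best_params_")

# Calling .fit inline while building the results data frame
# still fits tune_grid itself, because fit returns the same object.
accuracies_grid = pd.DataFrame(tune_grid.fit(X, y).cv_results_)

# After that line, best_params_ is available on the original object.
fitted_after = hasattr(tune_grid, "best_params_")
print(fitted_before, fitted_after)
print(tune_grid.best_params_)
```

So even when `.fit` is only called inline, `best_params_` can be read from the search object afterwards without refitting.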

I agree that it would be nice to show `best_params_` explicitly here as per #205, and I also think we should be more explicit and consistent when using `fit`, as per https://github.com/UBC-DSCI/introduction-to-datascience-python/issues/233 and #134.