INRIA / scikit-learn-mooc

Machine learning in Python with scikit-learn MOOC
https://inria.github.io/scikit-learn-mooc
Creative Commons Attribution 4.0 International

Better terms for predictor, regressor, classifier throughout the course #352

Closed lesteve closed 3 years ago

lesteve commented 3 years ago

This is probably too much jargon. It is also a source of confusion, because "regressors" can also mean the features used in a regression (and similarly for "predictors"), as mentioned in the scikit-learn glossary, for example: https://scikit-learn.org/stable/glossary.html#term-predictor

In statistics, "predictors" refers to features.

The terms we agreed on:

lesteve commented 3 years ago

We probably want to remove or update the corresponding entries in the glossary ...

ArturoAmorQ commented 3 years ago

> We probably want to remove or update the corresponding entries in the glossary ...

It is just a bit awkward that "Classifier" and "Regressor" would still be embedded in `KNeighborsClassifier` and `DecisionTreeRegressor`, but I cannot find any simple workaround in this case.

lesteve commented 3 years ago

Good point, we may want to keep the regressor and classifier entries in the glossary then ...

Another point, probably not that important: my feeling was also that "regressor"/"classifier" is quite often used to mean a scikit-learn object, as in the glossary definition, whereas "regression model"/"classification model" has a more general meaning.
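
To make the scikit-learn sense of the words concrete, here is a minimal sketch, assuming a reasonably recent scikit-learn install. `is_regressor` and `is_classifier` are helpers from `sklearn.base`; the `estimators` list and the `kinds` dict are purely illustrative names, not part of the course material.

```python
from sklearn.base import is_classifier, is_regressor
from sklearn.linear_model import Ridge
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeRegressor

# Unfitted estimator instances; no data is needed for this check.
estimators = [DecisionTreeRegressor(), Ridge(), KNeighborsClassifier()]

# Map each class name to the kind of estimator scikit-learn reports.
kinds = {}
for est in estimators:
    name = type(est).__name__
    if is_regressor(est):
        kinds[name] = "regressor"
    elif is_classifier(est):
        kinds[name] = "classifier"
    else:
        kinds[name] = "other"

for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

Note that `Ridge` is reported as a regressor even though its class name does not contain the word, which suggests the term describes what the object does rather than how it is named.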

ArturoAmorQ commented 3 years ago

One option is to keep both terms in the glossary and add a small explanation, for instance:

### regression model

A regression model is a [predictive model](#predictive-model) in a [regression](#regression)
setting.

In scikit-learn, `DecisionTreeRegressor` and `Ridge` are regression model classes.

### regressor

The term "regressor" is quite often used to mean a scikit-learn object implementing a [regression model](#regression-model).

Note: "regressors" can also mean the features used in a regression.

Or even just keeping the latter definition may suffice. What do you think?

GaelVaroquaux commented 3 years ago

> What do you think?

I like your suggestion above!

lesteve commented 3 years ago

The weak consensus from Tuesday's meeting is that it was not worth it: too much effort for too little impact.