fatiando / verde

Processing and gridding spatial data, machine-learning style
https://www.fatiando.org/verde
BSD 3-Clause "New" or "Revised" License

Change default scoring to mean squared error #322

Open leouieda opened 3 years ago

leouieda commented 3 years ago

At the moment, the default scoring metric is R², which has the advantage of being unitless, so scores are easy to compare. But it doesn't tell us much about the actual prediction error to expect. It also doesn't work with leave-one-out cross-validation: each test fold holds a single point, for which R² is undefined, so the scores come out as NaN (thanks to @dangilbert1337 for finding this).
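To make the leave-one-out problem concrete, here is a minimal pure-Python sketch of both metrics. It is not Verde's or scikit-learn's code: the functions are hand-written stand-ins, the data values are made up, and returning NaN when the total sum of squares is zero is an assumption chosen to mirror the NaNs reported above.

```python
import math


def mean_squared_error(observed, predicted):
    """Mean of the squared residuals, in squared data units."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    return sum(r**2 for r in residuals) / len(residuals)


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    if ss_tot == 0:
        # Assumption: report NaN when the score is undefined instead of
        # raising ZeroDivisionError, to mirror the NaNs seen in the issue.
        return math.nan
    return 1 - ss_res / ss_tot


# One leave-one-out fold: a single held-out observation (made-up values).
observed, predicted = [3.0], [2.5]
loo_r2 = r2_score(observed, predicted)
loo_mse = mean_squared_error(observed, predicted)
print(loo_r2, loo_mse)
```

With a single held-out point, the observed value equals its own mean, so the denominator of R² is zero and the score is undefined. The MSE is still well defined and is expressed in squared data units.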

With Verde 1.6.0, we can specify a different metric for cross_val_score and there is a private function score_estimator that could be used instead of the score method. But it would be much more convenient to get the MSE from score instead of always having to use these other options.

What do people think about making this change? A 👍🏽 👎🏽 here would be appreciated.

This would break backward compatibility so it should be reserved for Verde 2.0. We might want to start a branch for that so we can begin to work on these features instead of letting them sit in the issues.

leouieda commented 3 years ago

If we decide to do this, we should include a FutureWarning in the score method for the next release.
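A rough sketch of what such a warning could look like, using only the standard `warnings` module. `ExampleGridder`, its constant prediction, and the message wording are all hypothetical and are not taken from Verde's code base; the method still returns R² so current behaviour is unchanged.

```python
import warnings


class ExampleGridder:
    """Hypothetical stand-in for a gridder; not a real Verde class."""

    def predict(self, coordinates):
        # Dummy model: always predict the same constant value.
        return [2.0 for _ in coordinates]

    def score(self, coordinates, data):
        # Announce the planned change while keeping the current R² behaviour.
        warnings.warn(
            "The default scoring will change from R² to mean squared error "
            "in Verde 2.0.",  # hypothetical message wording
            FutureWarning,
            stacklevel=2,
        )
        predicted = self.predict(coordinates)
        mean = sum(data) / len(data)
        ss_res = sum((d - p) ** 2 for d, p in zip(data, predicted))
        ss_tot = sum((d - mean) ** 2 for d in data)
        return 1 - ss_res / ss_tot


# Record warnings so we can check that the FutureWarning is emitted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    score = ExampleGridder().score([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])

categories = [w.category for w in caught]
print(score, categories)
```

`FutureWarning` is shown to end users by default, unlike `DeprecationWarning`, which suits a change that affects people calling `score` directly.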