mlr-org / mlr3

mlr3: Machine Learning in R - next generation
https://mlr3.mlr-org.com
GNU Lesser General Public License v3.0

Remove Boston Housing #1186

Closed: mfeurer closed this issue 2 weeks ago

mfeurer commented 1 month ago

scikit-learn removed the Boston Housing dataset because it promotes racism: https://github.com/scikit-learn/scikit-learn/issues/16155

I suggest dropping the Boston Housing dataset from mlr3 as well.

Arnold-Kakas commented 1 month ago

+1. I don't know whether there is a good candidate for a replacement, but I have a publicly available dataset here: https://www.kaggle.com/datasets/arnoldkakas/real-estate-dataset It is not cleaned and contains duplicates and a lot of missing values (all details are in the description). After some cleaning and geocoding, though, it could also be used for spatial analysis and modelling.

berndbischl commented 4 weeks ago

Yes, will do; we are in agreement here.

We will (unfortunately) also have to check the book and the gallery. For the former, I am pretty sure we don't use it (I hope!). For the latter, we will see.

be-marc commented 2 weeks ago

@Arnold-Kakas Thank you for the offer. We have already prepared the ames_housing data set in our mlr3data package and use it as a replacement.

be-marc commented 2 weeks ago

Done in #1197.