EpistasisLab / pmlb

PMLB: A large, curated repository of benchmark datasets for evaluating supervised machine learning algorithms.
https://epistasislab.github.io/pmlb/
MIT License
805 stars 135 forks

Add tidymodels datasets #104

Open trangdata opened 4 years ago

trangdata commented 4 years ago

We already have some of these small datasets, but others might be great to add to our collection: https://github.com/tidymodels/modeldata

alexzwanenburg commented 2 years ago

Do you know which of the datasets are already in PMLB?

trangdata commented 2 years ago

> Do you know which of the datasets are already in PMLB?

Good question. I don't think we know exactly how much overlap there is with their data list, but it's probably worth doing a quick match on dataset name, number of rows, and number of columns.
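A minimal sketch of what such a name/nrow/ncol match could look like, in Python. Everything here is an assumption for illustration: the `DatasetInfo` record, the `normalize` and `find_candidate_overlaps` helpers, and the placeholder entries are hypothetical and are not part of PMLB or modeldata. In practice the two lists would be built from each collection's real metadata.

```python
from typing import NamedTuple


class DatasetInfo(NamedTuple):
    """Hypothetical summary record: dataset name plus its dimensions."""
    name: str
    nrow: int
    ncol: int


def normalize(name: str) -> str:
    """Lowercase a name and drop non-alphanumeric characters before comparing."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def find_candidate_overlaps(ours, theirs):
    """Return (our_name, their_name, reason) triples that look like the same dataset.

    reason is "name+shape", "name", or "shape", depending on what matched.
    """
    matches = []
    for a in ours:
        for b in theirs:
            same_name = normalize(a.name) == normalize(b.name)
            same_shape = (a.nrow, a.ncol) == (b.nrow, b.ncol)
            if same_name and same_shape:
                matches.append((a.name, b.name, "name+shape"))
            elif same_name:
                matches.append((a.name, b.name, "name"))
            elif same_shape:
                matches.append((a.name, b.name, "shape"))
    return matches


# Placeholder entries for illustration only; these are NOT real dataset dimensions.
pmlb_like = [DatasetInfo("toy_a", 100, 5), DatasetInfo("toy_b", 200, 10)]
modeldata_like = [
    DatasetInfo("Toy-A", 100, 5),
    DatasetInfo("other", 200, 10),
    DatasetInfo("toy_c", 50, 3),
]

for row in find_candidate_overlaps(pmlb_like, modeldata_like):
    print(row)
```

A "shape"-only or "name"-only hit is just a candidate to inspect by hand. Row and column counts can legitimately differ between two copies of the same data (for example, if one copy drops rows with missing values or stores the target column differently), so an exact shape match is a heuristic, not proof of overlap.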