Open motiwari opened 2 years ago
The thesis https://orbi.uliege.be/bitstream/2268/170309/1/thesis.pdf does not include an explicit list of datasets, but it may be worth scanning for useful ones.
A cursory look at recent review articles suggests that there is no canonical set of benchmark tasks for tree-based algorithms.
These collections of datasets could be useful:
- https://scikit-learn.org/stable/datasets/toy_dataset.html
- https://scikit-learn.org/stable/datasets/real_world.html
- https://scikit-learn.org/stable/modules/classes.html#module-sklearn.datasets
- https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_classification.html
Ideally the algorithm is agnostic to the dataset, so that adding more datasets is trivial.
This one seems pretty popular too: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html
We need a few classification and regression datasets.
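One way to keep the algorithm dataset-agnostic might be a small registry that maps a name to a task type and a loader, so adding a dataset is a one-line change. The sketch below is only an illustration: `DATASETS`, `load_dataset`, and `iter_datasets` are hypothetical names, not part of this repo, and the particular scikit-learn loaders and synthetic-data parameters are just example choices.

```python
from sklearn.datasets import (
    load_breast_cancer,
    load_diabetes,
    load_iris,
    load_wine,
    make_classification,
    make_regression,
)

# Hypothetical registry: name -> (task, zero-argument loader returning (X, y)).
# Adding a dataset means adding one entry here.
DATASETS = {
    "iris": ("classification", lambda: load_iris(return_X_y=True)),
    "wine": ("classification", lambda: load_wine(return_X_y=True)),
    "breast_cancer": (
        "classification",
        lambda: load_breast_cancer(return_X_y=True),
    ),
    "synthetic_clf": (
        "classification",
        lambda: make_classification(
            n_samples=500, n_features=20, random_state=0
        ),
    ),
    "diabetes": ("regression", lambda: load_diabetes(return_X_y=True)),
    "synthetic_reg": (
        "regression",
        lambda: make_regression(
            n_samples=500, n_features=20, noise=0.1, random_state=0
        ),
    ),
}


def load_dataset(name):
    """Return (task, X, y) for a registered dataset name."""
    task, loader = DATASETS[name]
    X, y = loader()
    return task, X, y


def iter_datasets(task=None):
    """Yield (name, task, X, y), optionally filtered by task type."""
    for name, (dataset_task, _) in DATASETS.items():
        if task is None or dataset_task == task:
            yield (name, *load_dataset(name))
```

A benchmark loop could then call `iter_datasets("regression")` or `iter_datasets("classification")` and pass `X, y` to whichever tree-based model is under test, without the model code knowing which dataset it received.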