Open ogencoglu opened 7 years ago
Hi @ogencoglu, sounds like a cool idea, thanks. Any thoughts on how to approach clustering them?
I think it is just manual work. Not all datasets may be suitable for this, but many machine learning people look for datasets to try their algorithms/implementations on a smaller scale before moving to well-known benchmark datasets.
Agreed, it'd be nice to filter for regression or classification, but I don't see how you could properly categorize datasets.
A regression dataset could be a classification dataset, and vice versa, depending on your preprocessing strategy (e.g. binning) and target feature.
For example, the canonical iris dataset, used for classification, could be viewed as regression too.
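To make the binning point concrete, here is a minimal sketch in plain Python. The helper `bin_target`, the bin edges, and the petal-length values are all made up for illustration (they are not real iris rows); the only point is that the same column can serve as a regression target as-is or as a classification target once binned.

```python
from bisect import bisect_right


def bin_target(values, edges, labels):
    """Map each continuous value to the label of the bin it falls in.

    `edges` must be sorted ascending; a value equal to an edge goes
    into the bin to the right of that edge.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    return [labels[bisect_right(edges, v)] for v in values]


# Hypothetical petal lengths in cm (illustrative numbers only).
petal_length = [1.4, 4.7, 6.0, 1.3, 5.1]

# Regression view: predict petal_length itself.
# Classification view: predict the binned label instead.
petal_class = bin_target(
    petal_length,
    edges=[2.5, 5.0],                     # assumed cut points
    labels=["short", "medium", "long"],   # assumed class names
)
print(petal_class)
```

So any "regression" or "classification" tag really describes a chosen target and preprocessing, not the dataset itself.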
My idea was something similar to the UCI data repo: http://archive.ics.uci.edu/ml/datasets.html
The column could be "Default Task". Some datasets may even have both Classification and Regression.
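A rough sketch of what such a column could look like, assuming the labels are assigned by hand. Everything here is hypothetical: `DatasetEntry`, `CATALOG`, and `datasets_for` are names invented for this example, not part of any existing package, and the task labels are illustrative rather than authoritative.

```python
from typing import NamedTuple


class DatasetEntry(NamedTuple):
    dataset_id: str
    title: str
    default_task: frozenset  # one or more of {"classification", "regression"}


# Hand-labelled catalog (labels are illustrative assumptions).
CATALOG = [
    DatasetEntry("iris", "Edgar Anderson's Iris Data",
                 frozenset({"classification"})),
    DatasetEntry("mtcars", "Motor Trend Car Road Tests",
                 frozenset({"classification", "regression"})),
]


def datasets_for(task):
    """Return the ids of datasets whose default task includes `task`."""
    return [e.dataset_id for e in CATALOG if task in e.default_task]


print(datasets_for("classification"))
print(datasets_for("regression"))
```

Storing the task as a set rather than a single string lets one dataset carry both labels, which covers the "both Classification and Regression" case.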
Hi,
It would be nice to have a 3rd column for data() output indicating whether the dataset can be used for regression or classification problems.