hageldave / DimRedDatasets

Ready to use [Java]: Multidimensional data sets commonly used to benchmark dimensionality reduction.
MIT License

Data Set List #1

Open hageldave opened 2 years ago

hageldave commented 2 years ago

Data sets that are included or should be included in the future.

Regular Data

Timeseries

hageldave commented 1 year ago

I added the Breast Cancer Wisconsin dataset to the list (it actually consists of 3 different datasets). @lvcarx, it would be great if you could include these. Each *.names file describes the corresponding *.data file. The datasets also contain missing values, and I think we should simply filter these out, i.e. drop every row that has at least one missing value.
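
A minimal sketch of the row filtering proposed above, not the repo's actual loader. It assumes the *.data files are comma-separated, that missing entries are marked with `?` (worth checking against the respective *.names file), and that every kept field is numeric. The class and method names (`MissingValueFilter`, `parseCompleteRows`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: parses CSV lines and drops rows with missing values. */
final class MissingValueFilter {

    // Assumption: "?" marks a missing entry; verify this in the *.names description.
    static final String MISSING = "?";

    /** Returns only the rows in which every field is present and numeric. */
    static double[][] parseCompleteRows(List<String> lines) {
        List<double[]> rows = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            // limit -1 keeps trailing empty fields, so "1,2," is seen as incomplete
            double[] row = toRow(line.trim().split(",", -1));
            if (row != null) {
                rows.add(row);
            }
        }
        return rows.toArray(new double[0][]);
    }

    /** Converts one record to doubles, or returns null if any field is missing. */
    private static double[] toRow(String[] fields) {
        double[] row = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            String field = fields[i].trim();
            if (field.isEmpty() || field.equals(MISSING)) {
                return null; // ignore the whole row
            }
            row[i] = Double.parseDouble(field);
        }
        return row;
    }
}
```

Calling `MissingValueFilter.parseCompleteRows(lines)` on the lines of a *.data file yields only the complete rows. Non-numeric columns (for example class labels stored as letters) would make `Double.parseDouble` throw, so they would have to be split off or encoded before this step.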