hageldave / DimRedDatasets

Ready to use [Java]: Multidimensional data sets commonly used to benchmark dimensionality reduction.
MIT License

PalmerPenguins data array should only contain the measurements #3

Open hageldave opened 1 year ago

hageldave commented 1 year ago

https://github.com/hageldave/DimRedDatasets/blob/97956d33c07ed27b03b55efa0f76f42ffc144b4c/proj/src/main/java/hageldave/dimred/datasets/regular/PalmerPenguins.java#L86

In the code, the whole table with 8 columns is stored as the data array. However, only 4 of the columns contain measurements (bill length, weight, ...); the others are the different class/category memberships. The data field should contain only the 4 columns corresponding to measurements; Arrays.copyOfRange should do the trick. The names of these 4 columns should also be stored in a String[] field of the class.
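A minimal sketch of the suggested fix, not the actual patch. It assumes the table is a double[][] with one row per penguin and that the 4 measurement columns are contiguous at indices 2 to 5, as in the original palmerpenguins CSV; the column order in PalmerPenguins.java may differ. The class name, constants, column names, and method name are hypothetical.

```java
import java.util.Arrays;

final class PenguinMeasurements {
    // ASSUMPTION: measurement columns occupy indices 2..5 of each row
    // (layout of the palmerpenguins CSV); adjust to the real table layout.
    static final int FIRST_MEASUREMENT = 2; // inclusive
    static final int END_MEASUREMENT = 6;   // exclusive

    // Hypothetical names for the 4 measurement columns, in column order.
    static final String[] MEASUREMENT_NAMES = {
        "bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g"
    };

    /** Returns a new array holding only the measurement columns of each row. */
    static double[][] measurementsOnly(double[][] table) {
        double[][] data = new double[table.length][];
        for (int i = 0; i < table.length; i++) {
            // copyOfRange copies indices [from, to) into a fresh array
            data[i] = Arrays.copyOfRange(table[i], FIRST_MEASUREMENT, END_MEASUREMENT);
        }
        return data;
    }
}
```

Copying per row leaves the full table untouched, so the category columns can still be read from it when filling the class/label fields.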

hageldave commented 1 year ago

Oh, I guess one of the columns corresponds to the year and not to a category or measurement. Maybe that can get a separate int[].
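Pulling the year out into its own int[] could look roughly like the sketch below. It assumes the table is a double[][] and that the year sits in the last of the 8 columns (index 7), as in the palmerpenguins CSV; the class, constant, and method names are hypothetical.

```java
final class PenguinYears {
    // ASSUMPTION: the year is stored in column 7 of the 8-column table.
    static final int YEAR_COLUMN = 7;

    /** Extracts the year column as an int[] with one entry per row. */
    static int[] yearColumn(double[][] table) {
        int[] years = new int[table.length];
        for (int i = 0; i < table.length; i++) {
            // values were parsed as doubles; round before narrowing to int
            years[i] = (int) Math.round(table[i][YEAR_COLUMN]);
        }
        return years;
    }
}
```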