openml / openml.org

New OpenML website
https://new.openml.org
BSD 3-Clause "New" or "Revised" License

Dataset Comparison tool #322

Open joaquinvanschoren opened 7 months ago

joaquinvanschoren commented 7 months ago

Proposed by @ogrisel - A 'comparison' view to see how two datasets differ, including for instance:

Possible approach: the new dataset table view allows users to select rows and perform actions on the selected datasets. 'Compare' could be one such action.
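To make the idea concrete, here is a minimal sketch of what a 'Compare' action might compute for two selected datasets. It is only an illustration: the `compare_datasets` function, the column-dictionary representation, and the specific fields reported are all assumptions, not part of the OpenML website or API.

```python
from typing import Any, Dict, List

# Hypothetical in-memory representation: column name -> list of values.
Dataset = Dict[str, List[Any]]


def n_rows(ds: Dataset) -> int:
    """Number of rows, taken as the length of the longest column."""
    return max((len(values) for values in ds.values()), default=0)


def column_types(values: List[Any]) -> List[str]:
    """Sorted type names of the non-missing values in a column."""
    return sorted({type(v).__name__ for v in values if v is not None})


def n_missing(values: List[Any]) -> int:
    """Count missing entries (represented here as None)."""
    return sum(v is None for v in values)


def compare_datasets(old: Dataset, new: Dataset) -> Dict[str, Any]:
    """Summarise how `new` differs from `old` (assumed report fields)."""
    shared = sorted(set(old) & set(new))
    return {
        "rows": (n_rows(old), n_rows(new)),
        "removed_columns": sorted(set(old) - set(new)),
        "added_columns": sorted(set(new) - set(old)),
        "type_changes": {
            c: (column_types(old[c]), column_types(new[c]))
            for c in shared
            if column_types(old[c]) != column_types(new[c])
        },
        "missing_value_changes": {
            c: (n_missing(old[c]), n_missing(new[c]))
            for c in shared
            if n_missing(old[c]) != n_missing(new[c])
        },
    }


# Two made-up versions of the same dataset.
v1 = {"id": [1, 2, 3], "age": [25, 31, None], "income": ["low", "high", "low"]}
v2 = {
    "age": [25.0, 31.0, 40.0, 52.0],
    "income": ["low", "high", "low", "mid"],
    "target": [0, 1, 0, 1],
}
diff = compare_datasets(v1, v2)
```

In this example `diff` reports that `id` was removed, `target` was added, `age` changed from `int` to `float` and lost its one missing value, and the row count went from 3 to 4. A real comparison view would presumably work from the dataset metadata already stored on the server rather than from raw values.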

ogrisel commented 7 months ago

Thanks for opening this feature request. A related feature request would be to ask the dataset uploaders to better trace the lineage of their uploads.

For instance, uploaders could link to a public git repo with a script that reproduces the version of the data uploaded to openml.org from the original raw data (if that data is publicly available on another website).

Similarly, when uploading a new version, it would be helpful to document the relevant changes in such a script.
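As a rough illustration of what such lineage information could look like, the sketch below builds a small record that ties an uploaded version to its raw source, the script that produced it, and the changes made. The field names, the `lineage_record` helper, and the example URL are hypothetical; nothing here reflects an existing OpenML upload format.

```python
import hashlib
import json
from typing import List


def fingerprint(data: bytes) -> str:
    """SHA-256 hex digest, used to identify an exact version of the data."""
    return hashlib.sha256(data).hexdigest()


def lineage_record(
    raw: bytes, processed: bytes, script_url: str, changes: List[str]
) -> str:
    """Return a JSON lineage record (hypothetical schema)."""
    record = {
        "raw_sha256": fingerprint(raw),
        "processed_sha256": fingerprint(processed),
        "script_url": script_url,  # public repo script that maps raw -> processed
        "changes": changes,        # human-readable notes for this version
    }
    return json.dumps(record, indent=2)


# Made-up data and a placeholder URL, for illustration only.
raw_csv = b"id,age,income\n1,25,low\n2,31,high\n"
processed_csv = b"age,income\n25,low\n31,high\n"
record_json = lineage_record(
    raw_csv,
    processed_csv,
    "https://example.org/some-repo/prepare_dataset.py",
    ["dropped the id column"],
)
```

Storing the two digests alongside the script link would let anyone rerun the script on the raw data and check that the result matches the uploaded version.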