openml / meta

Repository for issues which are not for any one specific repository (e.g., governance, data models)

Allow easy uploading of image datasets to OpenML #7

Open PGijsbers opened 3 weeks ago

PGijsbers commented 3 weeks ago

In general, image datasets currently consist of a header table plus a directory of files, so "File Dataset" may be a more apt name.

PGijsbers commented 3 weeks ago

From a related item: We are currently experimenting with some prototypes for downloading bucket content with many images; however, many things are left unspecified (and unsupported by packages), e.g.:

- dataset upload (adding auxiliary files in general)
- how should the files be zipped/unzipped, and how can we know at download time how to resolve the paths?
- how should we store metadata, and which metadata should be stored (different types of tasks, bounding boxes, segmentation masks, etc.)?
- add the relevant documentation

More generally, this should extend to a Parquet file describing other files (images, audio, video).
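To make the zipping and path-resolution question concrete, here is a minimal sketch of one possible convention: the header table stores paths relative to the dataset root, the archive preserves those relative paths, and the client resolves them against whatever directory it extracts into. Nothing here is an existing OpenML API. The names `header.csv`, `file_path`, `pack_file_dataset`, and `load_file_dataset` are hypothetical, and CSV stands in for Parquet only so the example runs on the standard library.

```python
import csv
import tempfile
import zipfile
from pathlib import Path
from typing import Dict, List

HEADER_NAME = "header.csv"  # hypothetical name for the header table


def pack_file_dataset(root: Path, archive: Path) -> None:
    """Zip the header table plus every file it references, keeping relative paths."""
    with open(root / HEADER_NAME, newline="") as f:
        rows = list(csv.DictReader(f))
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(root / HEADER_NAME, HEADER_NAME)
        for row in rows:
            rel = row["file_path"]  # assumed column: POSIX-style relative path
            zf.write(root / rel, rel)


def load_file_dataset(archive: Path, target: Path) -> List[Dict[str, object]]:
    """Extract the archive and resolve each relative path against the extraction root."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    with open(target / HEADER_NAME, newline="") as f:
        rows: List[Dict[str, object]] = [dict(r) for r in csv.DictReader(f)]
    for row in rows:
        resolved = (target / str(row["file_path"])).resolve()
        row["resolved_path"] = str(resolved)
        row["exists"] = resolved.is_file()
    return rows


def demo() -> List[Dict[str, object]]:
    """Build a tiny fake dataset, round-trip it through a zip, and return the rows."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        (src / "images").mkdir(parents=True)
        # Placeholder bytes, not real images.
        (src / "images" / "cat.png").write_bytes(b"placeholder")
        (src / "images" / "dog.png").write_bytes(b"placeholder")
        with open(src / HEADER_NAME, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=["file_path", "label"])
            writer.writeheader()
            writer.writerow({"file_path": "images/cat.png", "label": "cat"})
            writer.writerow({"file_path": "images/dog.png", "label": "dog"})
        archive = Path(tmp) / "dataset.zip"
        pack_file_dataset(src, archive)
        return load_file_dataset(archive, Path(tmp) / "out")


if __name__ == "__main__":
    for row in demo():
        print(row["label"], row["resolved_path"], row["exists"])
```

Keeping only relative paths in the header means the table stays valid wherever the archive is unpacked. Task-specific metadata such as bounding boxes or segmentation masks would need further columns or side files, which this sketch does not attempt to specify.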