BioSchemas / specifications

Issue tracker, technical wiki, and example markup
https://bioschemas.org

Discussion: How to describe the intended ML task of a training Dataset? #630

Open ljgarcia opened 1 year ago

ljgarcia commented 1 year ago

The ELIXIR Machine Learning Focus Group (including the task force on synthetic data) and NFDI4DataScience (and possibly the RDA FAIR4ML IG) are interested in using metadata to describe the distribution of a dataset for ML training purposes (including the DOME recommendations for Data).

Please let us know your thoughts on the following properties for describing the ML task that a dataset could be used for, probably in combination with an EDAM Operation term.

The drawback of the properties mentioned is their lack of support for DefinedTerm. A discussion about extending support for DefinedTerm in schema.org is ongoing.