openml / OpenML

Open Machine Learning
https://openml.org
BSD 3-Clause "New" or "Revised" License

Meta-information on flows #323

Open mfeurer opened 8 years ago

mfeurer commented 8 years ago

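It would be helpful to have meta-information on flows, such as:

  • do they work on sparse data
  • which tasks do they work on
  • what is their output (some flows might only output a transformation of the input space)
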
This would allow the user to know what to do with a flow.

joaquinvanschoren commented 8 years ago

Sounds good. Would you want users/clients to upload this information? Or would you want to deduce it automatically from the uploaded runs?

Can point 3 be deduced from point 2 (the task sort of defines the output)?

On Tue, Sep 13, 2016 at 11:13 AM Matthias Feurer notifications@github.com wrote:

It would be helpful to have meta-information on flows, such as:

  • do they work on sparse data
  • which tasks do they work on
  • what is their output (some flows might only output a transformation of the input space)

This would allow the user to know what to do with a flow.


mfeurer commented 8 years ago

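I wouldn't know how to deduce such information automatically. Let's take scikit-learn and the current prototype as an example:

  • A parameter search flow could be used for either regression or classification; it depends on the components.
  • A scaling flow could be used for either regression or classification; it depends on whether it is used as a component in a regression or classification flow.

Thus, I think the toolkits have to upload this information.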

joaquinvanschoren commented 8 years ago

If the toolkits can provide this information, that would be great. We could add a number of non-required(?) fields that the toolkits could fill in as well as possible. For instance, based on Matthias's comments:

It would be good to brainstorm a bit about this list so we could add it at once.
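
As a rough illustration of what non-required fields could look like, here is a minimal sketch. The class `FlowMeta`, its field names, and the helper `filled_fields` are hypothetical: they are derived from the three points raised at the top of this thread and are not part of any OpenML schema or client API. Each field defaults to `None`, meaning the toolkit did not state it.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class FlowMeta:
    """Hypothetical optional meta-information a toolkit could attach to a flow."""

    # None means "not stated by the toolkit", which is distinct from False.
    supports_sparse: Optional[bool] = None
    # Task types the flow works on, e.g. ["classification", "regression"].
    task_types: Optional[List[str]] = None
    # What the flow produces, e.g. "predictions" or "transformation".
    output_kind: Optional[str] = None


def filled_fields(meta: FlowMeta) -> dict:
    """Return only the fields the toolkit actually filled in."""
    return {key: value for key, value in asdict(meta).items() if value is not None}


# A scaling flow: its task type depends on where it is used as a component,
# so task_types is left unset rather than guessed.
scaler_meta = FlowMeta(supports_sparse=True, output_kind="transformation")
print(filled_fields(scaler_meta))
```

Leaving a field unset, rather than forcing a value, would let a toolkit stay silent about properties that only become known once the flow is combined with other components.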

On Tue, Sep 13, 2016 at 11:28 AM Matthias Feurer notifications@github.com wrote:

I wouldn't know how to deduce such information automatically. Let's take scikit-learn and the current prototype as an example:

  • A parameter search flow could be used for either regression or classification; it depends on the components.
  • A scaling flow could be used for either regression or classification; it depends on whether it is used as a component in a regression or classification flow.

Thus, I think the toolkits have to upload this information.


mfeurer commented 8 years ago

A few comments: