openml / openml-r

R package to interface with OpenML
http://openml.github.io/openml-r/

make custom flows/mlr learners reproducible #294

Closed giuseppec closed 7 years ago

giuseppec commented 7 years ago

Currently, people can write their own mlr learners and create flows for them. However, since the makeRLearner, trainLearner and predictLearner S3 functions for custom flows are not uploaded to OpenML, other people can't reproduce runs created by custom flows. We need a way to handle this properly in OpenML. @berndbischl @jakobbossek, any ideas/suggestions?

giuseppec commented 7 years ago

Is it sufficient to also upload the makeRLearner.regr.mycustomlearner, trainLearner.regr.mycustomlearner and predictLearner.regr.mycustomlearner objects together with the learner object?
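
To make the question concrete, here is a minimal base-R sketch of what "uploading the three functions together with the learner" could look like on the R side: collect the three S3 functions into one list and serialize it to a file. This is an assumption for illustration only, not an existing OpenML or mlr feature. The helper name bundleLearnerFunctions is hypothetical, and the three learner functions are simplified stand-ins that do not call mlr.

```r
# Stand-in custom learner functions (hypothetical, simplified; real ones would use mlr).
makeRLearner.regr.mycustomlearner = function() {
  list(id = "regr.mycustomlearner")
}
trainLearner.regr.mycustomlearner = function(.learner, .task, .subset, .weights = NULL, ...) {
  # Here .task is assumed to be a plain data.frame with columns x and y.
  lm(y ~ x, data = .task[.subset, ])
}
predictLearner.regr.mycustomlearner = function(.learner, .model, .newdata, ...) {
  unname(predict(.model, newdata = .newdata))
}

# Hypothetical helper: look up the three S3 functions for one learner class.
bundleLearnerFunctions = function(learner.class, envir = parent.frame()) {
  generics = c("makeRLearner", "trainLearner", "predictLearner")
  fun.names = paste(generics, learner.class, sep = ".")
  mget(fun.names, envir = envir, mode = "function", inherits = TRUE)
}

# Serialize the bundle to a file that could be shared next to the flow.
bundle = bundleLearnerFunctions("regr.mycustomlearner")
path = tempfile(fileext = ".rds")
saveRDS(bundle, path)

# Someone reproducing the run would read the file and restore the functions.
restored = readRDS(path)
```

Note that such a file only carries the function bodies. It does not record which package versions the functions depend on, so it would not be sufficient on its own.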

HeidiSeibold commented 7 years ago

Would be great if this could be discussed soon, please.

I wanna test the new ctree: https://github.com/HeidiSeibold/sandbox/blob/master/rstuff/new_ctree_mlr.R

giuseppec commented 7 years ago

@HeidiSeibold you can still test this. Just create your own mlr fork that includes your custom learner and do something like the following, so that the flow is at least "reproducible" for people who read its description:

flow = convertMlrLearnerToOMLFlow(makeLearner("classif.heidis.ctree"))
flow$description = "Please use the mlr version from LinkToHeidisMLRFork"
uploadOMLFlow(flow)

Or is something else blocking you?

HeidiSeibold commented 7 years ago

I didn't want to spam OpenML with stuff that isn't standardized. I'll just try to get it to a reasonably reproducible level. To reproduce, people need

  1. the R-Forge link to the package with the correct version.
  2. the link to the correct version of the mlr add-on.

  1. is added as
    makeRLearner.classif.newctree = function() {
      makeRLearnerClassif( ...,
        note = "Devel partykit package revision 1034: https://r-forge.r-project.org/scm/viewvc.php/pkg/devel/partykit/?root=partykit&pathrev=1034"
      )
    }
  2. is added as
    flow$description = "Please use the mlr add-on code https://github.com/HeidiSeibold/sandbox/blob/ed03326adacf4469f994b0c23ac4ecb0cb013ba3/rstuff/new_ctree_mlr.R"

Would that work? Should both links be in the flow$description?

giuseppec commented 7 years ago

Looks good to me. You could also add both links to the flow description; they don't hurt anybody. And don't worry about "spamming": you are doing it much more properly than you think.
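
As a rough sketch of putting both links into one description, the base-R helper below joins labelled links into a single string. The helper name buildReproDescription is hypothetical and not part of the OpenML package. The URLs are the ones quoted earlier in this thread.

```r
# Hypothetical helper: turn a named character vector of links into one
# description string, one "label: url" entry per line.
buildReproDescription = function(links) {
  stopifnot(is.character(links), !is.null(names(links)))
  paste(sprintf("%s: %s", names(links), links), collapse = "\n")
}

desc = buildReproDescription(c(
  "Devel partykit package revision 1034" =
    "https://r-forge.r-project.org/scm/viewvc.php/pkg/devel/partykit/?root=partykit&pathrev=1034",
  "mlr add-on code" =
    "https://github.com/HeidiSeibold/sandbox/blob/ed03326adacf4469f994b0c23ac4ecb0cb013ba3/rstuff/new_ctree_mlr.R"
))

cat(desc, "\n")
```

The resulting string could then be assigned to flow$description before calling uploadOMLFlow(flow), as in the snippet earlier in the thread.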

giuseppec commented 7 years ago

I don't think we will support this with mlr. However, this might be possible with mlrng. Therefore closing for now.