openml / openml-r

R package to interface with OpenML
http://openml.github.io/openml-r/

Task conversion fails in stratified CV when a class is smaller than number of folds #450

Open ja-thomas opened 4 years ago

ja-thomas commented 4 years ago
task = convertOMLTaskToMlr(getOMLTask(2073))

Error in instantiateResampleInstance.CVDesc(desc, length(ci), task) : 
  Cannot use more folds (10) than size (5)!

The mlr task:

Browse[2]> mlr.task
Supervised task: yeast
Type: classif
Target: class_protein_localization
Observations: 1484
Features:
   numerics     factors     ordered functionals 
          8           0           0           0 
Missings: FALSE
Has weights: FALSE
Has blocking: FALSE
Has coordinates: FALSE
Classes: 10
CYT NUC MIT ME3 ME2 ME1 EXC VAC POX ERL 
463 429 244 163  51  44  35  30  20   5 
Positive class: NA

The resample description:

Browse[2]> estim.proc

Estimation Method :: crossvalidation
        Parameters:
                number_repeats = 1
                number_folds = 10
                stratified_sampling = true
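
The mismatch can be checked by hand from the two printouts above. The following is a minimal base R sketch, assuming (as the issue title and the error message suggest) that stratified CV splits each class separately, so every class needs at least as many observations as there are folds. The variable names `class_counts`, `n_folds`, and `too_small` are illustrative only and are not part of OpenML or mlr.

```r
# Class counts copied from the mlr task printout (yeast, 1484 observations).
class_counts <- c(CYT = 463, NUC = 429, MIT = 244, ME3 = 163, ME2 = 51,
                  ME1 = 44, EXC = 35, VAC = 30, POX = 20, ERL = 5)

# number_folds from the estimation procedure.
n_folds <- 10

# Assumption: with stratification each class is split into n_folds parts,
# so any class with fewer than n_folds observations cannot be split.
too_small <- class_counts[class_counts < n_folds]

print(too_small)
```

Only ERL (5 observations) falls below the 10 folds, which matches the "folds (10)" and "size (5)" values in the error message.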

On the Python side this case seems to be handled (somehow).