openml / openml-r

R package to interface with OpenML
http://openml.github.io/openml-r/

Error running cforest on task 3573 #349

Closed HeidiSeibold closed 7 years ago

HeidiSeibold commented 7 years ago

When I run

library("OpenML")

tmnist <- getOMLTask(task.id = 3573)

ctr <- makeLearner("classif.ctree")
restr <- runTaskMlr(tmnist, ctr)
uploadOMLRun(restr)

cfrst <- makeLearner("classif.cforest")
resfrst <- runTaskMlr(tmnist, cfrst)
uploadOMLRun(resfrst)

I get

Loading required package: mlr
Loading required package: ParamHelpers
Downloading from 'https://www.openml.org/api/v1/task/3573' to '/tmp/Rtmp6Ygsyz/cache/tasks/3573/task.xml'.
Downloading from 'https://www.openml.org/api_splits/get/3573/Task_3573_splits.arff' to '/tmp/Rtmp6Ygsyz/cache/tasks/3573/datasplits.arff'
Data '554' file 'description.xml' found in cache.
Data '554' file 'dataset.arff' found in cache.
Loading required package: readr
Removing 65 columns: pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,pixel9,pixel10,pixel11,pixel12,pixel17,pixel18,pixel19,pixel20,pixel21,pixel22,pixel23,pixel24,pixel25,pixel26,pixel27,pixel28,pixel29,pixel30,pixel31,pixel32,pixel53,pixel54,pixel55,pixel56,pixel57,pixel58,pixel83,pixel84,pixel85,pixel86,pixel112,pixel113,pixel141,pixel169,pixel477,pixel561,pixel645,pixel672,pixel673,pixel674,pixel700,pixel701,pixel702,pixel728,pixel729,pixel730,pixel731,pixel755,pixel756,pixel757,pixel758,pixel759,pixel760,pixel781,pixel782,pixel783,pixel784
Task: mnist_784, Learner: classif.cforest
[Resample] cross-validation iter 1: Error in .setupMethodsTables(fdef, initialize = TRUE) : 
  trying to get slot "group" from an object of a basic class ("NULL") with no slots
Calls: runTaskMlr ... validObject -> as -> .getMethodsTable -> .setupMethodsTables
Execution halted

Computing and uploading the ctree run on its own works:

library("OpenML")

tmnist <- getOMLTask(task.id = 3573)

ctr <- makeLearner("classif.ctree")
restr <- runTaskMlr(tmnist, ctr)
uploadOMLRun(restr)

My sessionInfo:

> sessionInfo( )
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=de_CH.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_CH.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=de_CH.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=de_CH.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] readr_1.0.0      OpenML_1.1       mlr_2.10         ParamHelpers_1.9

loaded via a namespace (and not attached):
 [1] parallelMap_1.3   Rcpp_0.12.7       plyr_1.8.4        tools_3.3.2      
 [5] digest_0.6.10     jsonlite_1.1      memoise_1.0.0     tibble_1.2       
 [9] gtable_0.2.0      checkmate_1.8.2   lattice_0.20-34   Matrix_1.2-7.1   
[13] curl_2.3          parallel_3.3.2    mvtnorm_1.0-5     coin_1.1-2       
[17] httr_1.2.1        stats4_3.3.2      grid_3.3.2        data.table_1.10.0
[21] farff_1.0         R6_2.2.0          XML_3.98-1.5      survival_2.40-1  
[25] multcomp_1.4-6    TH.data_1.0-7     ggplot2_2.1.0     codetools_0.2-15 
[29] MASS_7.3-44       backports_1.0.4   scales_0.4.0      BBmisc_1.10      
[33] modeltools_0.2-21 splines_3.3.2     assertthat_0.1    strucchange_1.5-1
[37] colorspace_1.2-6  sandwich_2.3-4    stringi_1.1.1     party_1.0-25     
[41] munsell_0.4.3     zoo_1.7-13

HeidiSeibold commented 7 years ago

Might possibly help:

https://github.com/hadley/lubridate/issues/314
https://github.com/twitter/AnomalyDetection/issues/76

The problem seems to be related to methods.

Trying to figure out now whether the problem arises in OpenML, mlr or party.

giuseppec commented 7 years ago

Does this only happen with this task/data? It is hard for me to reproduce this error because the data is "big" (well, at least big enough that the cforest seems to train forever and uses a very large amount of memory...). So it would be great if you could figure out where the problem occurs.

HeidiSeibold commented 7 years ago

It's the first time I've seen this happen.

It is hard for me to reproduce this error because the data is "big" (well, at least big enough that the cforest seems to train forever and uses a very large amount of memory...). So it would be great if you could figure out where the problem occurs.

Doing my best. Will report back as soon as I know more.

giuseppec commented 7 years ago

Maybe you could check whether this also happens with a subset of this data set (so you don't have to wait through the long training time)?
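A minimal sketch of what such a subset run could look like, assuming the task converts cleanly with convertOMLTaskToMlr(). The subset size of 500 rows and the seed are arbitrary choices for illustration, not values from this thread, and the error may not show up on a subset at all.

```r
library("OpenML")  # also attaches mlr, as the log above shows

tmnist <- getOMLTask(task.id = 3573)

# Convert the OpenML task to mlr objects and keep only the mlr task.
mlr.task <- convertOMLTaskToMlr(tmnist)$mlr.task

# Draw a small random subset of rows (500 is an arbitrary, assumed size).
set.seed(1)
rows <- sample(getTaskSize(mlr.task), 500)
small <- subsetTask(mlr.task, subset = rows)

# Subsetting can leave additional constant pixel columns; drop them.
small <- removeConstantFeatures(small)

# Train cforest directly through mlr, bypassing runTaskMlr and the upload.
cfrst <- makeLearner("classif.cforest")
mod <- train(cfrst, small)
```

If train() already fails here, OpenML's run and upload code is probably not involved, which would leave mlr or party as the likely source.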

giuseppec commented 7 years ago

Closing this now as I could not reproduce it. However, this looks more like an mlr issue.