zellerlab / siamcat

R package for Statistical Inference of Associations between Microbial Communities And host phenoType
https://siamcat.embl.de/

make model size smaller #13

Closed jakob-wirbel closed 4 years ago

jakob-wirbel commented 4 years ago

For glmnet models, about 90% of the model size comes from the call stored in the learner.model$glmnet.fit part of the wrapped model. I don't think the call is needed, so we could delete it directly within the siamcat function. That would give smaller models that take up less space when exported as an RData object.
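To illustrate why a stored call can dominate the size of a fitted model, here is a minimal sketch using only base R. It uses lm() as a stand-in (an assumption for illustration, not the glmnet or SIAMCAT code): when a model is fitted through do.call(), the arguments are evaluated first, so the stored call embeds a full copy of the training data. The variable names (train, fit, new_data) are hypothetical.

```r
# Stand-in example with lm() from base R; this is NOT SIAMCAT or glmnet code.
set.seed(42)
train <- data.frame(y = rnorm(5000), x = rnorm(5000))

# do.call() evaluates its arguments before calling lm(), so the call that
# lm() stores in fit$call contains the whole data frame, not just its name.
fit <- do.call(lm, list(formula = y ~ x, data = train))

new_data <- data.frame(x = c(-1, 0, 1))
pred_before <- predict(fit, newdata = new_data)
size_before <- object.size(fit)

# Drop the stored call (the analogue of learner.model$glmnet.fit$call
# mentioned above; whether that is safe for glmnet is an assumption here).
fit$call <- NULL

pred_after <- predict(fit, newdata = new_data)
size_after <- object.size(fit)

print(format(size_before, units = "Kb"))
print(format(size_after, units = "Kb"))

# Predictions on new data should be unchanged by removing the call.
stopifnot(isTRUE(all.equal(pred_before, pred_after)))
```

In this stand-in, prediction does not rely on the stored call, which matches the reasoning in this comment. Functions that re-evaluate the call, such as update(), would no longer work on the stripped object.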

jakob-wirbel commented 4 years ago

This does not seem to be needed for lasso_ll (or rather classif.LiblineaRL1LogReg) models, since the call is not stored in the wrapped learner.model object.

Actually, here are example object.size results for the list of models trained on the data in the vignette:

# A tibble: 3 x 2
  ml.method    size  
  <chr>        <chr>  
1 lasso        4.1 Mb 
2 lasso_ll     0.8 Mb 
3 randomForest 11.3 Mb
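A table like the one above can be put together roughly as follows. This is only a sketch: the entries of models are hypothetical placeholders (lists of random vectors), not real SIAMCAT model lists, so the sizes it prints will not match the numbers above.

```r
# Hypothetical stand-ins for the per-method model lists; not real models.
set.seed(1)
models <- list(
  lasso        = replicate(5, rnorm(1e4), simplify = FALSE),
  lasso_ll     = replicate(5, rnorm(2e3), simplify = FALSE),
  randomForest = replicate(5, rnorm(3e4), simplify = FALSE)
)

# object.size() estimates the memory used by each object;
# format(..., units = "Mb") turns the estimate into a readable string.
size_table <- data.frame(
  ml.method = names(models),
  size = vapply(
    models,
    function(m) format(object.size(m), units = "Mb"),
    character(1)
  ),
  row.names = NULL,
  stringsAsFactors = FALSE
)

print(size_table)
```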

Unfortunately, I did not see a way to reduce the size of the random forest models :-(

But the size of the lasso models could be reduced from 4.1 Mb to 2 Mb.