RaphaelS1 / survivalmodels

Implementations of survival models in R
https://raphaels1.github.io/survivalmodels/

Add example dataset and examples #1

Closed RaphaelS1 closed 9 months ago

Jesse-Islam commented 3 years ago

I'm trying to feed a custom NN into DNNsurv, but I'm not sure what the exact shape is supposed to look like... Would you have a minimal example of what the keras model should look like?

thanks!

RaphaelS1 commented 3 years ago

Happy to add an example, but can you also let me know whether you read the docs (below) and, if so, how you think they could be clearer? E.g. do they need better wording, or are they too abstract without a separate example?

Documentation for custom model parameter:

Optional custom architecture built with build_keras_net or directly with keras. Output layer should be of length 1 input is number of features plus number of cuts.
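
As a rough illustration of the shape rule quoted above, here is a minimal base-R sketch. The toy data frame is hypothetical (it only mimics the column layout used later in this thread), and it assumes every feature is a single numeric column, so the feature count is simply the number of non-outcome columns.

```r
# Hypothetical toy data: three features (sexF, age, trt) plus the
# time and status columns. Values are made up for illustration only.
df <- data.frame(
  sexF   = c(0, 1, 1, 0),
  age    = c(34, 51, 47, 62),
  trt    = c(1, 0, 1, 0),
  time   = c(5.2, 3.1, 8.4, 1.7),
  status = c(1, 0, 1, 1)
)

cuts <- 10L

# Features are every column except the time and status variables.
feature_names <- setdiff(names(df), c("time", "status"))
n_features <- length(feature_names)

# Per the quoted docs: input is number of features plus number of cuts,
# and the output layer has length 1.
input_width <- n_features + cuts
output_width <- 1L

input_width
#> [1] 13
```

With 3 features and 10 cuts, a custom network would therefore need an input layer of width 13 and a single output unit.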

Jesse-Islam commented 3 years ago

Yep, using your internal function build_keras_net seems to work! And I figured

Output layer should be of length 1 input is number of features plus number of cuts.

would be relatively simple with keras in R. So something like this should work if I understood the documentation, however...

library(survivalmodels)
library(keras)
#> 
#> Attaching package: 'keras'
#> The following object is masked from 'package:survivalmodels':
#> 
#>     install_keras
# common parameters

cutPick<-10
df<-simsurvdata(50)
customModel<-layer_input(shape = c(3+cutPick), name = 'input')%>% 
  layer_dense(units=200,use_bias = T)%>%
  layer_dense(units=1,use_bias = T)%>%
  layer_activation(activation="sigmoid")

g<-dnnsurv(custom_model=customModel, time_variable = "time", status_variable = "status", data = df,
           early_stopping = TRUE, epochs = 100L, validation_split = 0.3,cuts=cutPick)
#> Error in UseMethod("compile"): no applicable method for 'compile' applied to an object of class "c('tensorflow.tensor', 'tensorflow.python.framework.ops.Tensor', 'tensorflow.python.framework.tensor_like._TensorLike', 'python.builtin.object')"

sessionInfo()
#> R version 4.0.3 (2020-10-10)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 18.04.5 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] keras_2.3.0.0.9000   survivalmodels_0.1.6
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.6        pillar_1.5.0      compiler_4.0.3    highr_0.8        
#>  [5] base64enc_0.1-3   tools_4.0.3       zeallot_0.1.0     digest_0.6.27    
#>  [9] jsonlite_1.7.2    evaluate_0.14     lifecycle_1.0.0   tibble_3.0.6     
#> [13] lattice_0.20-41   pkgconfig_2.0.3   rlang_0.4.10      Matrix_1.3-2     
#> [17] yaml_2.2.1        xfun_0.21         stringr_1.4.0     dplyr_1.0.2      
#> [21] knitr_1.31        generics_0.1.0    vctrs_0.3.6       rappdirs_0.3.3   
#> [25] tidyselect_1.1.0  grid_4.0.3        reticulate_1.18   glue_1.4.2       
#> [29] R6_2.5.0          KMsurv_0.1-5      fansi_0.4.2       geepack_1.3-2    
#> [33] rmarkdown_2.7     tidyr_1.1.0       purrr_0.3.4       magrittr_2.0.1   
#> [37] whisker_0.4       backports_1.2.0   tfruns_1.5.0      htmltools_0.5.1.1
#> [41] ellipsis_0.3.1    MASS_7.3-53       tensorflow_2.2.0  utf8_1.1.4       
#> [45] stringi_1.5.3     broom_0.7.0       crayon_1.4.1      pseudo_1.4.3

Created on 2021-03-13 by the reprex package (v0.3.0)

So a verbose example might be beneficial here!

Just having a few more examples on the documentation page would help: one explicitly using build_keras_net and another designing a keras model from scratch. Or, if it happens to be relatively easy to clear up in words, just adding to the quoted message in this thread may be plenty.

RaphaelS1 commented 3 years ago

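The object you've provided is not a keras model; it is the output tensor of the last layer. See e.g. https://tensorflow.rstudio.com/tutorials/beginners/basic-ml/tutorial_basic_regression/

If you wrap the input and output with keras_model() and pass the result as the model, it works: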

library(survivalmodels)
library(keras)
#> 
#> Attaching package: 'keras'
#> The following object is masked from 'package:survivalmodels':
#> 
#>     install_keras

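cutPick <- 10L
df <- simsurvdata(50)

# Input width: 3 features (sexF, age, trt) plus the number of cuts
input <- layer_input(shape = c(3L + cutPick), name = "input")

# Stack the layers on the input tensor; the result is still a tensor
output <- input %>%
  layer_dense(units = 200L, use_bias = TRUE) %>%
  layer_dense(units = 1L, use_bias = TRUE) %>%
  layer_activation(activation = "sigmoid")

# Wrap the input and output tensors into an actual keras model
model <- keras_model(input, output)

dnnsurv(custom_model = model, time_variable = "time", status_variable = "status", data = df,
        early_stopping = TRUE, epochs = 100L, validation_split = 0.3, cuts = cutPick)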
#> 
#>  DNNSurv Neural Network 
#> 
#> Call:
#>   dnnsurv(data = df, time_variable = "time", status_variable = "status",      cuts = cutPick, custom_model = model, early_stopping = TRUE,      epochs = 100L, validation_split = 0.3)
#> 
#> Response:
#>   Surv(time, status)
#> Features:
#>   {sexF, age, trt}
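
For anyone puzzled by the error message in the first reprex: it is an S3 dispatch failure, meaning a generic function found no method for the class of object it was given. The base-R sketch below reproduces that behaviour with toy stand-ins. The names compile_toy, toy_model and toy_tensor are hypothetical and are not part of keras or survivalmodels; this only illustrates the dispatch mechanism, not how either package is implemented.

```r
# Toy generic: dispatches on the class of its first argument.
compile_toy <- function(object, ...) UseMethod("compile_toy")

# A method exists only for the "toy_model" class.
compile_toy.toy_model <- function(object, ...) {
  object$compiled <- TRUE
  object
}

# A bare "tensor" and a "model" that wraps it (both hypothetical classes).
toy_tensor <- structure(list(), class = "toy_tensor")
toy_model  <- structure(list(input = "input", output = toy_tensor),
                        class = "toy_model")

# Dispatch succeeds for the model...
compiled <- compile_toy(toy_model)

# ...and fails for the bare tensor, because no method matches its class.
err <- tryCatch(compile_toy(toy_tensor),
                error = function(e) conditionMessage(e))
err
```

The captured message has the same form as the one in the thread ("no applicable method for ... applied to an object of class ..."), which is a hint that the object passed in has the wrong class rather than the wrong shape.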

I have a split opinion here and am very interested to hear your thoughts. On the one hand, this package is incredibly lightweight: it just interfaces a few models in R and mainly relies on users already knowing a lot about keras, the ANNs themselves, and reticulate. That is mainly because I use these models myself via mlr3proba, so I never have to think about it too much (i.e. the heavy lifting is done internally). On the other hand, as this is the only implementation of survival neural networks in R (that I'm aware of), there is definitely a very good argument for making the package more user-friendly by updating the docs, adding more examples, etc. I worry that there will never be enough docs to do keras any justice, especially when it comes to building custom models, and this isn't the place to ask questions about how to do that... but again, it's the only place in R for survival networks! So, very split opinions.

Jesse-Islam commented 3 years ago

Ah well that makes me feel silly, thanks for the clarification! I was trying to recreate what seemed to be occurring in build_keras_net but clearly misunderstood what needed to be returned.

In terms of updating the docs, what you've done here seems plenty. Having this example at the end of the DNNsurv documentation is ideal, since it lets the user know exactly what you mean by a keras model. Of course, doing anything more complicated would be up to the user, but from here it becomes clear how you might scale the model.

There's also the "let the user look at the issues on GitHub" approach, where they can now see this example!

RaphaelS1 commented 3 years ago

Ah well that makes me feel silly, thanks for the clarification! I was trying to recreate what seemed to be occurring in build_keras_net but clearly misunderstood what needed to be returned.

Well, definitely don't feel silly, because the docs don't actually say what the required object is (Edit: they actually do)!!! So this example highlighted the problem and also gave me a good link to reference in the docs.

In terms of updating the docs, what you've done here seems plenty. Having this example at the end of the DNNsurv documentation is ideal, since it lets the user know exactly what you mean by a keras model. Of course, doing anything more complicated would be up to the user, but from here it becomes clear how you might scale the model.

Will do!

There's also the "let the user look at the issues on GitHub" approach, where they can now see this example!

Very few people do this.... :(