CCICB / CRUX

Time from enrolment treated as categorical #143

Open selkamand opened 11 months ago

selkamand commented 11 months ago

For some custom clinical annotation datasets, some numeric columns in the clinical metadata are being parsed as categorical. When selecting a column for 'time_to_event', we should pass it through as.numeric in case the type guessing fails.

We should also investigate why the type is guessed incorrectly when data.table::fread guesses it correctly. Something in the type 'fixing' step isn't working properly.

selkamand commented 11 months ago

Plan: run the survival analysis anyway by auto-converting the selected column to numeric, but warn the user with a sweetAlert.
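A minimal sketch of what that auto-conversion could look like, using base R only. The helper name `coerce_time_to_event` and its return structure are hypothetical and are not taken from the CRUX codebase; `message()` stands in for the sweetAlert the app would raise.

```r
# Hypothetical helper (not CRUX's actual implementation): coerce a selected
# time-to-event column to numeric and report how many values failed to parse.
coerce_time_to_event <- function(x) {
  if (is.numeric(x)) {
    return(list(values = x, converted = FALSE, n_failed = 0L))
  }
  # as.numeric() on a factor returns the integer level codes,
  # so convert to character first.
  chr <- trimws(as.character(x))
  num <- suppressWarnings(as.numeric(chr))
  # Count values that were present but could not be parsed as numbers.
  n_failed <- sum(is.na(num) & !is.na(chr) & chr != "")
  list(values = num, converted = TRUE, n_failed = as.integer(n_failed))
}

res <- coerce_time_to_event(c("365", "12", "unknown", NA))

if (res$converted) {
  # In the app, this is where the user-facing alert would be shown.
  message(sprintf(
    "Column was not numeric; converted with %d unparseable value(s).",
    res$n_failed
  ))
}
```

Going through `as.character()` first matters for factor columns, where a direct `as.numeric()` call would silently return level codes rather than the recorded times.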

selkamand commented 11 months ago

This can be reproduced as follows:

maf = system.file("test_data/tcga_laml.subsampled.maf.gz", package = "CRUX")
clin = system.file("test_data/tcga_laml_annot.csv", package = "CRUX")

x = maftools::read.maf(maf = maf, clinicalData = clin)
str(x@clinical.data)

# days_to_last_followup should be numeric but is stored as character

data.table::fread is not the issue: it reads the file correctly, so the problem must be in how maftools processes it. Note that this version uses maftools_2.12.0; we will need to install the dev version of maftools for further testing.
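To pin down which columns change type between the raw read and the stored clinical data, a small comparison helper may be useful. This is only a sketch: `compare_column_types` is a hypothetical name, and the two toy data frames below stand in for the result of `data.table::fread(clin)` and for `x@clinical.data`.

```r
# Hypothetical helper: list shared columns whose class differs between
# a reference table and a candidate table.
compare_column_types <- function(reference, candidate) {
  shared <- intersect(names(reference), names(candidate))
  first_class <- function(tbl) {
    vapply(shared, function(col) class(tbl[[col]])[1], character(1))
  }
  out <- data.frame(
    column = shared,
    reference = unname(first_class(reference)),
    candidate = unname(first_class(candidate)),
    stringsAsFactors = FALSE
  )
  out[out$reference != out$candidate, , drop = FALSE]
}

# Toy stand-ins for the two objects being compared.
as_read <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S2"),
  days_to_last_followup = c(365L, 12L),
  stringsAsFactors = FALSE
)
as_stored <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S2"),
  days_to_last_followup = c("365", "12"),
  stringsAsFactors = FALSE
)

mismatches <- compare_column_types(as_read, as_stored)
print(mismatches)
```

With the real objects, the call would be `compare_column_types(data.table::fread(clin), x@clinical.data)`, which should list every column whose type was lost, not only `days_to_last_followup`.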

selkamand commented 11 months ago

Note: the automatic conversion is now working in #149 (commit 767247b3ea300374e5ed02cf696c13dfd04b4c9f), but we still need to resolve the underlying issue.