CCICB / CRUX

Time from enrolment treated as categorical #143

Open selkamand opened 9 months ago

selkamand commented 9 months ago

For some custom clinical annotation datasets, some numeric columns in the clinical metadata are being parsed as categorical. When selecting a column for 'time_to_event' we should pass it through as.numeric, just in case the type guessing doesn't work well.

We should also investigate why the type is guessed incorrectly when data.table::fread guesses correctly. Something in the type 'fixing' step isn't working properly.

selkamand commented 9 months ago

Plan: we will now try to run the survival analysis anyway by auto-converting the selected column, but will warn the user with a sweetAlert.
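A minimal sketch of what that auto-conversion could look like, in base R. The helper name `coerce_time_to_event` and its return shape are hypothetical and not taken from the CRUX code; the `message()` call only stands in for the sweetAlert the app would show.

```r
# Hypothetical helper (illustrative only, not CRUX's actual implementation).
# Coerces a selected time-to-event column to numeric and reports what happened.
coerce_time_to_event <- function(values) {
  # Already numeric: nothing to do.
  if (is.numeric(values)) {
    return(list(values = values, converted = FALSE, n_failed = 0L))
  }

  # Go through as.character first: as.numeric() on a factor returns level codes.
  as_text <- trimws(as.character(values))
  is_missing <- is.na(as_text) | as_text == ""

  # Non-numeric strings become NA; the coercion warning is suppressed because
  # the failures are counted explicitly below.
  numeric_values <- suppressWarnings(as.numeric(as_text))

  # Entries that were present but could not be parsed as numbers.
  n_failed <- sum(!is_missing & is.na(numeric_values))

  list(values = numeric_values, converted = TRUE, n_failed = n_failed)
}

result <- coerce_time_to_event(c("120", "45", "unknown", NA))
if (result$converted) {
  # In the app this is where the user-facing alert would be triggered.
  message(sprintf(
    "Selected column was not numeric and was converted automatically (%d value(s) could not be parsed).",
    result$n_failed
  ))
}
```

Counting the values that fail to parse separately from values that were already missing lets the warning say whether the conversion was lossless.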

selkamand commented 9 months ago

Can be recreated as follows

# Test files shipped with CRUX
maf = system.file("test_data/tcga_laml.subsampled.maf.gz", package = "CRUX")
clin = system.file("test_data/tcga_laml_annot.csv", package = "CRUX")

# Read the MAF with its clinical annotations, then inspect the column types
x = maftools::read.maf(maf = maf, clinicalData = clin)
str(x@clinical.data)

# days_to_last_followup should be numeric but is a character

data.table::fread is not the issue: it reads the file correctly. It must be how maftools is processing it. Note we're using maftools_2.12.0 in this version. We will need to install the dev version of maftools for further testing.
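One way to spot affected columns while the root cause is investigated is to re-guess the types and compare. The sketch below uses a toy data frame as a stand-in for `x@clinical.data` (the values are made up for illustration) and base R's `utils::type.convert`; it is a diagnostic idea, not what maftools or CRUX does internally.

```r
# Toy stand-in for the clinical data, with a numeric column stored as character.
clinical <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S2", "S3"),
  days_to_last_followup = c("120", "45", "300"),
  stringsAsFactors = FALSE
)

# Re-guess column types; as.is = TRUE keeps non-numeric text as character
# instead of converting it to factor.
reguessed <- utils::type.convert(clinical, as.is = TRUE)

# Columns whose class changed are the ones that were stored with the wrong type.
class_before <- vapply(clinical, function(col) class(col)[1], character(1))
class_after <- vapply(reguessed, function(col) class(col)[1], character(1))
changed <- names(clinical)[class_before != class_after]

print(changed)
```

Any column listed in `changed` holds values that parse cleanly as another type, which makes it a candidate for the mis-typing described above.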

selkamand commented 9 months ago

Note we've got this automatic conversion working in #149 (commit 767247b3ea300374e5ed02cf696c13dfd04b4c9f), but we still need to resolve the underlying issue.