Closed lhjohn closed 4 years ago
Hi Henrik,
Thanks for the info - I debugged the package and found the issue: ATLAS was creating the nfold as an integer, but timeSplitter was expecting a double. I've fixed it in the latest Andromeda version here (we will be moving to this soon): https://github.com/OHDSI/PatientLevelPrediction/tree/sqlLite/R
I also noticed a lot of people being dropped because full time at risk is required. We are submitting a paper soon that shows this can cause bias, so you might want to remove the time-at-risk restriction (it will give you more data as well).
Best wishes, Jenna
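For readers hitting the same symptom, the type mismatch described above can be sketched in plain R. This is only an illustration: `strict_splitter` is a hypothetical stand-in, not the actual timeSplitter code, and the exact check PatientLevelPrediction performs is an assumption here.

```r
# Hypothetical stand-in for a splitter that insists on a double-typed nfold.
# This is NOT the PatientLevelPrediction implementation, only an illustration.
strict_splitter <- function(nfold) {
  if (!is.double(nfold)) {
    stop("nfold must be a double, got: ", class(nfold))
  }
  nfold
}

# An integer nfold, matching the "nfold class: integer" line in the log.
nfold_from_atlas <- 3L
class(nfold_from_atlas)      # "integer"
is.double(nfold_from_atlas)  # FALSE

# Coercing before the call sidesteps the type mismatch.
nfold_fixed <- as.numeric(nfold_from_atlas)
class(nfold_fixed)           # "numeric"
result <- strict_splitter(nfold_fixed)
```

Until the fixed version is in use, coercing nfold to a double in the generated study code may serve as a workaround, assuming the type check is the only thing failing.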
Thanks for looking into this Jenna,
I also changed the cohort to include only patients with an index date before 01-01-2014 instead of 01-01-2015. This should guarantee that almost every patient can have 5 years of TAR and does not get dropped, since the database may not yet have data through 31-12-2019.
Is this a good fix for the problem, or would you suggest removing the TAR restriction for people who do not experience the outcome?
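The date reasoning behind the cohort change can be checked with a few lines of base R. This is a rough sketch: it assumes TAR is counted as 5 × 365 days from the index date, and `db_end` is a made-up observation end date chosen only for illustration, not the real end of the database.

```r
# Assumption: 5 years of TAR counted as 5 * 365 days from the index date.
tar_days <- 5 * 365

# Latest possible index dates under the old and new cohort cut-offs.
latest_index_old <- as.Date("2014-12-31")  # index date before 01-01-2015
latest_index_new <- as.Date("2013-12-31")  # index date before 01-01-2014

# Last day of observation a patient would need to have full TAR.
needed_old <- latest_index_old + tar_days
needed_new <- latest_index_new + tar_days

format(needed_old)  # "2019-12-30"
format(needed_new)  # "2018-12-30"

# Hypothetical end of available data (illustrative value only).
db_end <- as.Date("2019-06-30")

needed_old <= db_end  # FALSE: latest patients would be dropped
needed_new <= db_end  # TRUE: latest patients keep full TAR
```

With the earlier cut-off, full TAR only requires data through the end of 2018, which is why fewer patients should be dropped if the database stops somewhere in 2019.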
Describe the bug
My prediction models do not fit when using prediction packages and selecting "time" split in ATLAS. I could reproduce this on two machines. Could this be my data? Are there some prerequisites for a time split that I forgot about?

Set up (please run in R "sessionInfo()" and copy the output here):
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

To Reproduce
https://epi.jnj.com/atlas/#/prediction/175

PLP Log File
2020-05-28 03:45:06 [Main thread] INFO PatientLevelPrediction Patient-Level Prediction Package version 3.0.16
2020-05-28 03:45:06 [Main thread] INFO PatientLevelPrediction AnalysisID: Analysis_1
2020-05-28 03:45:06 [Main thread] INFO PatientLevelPrediction CohortID: 16316
2020-05-28 03:45:06 [Main thread] INFO PatientLevelPrediction OutcomeID: 7414
2020-05-28 03:45:06 [Main thread] INFO PatientLevelPrediction Cohort size: 2346252
2020-05-28 03:45:06 [Main thread] INFO PatientLevelPrediction Covariates: 10
2020-05-28 03:45:06 [Main thread] INFO PatientLevelPrediction Population size: 1579893
2020-05-28 03:45:06 [Main thread] INFO PatientLevelPrediction Cases: 34700
2020-05-28 03:45:06 [Main thread] DEBUG PatientLevelPrediction testSplit: time
2020-05-28 03:45:06 [Main thread] DEBUG PatientLevelPrediction outcomeCount: 34700
2020-05-28 03:45:06 [Main thread] DEBUG PatientLevelPrediction plpData class: plpData
2020-05-28 03:45:06 [Main thread] DEBUG PatientLevelPrediction testfraction: 0.2
2020-05-28 03:45:06 [Main thread] DEBUG PatientLevelPrediction nfold class: integer
2020-05-28 03:45:06 [Main thread] DEBUG PatientLevelPrediction nfold: 3
2020-05-28 03:45:08 [Main thread] INFO PatientLevelPrediction Patient-Level Prediction Package version 3.0.16
2020-05-28 03:45:08 [Main thread] INFO PatientLevelPrediction AnalysisID: Analysis_6
2020-05-28 03:45:08 [Main thread] INFO PatientLevelPrediction CohortID: 16317
2020-05-28 03:45:08 [Main thread] INFO PatientLevelPrediction OutcomeID: 7414
2020-05-28 03:45:08 [Main thread] INFO PatientLevelPrediction Cohort size: 2346252
2020-05-28 03:45:08 [Main thread] INFO PatientLevelPrediction Covariates: 10
2020-05-28 03:45:08 [Main thread] INFO PatientLevelPrediction Population size: 1579893
2020-05-28 03:45:08 [Main thread] INFO PatientLevelPrediction Cases: 34700
2020-05-28 03:45:08 [Main thread] DEBUG PatientLevelPrediction testSplit: time
2020-05-28 03:45:08 [Main thread] DEBUG PatientLevelPrediction outcomeCount: 34700
2020-05-28 03:45:08 [Main thread] DEBUG PatientLevelPrediction plpData class: plpData
2020-05-28 03:45:08 [Main thread] DEBUG PatientLevelPrediction testfraction: 0.2
2020-05-28 03:45:08 [Main thread] DEBUG PatientLevelPrediction nfold class: integer
2020-05-28 03:45:08 [Main thread] DEBUG PatientLevelPrediction nfold: 3
This continues like this for all analyses...

Additional context
When using simulated data the time splitter seems to work as intended.