ohdsi-studies / DeepLearningComparison

Investigating different deep learning approaches

Finalize cohorts #3

Closed egillax closed 1 year ago

egillax commented 1 year ago

The remaining issue was to check whether COVID is affecting the time-at-risk (TAR). The lung cancer and bipolar transition cohorts have shorter TARs, so the index can be moved back so that the TARs don't extend past the start of COVID. Dementia has a longer TAR, so we need to check whether this is affecting the outcome rate.
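A minimal sketch of the check described above, in base R. The cutoff date, the TAR length and the helper names `latestIndexDate` and `tarCrossesCutoff` are assumptions for illustration only, not values or functions from the study settings.

```r
# Hypothetical cutoff for the start of COVID; not taken from the study protocol.
covidStart <- as.Date("2020-01-01")

# Latest index date for which index + tarDays still falls before the cutoff.
latestIndexDate <- function(tarDays, cutoff = covidStart) {
  cutoff - tarDays - 1
}

# TRUE for subjects whose time-at-risk window reaches or passes the cutoff.
tarCrossesCutoff <- function(cohortStartDate, tarDays, cutoff = covidStart) {
  as.Date(cohortStartDate) + tarDays >= cutoff
}

# Example with an assumed 365-day TAR.
latestIndexDate(365)
tarCrossesCutoff(c("2018-06-01", "2019-06-01"), 365)
```

With a short TAR, few index dates are lost by moving the index back. With a long TAR, the latest admissible index date moves much earlier, which is why the dementia cohort needs a separate look at the outcome rate.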

lhjohn commented 1 year ago

Results for IPCI

Dementia

[Figure: dementia]

There is an artefact around 2016, where suddenly many more people experience the outcome. This can be explained by the use of "latest event" for the cohort entry: the majority of people have their entry event at the end of 2015 and go on to experience the outcome at the beginning of 2016. The ramp-down at the end (year 2020) is somewhat expected, because observation periods are ending for an increasing number of people.

Lung cancer

[Figure: lungcancer]

The gradual increase and eventual decrease in event frequency, forming a parabolic shape, is expected for the same reason as for the dementia cohort.

Script

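library(plyr)      # load plyr before dplyr so it does not mask dplyr verbs
library(dplyr)
library(lubridate)
library(ggplot2)
library(PatientLevelPrediction)

# Placeholders: replace with the actual data directory, population settings and outcome id
plpDataDir <- "data dir"
popSettings <- "pop settings"
outcomeId <- "outcome id"

plpData <- loadPlpData(plpDataDir)

# Keep only subjects who experience the outcome during the time-at-risk
pop <- createStudyPopulation(plpData, outcomeId, popSettings) %>%
  dplyr::filter(outcomeCount == 1)

# Outcome date = cohort entry date + days to event, binned by calendar month
eventDate <- ymd(pop$cohortStartDate) + days(pop$daysToEvent)
eventDate <- floor_date(eventDate, "month")

# plyr::count on a vector returns a data frame with columns x (value) and freq (count)
data <- plyr::count(eventDate)

# Events per month with a loess trend line
ggplot(data, aes(x = x, y = freq)) +
  geom_point() +
  geom_smooth(method = "loess") +
  theme_light() +
  xlab("Year") +
  ylab("No. of events")

lhjohn commented 1 year ago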

Shifted cohorts to end before 2020.