Open shinseojeong opened 7 years ago
Hi @shinseojeong, thank you for the detailed feedback!
You can ignore the warning; it is not important and should be fixed in the latest version.
I hope I have fixed the error, by allowing outcome counts to be zero: https://github.com/OHDSI/StudyProtocols/commit/fc018335e659399be97f58c380861e25defeffb0
Could you reinstall the package and try again? You can of course skip the createCohorts and fetchAllDataFromServer steps again, as you did here.
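To illustrate the idea of allowing outcome counts to be zero, here is a minimal base R sketch. It is not the code from the linked commit; the function name countOutcomes and the example data are hypothetical, and only data$outcomeId and negativeControlIds are names taken from this thread. Counting against the full list of expected IDs keeps the result the same length as negativeControlIds, even when some outcomes never occur.

```r
# Hypothetical example data: outcome 3 never occurs in the data.
negativeControlIds <- c(1, 2, 3)
data <- data.frame(outcomeId = c(1, 1, 2))

# Count occurrences of every expected ID; absent IDs get a count of zero
# because the factor levels are fixed to the expected IDs up front.
countOutcomes <- function(outcomeIds, expectedIds) {
  counts <- table(factor(outcomeIds, levels = expectedIds))
  data.frame(outcomeId = expectedIds,
             outcomeCount = as.integer(counts))
}

outcomeCounts <- countOutcomes(data$outcomeId, negativeControlIds)
print(outcomeCounts)
```

The result has one row per negative control, so downstream code that assumes one entry per ID does not run into a length mismatch.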
Thanks a lot, @schuemie ! I'll try again and let you know the results.
Hi @schuemie, I tried the code you fixed, but it stalled at "50%" for about 4 to 5 days.
So I stopped the session and restarted the code without removing the directories and files created during the first attempt. This time, however, it made no progress beyond "0%" for about 2 days.
Please check the pictures above for me. Thanks a lot.
Hi @shinseojeong,
Can you tell me what database platform you're using? Microsoft SQL Server?
Yes, I'm using Microsoft SQL Server database platform.
I'm not sure why it is so slow. The two steps it is performing (when the progressbar is visible) are
The first time you ran it, step 1 of course didn't take any time, but the second step shouldn't have taken 4-5 days. It sounds like the database server is having problems, for example because it doesn't have enough temp space. Could you discuss this with your database administrator?
OK, I'll discuss this with our database administrator. Thank you for sharing your thoughts on the problem.
Also you can execute the command 'sp_who2' (if you have the correct privileges) and you should see the active sessions connected to the database. The important column to look at is the 'BlkBy' column to see if there seems to be some kind of resource block happening that is holding up the copy. If it's just copying from a temp table into another table, then I can't imagine that taking days. So, as @schuemie suggested: check with your administrator as the query is executing to see if anything looks out of the ordinary.
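If the 'sp_who2' output has been pulled into R as a data frame, the 'BlkBy' check can be sketched as below. This is only an illustration: the helper name findBlockedSessions and the example rows are hypothetical, and it assumes that unblocked sessions show a blank or a lone dot in 'BlkBy', while blocked sessions show the blocking session's ID. How the result set reaches R (and the exact column names your driver returns) is left open.

```r
# Return the sessions that report a blocking session in the BlkBy column.
# Assumption: unblocked rows contain only whitespace or a single dot.
findBlockedSessions <- function(sessions, blockColumn = "BlkBy") {
  blockedBy <- trimws(as.character(sessions[[blockColumn]]))
  isBlocked <- !is.na(blockedBy) & !(blockedBy %in% c("", "."))
  sessions[isBlocked, , drop = FALSE]
}

# Hypothetical sp_who2-style rows: session 52 is waiting on session 51.
sessions <- data.frame(SPID = c(51, 52, 53),
                       Status = c("RUNNABLE", "SUSPENDED", "sleeping"),
                       BlkBy = c("  .", "51", "  ."),
                       stringsAsFactors = FALSE)

blocked <- findBlockedSessions(sessions)
print(blocked)
```

An empty result while the copy step appears stuck would point away from blocking and toward something else, such as the temp space issue mentioned earlier.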
Hi @schuemie, I was able to finish the "injectSignals" step when I reran the code after deleting the outcomes injected in a previous (aborted) run.
So I executed the next step, "generateAllCohortMethodDataObjects", but during "Constructing cohortMethodData objects" I got the error message below.
I checked ".vimplemented" and found that the implemented state of character is 'FALSE'.
I think some code should be changed so that characters are recognized as factors. I referred to the web page 'http://stackoverflow.com/questions/21911721/character-vectors-as-ff-objects-in-r'.
Could you review the situation and fix the code?
Thank you!
I don't think the problem is that characters aren't supported; the real question is why characters are encountered in the first place. (Remember: the code has executed without problems in at least one environment.)
The error message unfortunately isn't very helpful, so I'll have to ask you to debug. Could you first type
debug(constructCohortMethodDataObject)
and then rerun generateAllCohortMethodDataObjects
? That should allow you to step through the code until the error occurs. It would be good to know the exact line where things go wrong.
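Alongside stepping through with debug(), a quick way to see where character data might be entering is to list the character columns of any data frame before it is turned into an ff object. The sketch below uses base R only; the helper names findCharacterColumns and charactersToFactors and the example data are hypothetical and not part of the package. Converting to factors is shown only because ff stores strings as factors; it would hide the symptom rather than explain why characters appear.

```r
# List the names of all character columns in a data frame.
findCharacterColumns <- function(df) {
  isChar <- vapply(df, is.character, logical(1))
  names(df)[isChar]
}

# Convert every character column to a factor, leaving other columns as-is.
charactersToFactors <- function(df) {
  for (name in findCharacterColumns(df)) {
    df[[name]] <- factor(df[[name]])
  }
  df
}

# Hypothetical example: one numeric ID column and one character column.
example <- data.frame(rowId = 1:3,
                      covariateName = c("a", "b", "a"),
                      stringsAsFactors = FALSE)

print(findCharacterColumns(example))
converted <- charactersToFactors(example)
print(findCharacterColumns(converted))
```

If a column that should hold numeric IDs shows up in this list, that would suggest the values were read or returned as text somewhere upstream.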
I tried debugging, and the code stepped through until the error below.
Right, that doesn't really help us...
Next plan: could you run the code below? It will create two new functions that are basically the functions in the package, but then with lots of debugging output. After running that code, you can just call generateAllCohortMethodDataObjectsDebug(workFolder)
to run the function, and copy-paste any output to me.
constructCohortMethodDataObjectDebug <- function(targetId,
                                                 comparatorId,
                                                 targetConceptId,
                                                 comparatorConceptId,
                                                 workFolder) {
  # Subsetting cohorts
  ffbase::load.ffdf(dir = file.path(workFolder, "allCohorts"))
  ff::open.ffdf(cohorts, readonly = TRUE)
  writeLines(paste0("nrow(cohorts) = ", nrow(cohorts)))
  idx <- cohorts$cohortDefinitionId == targetId | cohorts$cohortDefinitionId == comparatorId
  cohorts <- ff::as.ram(cohorts[ffbase::ffwhich(idx, idx == TRUE), ])
  writeLines(paste0("After filtering: nrow(cohorts) = ", nrow(cohorts)))
  cohorts$treatment <- 0
  cohorts$treatment[cohorts$cohortDefinitionId == targetId] <- 1
  cohorts$cohortDefinitionId <- NULL
  treatedPersons <- length(unique(cohorts$subjectId[cohorts$treatment == 1]))
  comparatorPersons <- length(unique(cohorts$subjectId[cohorts$treatment == 0]))
  treatedExposures <- length(cohorts$subjectId[cohorts$treatment == 1])
  comparatorExposures <- length(cohorts$subjectId[cohorts$treatment == 0])
  counts <- data.frame(description = "Starting cohorts",
                       treatedPersons = treatedPersons,
                       comparatorPersons = comparatorPersons,
                       treatedExposures = treatedExposures,
                       comparatorExposures = comparatorExposures)
  metaData <- list(targetId = targetId,
                   comparatorId = comparatorId,
                   attrition = counts)
  attr(cohorts, "metaData") <- metaData

  # Subsetting outcomes
  ffbase::load.ffdf(dir = file.path(workFolder, "allOutcomes"))
  ff::open.ffdf(outcomes, readonly = TRUE)
  writeLines(paste0("nrow(outcomes) = ", nrow(outcomes)))
  idx <- !is.na(ffbase::ffmatch(outcomes$rowId, ff::as.ff(cohorts$rowId)))
  if (ffbase::any.ff(idx)){
    outcomes <- ff::as.ram(outcomes[ffbase::ffwhich(idx, idx == TRUE), ])
  } else {
    outcomes <- as.data.frame(outcomes[1, ])
    outcomes <- outcomes[T == F,]
  }

  # Add injected outcomes
  ffbase::load.ffdf(dir = file.path(workFolder, "injectedOutcomes"))
  ff::open.ffdf(injectedOutcomes, readonly = TRUE)
  writeLines(paste0("nrow(injectedOutcomes) = ", nrow(injectedOutcomes)))
  injectionSummary <- read.csv(file.path(workFolder, "signalInjectionSummary.csv"))
  injectionSummary <- injectionSummary[injectionSummary$exposureId %in% c(targetConceptId, comparatorConceptId), ]
  idx1 <- ffbase::'%in%'(injectedOutcomes$subjectId, cohorts$subjectId)
  idx2 <- ffbase::'%in%'(injectedOutcomes$cohortDefinitionId, injectionSummary$newOutcomeId)
  idx <- idx1 & idx2
  if (ffbase::any.ff(idx)){
    injectedOutcomes <- ff::as.ram(injectedOutcomes[idx, ])
    colnames(injectedOutcomes)[colnames(injectedOutcomes) == "cohortStartDate"] <- "eventDate"
    colnames(injectedOutcomes)[colnames(injectedOutcomes) == "cohortDefinitionId"] <- "outcomeId"
    injectedOutcomes <- merge(cohorts[, c("rowId", "subjectId", "cohortStartDate")], injectedOutcomes[, c("subjectId", "outcomeId", "eventDate")])
    injectedOutcomes$daysToEvent = injectedOutcomes$eventDate - injectedOutcomes$cohortStartDate
    #any(injectedOutcomes$daysToEvent < 0)
    #min(outcomes$daysToEvent[outcomes$outcomeId == 73008])
    outcomes <- rbind(outcomes, injectedOutcomes[, c("rowId", "outcomeId", "daysToEvent")])
  }
  metaData <- data.frame(outcomeIds = unique(outcomes$outcomeId))
  attr(outcomes, "metaData") <- metaData

  # Subsetting covariates
  covariateData <- FeatureExtraction::loadCovariateData(file.path(workFolder, "allCovariates"))
  writeLines(paste0("names(cohorts) = ", names(cohorts)))
  writeLines(paste0("ff::vmode(cohorts$rowId) = ", ff::vmode(cohorts$rowId)))
  idx <- is.na(ffbase::ffmatch(covariateData$covariates$rowId, ff::as.ff(cohorts$rowId)))
  covariates <- covariateData$covariates[ffbase::ffwhich(idx, idx == FALSE), ]

  # Filtering covariates
  filterConcepts <- readRDS(file.path(workFolder, "filterConceps.rds"))
  filterConcepts <- filterConcepts[filterConcepts$exposureId %in% c(targetId, comparatorId),]
  filterConceptIds <- unique(filterConcepts$filterConceptId)
  writeLines(paste0("length(filterConceptIds) = ", length(filterConceptIds)))
  writeLines(paste0("class(filterConceptIds) = ", class(filterConceptIds)))
  idx <- is.na(ffbase::ffmatch(covariateData$covariateRef$conceptId, ff::as.ff(filterConceptIds)))
  covariateRef <- covariateData$covariateRef[ffbase::ffwhich(idx, idx == TRUE), ]
  filterCovariateIds <- covariateData$covariateRef$covariateId[ffbase::ffwhich(idx, idx == FALSE), ]
  idx <- is.na(ffbase::ffmatch(covariates$covariateId, filterCovariateIds))
  covariates <- covariates[ffbase::ffwhich(idx, idx == TRUE), ]
  result <- list(cohorts = cohorts,
                 outcomes = outcomes,
                 covariates = covariates,
                 covariateRef = covariateRef,
                 metaData = covariateData$metaData)
  class(result) <- "cohortMethodData"
  return(result)
}
generateAllCohortMethodDataObjectsDebug <- function(workFolder) {
  writeLines("Constructing cohortMethodData objects")
  start <- Sys.time()
  exposureSummary <- read.csv(file.path(workFolder, "exposureSummaryFilteredBySize.csv"))
  # pb <- txtProgressBar(style = 3)
  for (i in 1:nrow(exposureSummary)) {
    targetId <- exposureSummary$tprimeCohortDefinitionId[i]
    comparatorId <- exposureSummary$cprimeCohortDefinitionId[i]
    targetConceptId <- exposureSummary$tCohortDefinitionId[i]
    comparatorConceptId <- exposureSummary$cCohortDefinitionId[i]
    folderName <- file.path(workFolder, "cmOutput", paste0("CmData_l1_t", targetId, "_c", comparatorId))
    writeLines(paste0("Generating folder ", folderName))
    # if (!file.exists(folderName)) {
    cmData <- constructCohortMethodDataObjectDebug(targetId = targetId,
                                                   comparatorId = comparatorId,
                                                   targetConceptId = targetConceptId,
                                                   comparatorConceptId = comparatorConceptId,
                                                   workFolder = workFolder)
    # CohortMethod::saveCohortMethodData(cmData, folderName)
    # }
    # setTxtProgressBar(pb, i/nrow(exposureSummary))
  }
  # close(pb)
  delta <- Sys.time() - start
  writeLines(paste("Generating all CohortMethodData objects took", signif(delta, 3), attr(delta, "units")))
}
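If the debugging output on its own does not show which call fails, one option is to record the call stack at the moment of the error. The sketch below uses only base R (withCallingHandlers, tryCatch, sys.calls); the wrapper name runWithCallStack and the small inner/outer functions are hypothetical stand-ins, not part of the package.

```r
# Run fun(...) and, if it fails, return the error together with the call
# stack captured at the point where the error was signalled.
runWithCallStack <- function(fun, ...) {
  callStack <- NULL
  result <- tryCatch(
    withCallingHandlers(
      fun(...),
      error = function(e) {
        # Calling handlers run before the stack unwinds, so sys.calls()
        # still includes the frame in which the error was raised.
        callStack <<- sys.calls()
      }
    ),
    error = function(e) e
  )
  list(result = result, callStack = callStack)
}

# Hypothetical stand-ins for a nested call that fails.
inner <- function(x) stop("simulated failure")
outer <- function(x) inner(x)

res <- runWithCallStack(outer, 1)
print(conditionMessage(res$result))
print(res$callStack)
```

With the functions above defined, the same wrapper could be called as runWithCallStack(generateAllCohortMethodDataObjectsDebug, workFolder), and the printed call stack copied into the thread along with the other output.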
Hello, I'm Seojeong, a graduate student at Ajou University in Korea. I ran into an issue while running the LargeScalePopEst code.
I was running the "injectSignals" step on its own to confirm that each step works correctly, and an error message appeared, as shown below. (Note that the "createCohorts" and "fetchAllDataFromServer" steps worked well, although one warning message popped up during the "createCohorts" step.)
↓ The warning message during the "createCohorts" step.
↓ The error message during the "injectSignals" step.
I also found that the lengths of [ unique(data$outcomeId) ] and [ negativeControlIds ] differ.
So I would like to ask whether the code of the LargeScalePopEst protocol can be changed.
↓ The code that creates the outcomes file
I think it would be better if the code allowed for, or skipped, absent "outcomeId" values. Could you modify this code for us?
Please review this issue.
Thank you!