Open charhart opened 8 years ago
Sorry about that! This appears to be caused by weirdness in the Oracle JDBC driver. Clearly the right thing to do is to retrieve dates as timestamps, and it was silly of me to expect dates instead. I had to modify the DatabaseConnector package to fix this problem.
Please update the DatabaseConnector package.
IMPORTANT: You have to manually remove these from your output folder (/home/chilton/ohdsi_keppra/run_19239):
After that, you can rerun (yes, you can use createCohorts = FALSE again).
P.S. Note that the new version of DatabaseConnector now has an oracleDriver argument for createConnectionDetails, where you can specify "thin" or "oci". The default is "thin", so I'm pretty sure this has no consequences for you.
Thanks @schuemie. Removed those files and kicking off the job again. I'll let you know how it works out. It will be several hours before it gets to this point again.....
Looks like it got quite a bit farther, but it failed again. I can send you the full error logs separately, as they are too big for a GitHub comment.
Maximum predicted log likelihood estimated at:
0.596099 (variance)
1.83171 (lambda)
Fitting model at optimal hyperparameter
Using prior: Laplace(1.83171) None
Using 4 thread(s)
Fitting outcome model took 1.46 mins
Fitting outcome model took 0.00321 secs
Fitting outcome model took 0.00319 secs
Fitting outcome model took 0.00305 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.181 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.176 secs
Fitting outcome model took 2.15 secs
Fitting outcome model took 0.00298 secs
Fitting outcome model took 0.003 secs
Fitting outcome model took 0.00327 secs
Fitting outcome model took 0.00304 secs
Fitting outcome model took 0.003 secs
Fitting outcome model took 0.0031 secs
Fitting outcome model took 0.00301 secs
Fitting outcome model took 0.00325 secs
Fitting outcome model took 0.00314 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.175 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.19 secs
Fitting outcome model took 3.63 secs
Fitting outcome model took 0.00297 secs
Fitting outcome model took 0.00325 secs
Fitting outcome model took 0.00318 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.162 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.18 secs
Fitting outcome model took 2.61 secs
Fitting outcome model took 0.00298 secs
Fitting outcome model took 0.0031 secs
Fitting outcome model took 0.00321 secs
Fitting outcome model took 0.00318 secs
Fitting outcome model took 0.00284 secs
Fitting outcome model took 0.0122 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.179 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.186 secs
Fitting outcome model took 3.04 secs
Fitting outcome model took 0.00316 secs
Fitting outcome model took 0.0032 secs
Fitting outcome model took 0.00324 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.178 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.189 secs
Fitting outcome model took 3.11 secs
Fitting outcome model took 0.00324 secs
Fitting outcome model took 0.00335 secs
Fitting outcome model took 0.00328 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.182 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.178 secs
Fitting outcome model took 3.47 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.182 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.181 secs
Fitting outcome model took 3.16 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.286 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.295 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.287 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.274 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.283 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.253 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.282 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.294 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.28 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.29 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.279 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.896 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.38 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.332 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.286 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.291 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.338 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.276 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.281 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.287 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.287 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.575 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.275 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.303 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.268 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.284 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.273 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.275 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.28 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.287 secs
Fitting outcome model took 0.00461 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.281 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.279 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.282 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.304 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.304 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.313 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.362 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.397 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.361 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.446 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.44 secs
Using prior: None
Using 4 thread(s)
Fitting outcome model took 0.405 secs
Error in ff(initdata = initdata, length = length, levels = levels, ordered = ordered, :
write error
Calls: execute ... ffbaseffdfindexget -> ffindexget -> clone -> clone.ff -> assign -> ff
In addition: Warning message:
replacing previous import ‘Rcpp::LdFlags’ by ‘RcppParallel::LdFlags’ when loading ‘Cyclops’
Execution halted
Well, good news is it is almost at the end!
Just checking: is there any disk space left? Can you check both the output folder and this location:
options("fftempdir")
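A minimal sketch of how the temp directory could be probed from within R, assuming the problem is a missing or non-writable folder. The helper name checkTempDir is hypothetical and not part of ff or any OHDSI package; it falls back to tempdir() when the fftempdir option is not set (for example, when ff has not been loaded yet).

```r
# Hypothetical helper: reports whether a directory exists and accepts writes.
checkTempDir <- function(dir = getOption("fftempdir", tempdir())) {
  probe <- tempfile(tmpdir = dir)  # only builds a file name, creates nothing
  writable <- tryCatch({
    writeLines("probe", probe)     # try to actually write a small file
    TRUE
  }, error = function(e) FALSE, warning = function(w) FALSE)
  if (file.exists(probe)) unlink(probe)  # clean up the probe file
  list(dir = dir, exists = dir.exists(dir), writable = writable)
}

# Check the folder ff would use in this session
print(checkTempDir())
```

This only tests a small write; it does not rule out a full disk partway through a large write or a limit on open file handles.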
I did a 'df' and it looks like there is still space. 'fftempdir' appears to just be a directory under /tmp, and it's writable and appears to have space.
Not sure why you're getting a 'write error'. Could you close and re-open R, and start again?
(Don't worry, it will jump straight to where it stopped because it will see all the files that have already been generated)
Alright - it's off and running again.
Alright, it seemed to work that time. Some warnings, but I think we actually finished!
I didn't count the total hours, but I think Regenstrief probably takes the record. :)
Maximum predicted log likelihood estimated at:
0.046145 (variance)
6.58344 (lambda)
Fitting model at optimal hyperparameter
Using prior: Laplace(6.58344) None
Using 4 thread(s)
Fitting outcome model took 6.25 mins
Packaging results in export folder for sharing
Connecting using Oracle driver
- using THIN to connect
Computing covariate balance took 17.5 secs
Computing covariate balance took 19.6 secs
Creating archive /home/chilton/ohdsi_keppra/run_19239/export/StudyResults.zip
- adding /home/chilton/ohdsi_keppra/run_19239/export/MetaData.txt
- adding /home/chilton/ohdsi_keppra/run_19239/export/PsModel.csv
- adding /home/chilton/ohdsi_keppra/run_19239/export/Balance1On1Matching.csv
- adding /home/chilton/ohdsi_keppra/run_19239/export/BalanceVarRatioMatching.csv
- adding /home/chilton/ohdsi_keppra/run_19239/export/PsAfterVarRatioMatching.png
- adding /home/chilton/ohdsi_keppra/run_19239/export/MainResults.csv
- adding /home/chilton/ohdsi_keppra/run_19239/export/PsPrefScale.png
- adding /home/chilton/ohdsi_keppra/run_19239/export/Attrition1On1Matching.csv
- adding /home/chilton/ohdsi_keppra/run_19239/export/KaplanMeierPerProtocol.png
- adding /home/chilton/ohdsi_keppra/run_19239/export/AttritionVarRatioMatching.csv
- adding /home/chilton/ohdsi_keppra/run_19239/export/Ps.png
- adding /home/chilton/ohdsi_keppra/run_19239/export/PsAfter1On1Matching.png
- adding /home/chilton/ohdsi_keppra/run_19239/export/KaplanMeierIntentToTreat.png
- adding /home/chilton/ohdsi_keppra/run_19239/export/PsAfterVarRatioMatchingPrefScale.png
- adding /home/chilton/ohdsi_keppra/run_19239/export/PsAfter1On1MatchingPrefScale.png
Study results are ready for sharing at: /home/chilton/ohdsi_keppra/run_19239/export/StudyResults.zip
Warning messages:
1: replacing previous import ‘Rcpp::LdFlags’ by ‘RcppParallel::LdFlags’ when loading ‘Cyclops’
2: In CohortMethod::plotKaplanMeier(strata, includeZero = FALSE, fileName = file.path(exportFolder, :
The population has strata, but the stratification is not visible in the plot
3: In CohortMethod::plotKaplanMeier(strata, includeZero = FALSE, fileName = file.path(exportFolder, :
The population has strata, but the stratification is not visible in the plot
Thanks for your help!
Thanks for your patience!
My hypothesis is that the last problem was caused by too many file handles being open. I'll explicitly close all temp files in a new version (I thought R was taking care of that).
Hi, I seem to be having the same issue, but I am running the study on SQL Server. I get the following output when I try to run the study. I am using R version 3.3.0 on Windows through RStudio. Any thoughts?
Creating exposure and outcome cohorts
Connecting using SQL Server driver
- Creating treatment cohort
|================================================================================================================| 100%
Analysis took 7.08 secs
- Creating comparator cohort
|================================================================================================================| 100%
Analysis took 5.21 secs
- Creating angioedema cohort
|================================================================================================================| 100%
Analysis took 0.877 secs
- Creating negative control outcome cohort
|================================================================================================================| 100%
Analysis took 0.462 secs
Cohort counts:
cohortDefinitionId outcomeName count
...
...
...
...
Running analyses
*** Creating cohortMethodData objects ***
Loading required package: CohortMethod
Loading required package: Cyclops
Loading required package: FeatureExtraction
Connecting using SQL Server driver
Constructing treatment and comparator cohorts
|================================================================================================================| 100%
Analysis took 0.47 secs
Fetching cohorts from server
Fetching cohorts took 0.158 secs
Constructing default covariates
|================================================================================================================| 100%
Analysis took 49.5 secs
Done
Fetching data from server
Fetching data took 5.16 secs
Removing redundant covariates
Removing redundant covariates took 5.74 secs
Normalizing covariates
Fetching outcomes from server
Fetching outcomes took 0.266 secs
*** Creating study populations ***
|================================================================================================================| 100%
*** Fitting shared propensity score models ***
Removing subject that are in both cohorts (if any)
No outcome specified so skipping removing people with prior outcomes
Removing subjects with less than 1 day(s) at risk (if any)
No outcome specified so not creating outcome and time variables
Error in ff(initdata = initdata, length = length, levels = levels, ordered = ordered, :
vmode 'character' not implemented
Can you tell me whether any cohorts were created (based on the cohort counts you removed from your output above)?
Also, could you try this:
library(CohortMethod)
cmData <- loadCohortMethodData(file.path(outputFolder, "cmOutput", "CmData_l1_t1_c2"))
str(cmData$cohorts$cohortStartDate)
(where outputFolder is the output folder used for the execute function) and confirm that the data type is Date?
Yes, the cohorts were created. Sorry, I should have mentioned that I took that out of the output.
I tried the snippet above and unfortunately the data type is chr.
Could you share one or two of the values? I'm trying to figure out why R doesn't consider them of type Date.
Yes, the values look like this:
"2005-05-18" "2014-08-18" "2012-04-22"
Hmmm, those look suspiciously like dates ;-)
Could you try this
x <- as.Date(cmData$cohorts$cohortStartDate)
to see if there are values that are not valid dates?
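One way to list the offending values, sketched here on made-up example data rather than the actual cohort table: with an explicit format, as.Date returns NA for strings it cannot parse instead of stopping with an error, so the unparsable entries can be picked out directly. The vector values below is an assumption standing in for cmData$cohorts$cohortStartDate.

```r
# Example data only; replace with the real character column to check it
values <- c("2005-05-18", "2014-08-18", "not-a-date")

# An explicit format makes unparsable strings come back as NA
parsed <- as.Date(values, format = "%Y-%m-%d")

# Values that were present but could not be parsed as dates
bad <- values[is.na(parsed) & !is.na(values)]
print(bad)
```

An empty result would mean every value parses as a date, which points at the column type rather than the data.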
Yes, I'm not sure what's going on.
When I run the above code, I do get dates. The output looks like the following:
> str(x)
Date[1:5182], format: "2005-05-18" "2014-08-18" "2012-04-22" ...
We are also having the same issue:
Error in ff(initdata = initdata, length = length, levels = levels, ordered = ordered, :
write error
Calls: execute ... do.call -> as.ff -> as.ff.default -> clone.ff -> assign -> ff
Execution halted
We are running all the latest versions of the OHDSI stack and study package as of 5 days ago, in a PostgreSQL and CentOS environment.
@jmbanda, that doesn't quite look like the same error:
Error in ff(initdata = initdata, length = length, levels = levels, ordered = ordered, :
vmode 'character' not implemented
compared to
Error in ff(initdata = initdata, length = length, levels = levels, ordered = ordered, :
write error
I'm not very familiar with the ff package; however, write error sounds like it could be a write-permission problem in the location where the data is being stored?
@chrisknoll, oh, good catch. My apologies for spamming this thread. Indeed, it was a different error. I solved it by providing an explicit path for the fftempdir:
options(fftempdir = "/home/jmbanda/OHDSI/temp")
No more issues after that. Thanks for the response.
@aperotte, I'm still trying to figure out why the type is chr and not Date. Could you try this:
conn <- connect(connectionDetails)
x <- querySql(conn, "SELECT TOP 10 * FROM observation_period")
str(x$OBSERVATION_PERIOD_START_DATE)
(where connectionDetails is the object you also used to call the execute function). Please let me know if the result type is Date.
Oh, almost forgot. You probably need to fully qualify your database and schema:
x <- querySql(conn, "SELECT TOP 10 * FROM cdm_data.dbo.observation_period")
where cdm_data.dbo is the database schema holding your CDM.
@schuemie, it seems like the type returned is chr from this as well.
The output looks like this:
chr [1:10] "1999-12-18" "2010-04-18" "2000-09-01" ...
Sorry to make you jump through all these hoops, but could you try this code:
query <- "SELECT TOP 10 observation_period_start_date FROM observation_period"
connection <- connect(connectionDetails)
type_forward_only <- rJava::.jfield("java/sql/ResultSet", "I", "TYPE_FORWARD_ONLY")
concur_read_only <- rJava::.jfield("java/sql/ResultSet", "I", "CONCUR_READ_ONLY")
s <- rJava::.jcall(connection@jc, "Ljava/sql/Statement;", "createStatement", type_forward_only, concur_read_only)
r <- rJava::.jcall(s, "Ljava/sql/ResultSet;", "executeQuery", as.character(query)[1])
md <- rJava::.jcall(r, "Ljava/sql/ResultSetMetaData;", "getMetaData", check = FALSE)
resultSet <- new("JDBCResult", jr = r, md = md, stat = s, pull = rJava::.jnull())
rJava::.jcall(resultSet@md, "I", "getColumnType", as.integer(1))
# Please let me know which number you get here
RJDBC::dbClearResult(resultSet)
rJava::.jfield("java/sql/Types", "I", "DATE")
# Please also let me know this number
querySql(connection, "SELECT DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'observation_period' AND COLUMN_NAME = 'observation_period_start_date'")
# Please let me know which data type you get here
dbDisconnect(connection)
This should tell me what data type is on the server, and what type the JDBC driver says it is getting.
No problem! Thanks for all the help.
For rJava::.jcall(resultSet@md, "I", "getColumnType", as.integer(1)), I get 12.
For rJava::.jfield("java/sql/Types", "I", "DATE"), I get 91.
And for querySql(connection, "SELECT DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'observation_period' AND COLUMN_NAME = 'observation_period_start_date'"), I get date.
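For readers following along, the numbers are the integer constants defined in java.sql.Types. A small lookup, sketched below, makes them readable; the table jdbcTypes and the helper describeJdbcType are hypothetical names for illustration and cover only a handful of the constants.

```r
# A few constants from java.sql.Types (not the full list)
jdbcTypes <- c("1" = "CHAR", "4" = "INTEGER", "12" = "VARCHAR",
               "91" = "DATE", "93" = "TIMESTAMP")

# Hypothetical helper: translate a type code into its constant name
describeJdbcType <- function(code) {
  name <- jdbcTypes[as.character(code)]
  unname(ifelse(is.na(name), "UNKNOWN", name))
}

print(describeJdbcType(c(12, 91)))
```

Read this way, the driver reports the column as VARCHAR (12) rather than DATE (91), even though the server's metadata says the column is a date.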
Ok, I think I have figured out this problem: Just like you I'm using SQL Server, but I'm using integrated security. I'm assuming you're specifying your user name and password, and under the hood that means DatabaseConnector is using a different driver. This driver (JTDS) appears to have this known problem with dates.
I've modified DatabaseConnector to use the same driver with or without integrated security. The only case where this does not work is when your user account is in a different Windows domain than your server, and you have to specify the user domain when connecting to the server. If this is the case, the only current option is to use integrated security. If you don't have to specify the domain to connect, here's how it should now work:
Please update to the latest version of DatabaseConnector. Make sure you close all instances of R, open only one R instance, and run
library(devtools)
install_github("ohdsi/DatabaseConnector")
Next, make sure your study output folder is completely empty.
Then, rerun the study.
Great! That worked for us.
We did have one more hoop to jump through, however. Because I am using a machine that is on a different domain than the database, I had to use the Windows runas command to launch RStudio under a different domain/user. Then the SQL Server driver was used (not jTDS) and everything went through smoothly. The command looks like this:
runas /netonly /user:domain\username "C:\path\to\rstudio\bin\rstudio.exe"
with domain, username, and path\to\rstudio replaced with appropriate values.
Thanks! I've added your solution to the DatabaseConnector manual.
Great! Also, thanks to @mark-velez for the runas solution.
I'm still running keppra on Oracle, and ran into another bug.
Here's the error:
Full log:
Also, if I rerun it (after it's fixed), can I start from createCohorts = FALSE, or do I need to start over?
Thanks.