Open mvankessel-EMC opened 1 month ago
Here are some toy examples with various logRr and seLogRr inputs:
logRr <- c(1, 2, 3)
seLogRr <- c(4, 5, 6)
EmpiricalCalibration::fitMcmcNull(logRr, seLogRr)
#> Estimated null distribution (using MCMC)
#>
#> Estimate lower .95 upper .95
#> Mean 1.491929 1.491929 1.4919
#> Precision 1.475598 0.017284 24.2599
#>
#> Acceptance rate: 0.859931400685993
logRr <- c(1, 2, NULL)
seLogRr <- c(4, 5, NULL)
EmpiricalCalibration::fitMcmcNull(logRr, seLogRr)
#> Estimated null distribution (using MCMC)
#>
#> Estimate lower .95 upper .95
#> Mean 0.942659 0.942659 0.9427
#> Precision 2.661310 0.011013 35.4243
#>
#> Acceptance rate: 0.91490085099149
logRr <- c(1, 2, NA)
seLogRr <- c(4, 5, NA)
EmpiricalCalibration::fitMcmcNull(logRr, seLogRr)
#> Warning in EmpiricalCalibration::fitMcmcNull(logRr, seLogRr): Estimate(s) with
#> NA standard error detected. Removing before fitting null distribution
#> Estimated null distribution (using MCMC)
#>
#> Estimate lower .95 upper .95
#> Mean 0.9426589 0.9426589 0.9427
#> Precision 0.6259526 0.0067681 11.5137
#>
#> Acceptance rate: 0.871571284287157
logRr <- c(NULL, NULL, NULL)
seLogRr <- c(NULL, NULL, NULL)
EmpiricalCalibration::fitMcmcNull(logRr, seLogRr)
#> Error in eval(expr, envir, enclos): Not compatible with requested type: [type=NULL; target=double].
logRr <- c(NA, NA, NA)
seLogRr <- c(NA, NA, NA)
EmpiricalCalibration::fitMcmcNull(logRr, seLogRr)
#> Warning in EmpiricalCalibration::fitMcmcNull(logRr, seLogRr): Estimate(s) with
#> NA standard error detected. Removing before fitting null distribution
#> Error in optim(c(0, 100), logLikelihoodNullMcmc, logRr = logRr, seLogRr = seLogRr): function cannot be evaluated at initial parameters
Created on 2024-08-21 with reprex v2.1.1
Hmmm, I guess if there are no valid estimates at all, the fitted distribution should have NA parameters, and calibration should return NA. I'll add this as a special case and make sure it gets handled correctly.
Where do you see NULL estimates?
This should now be handled in this commit: https://github.com/OHDSI/EmpiricalCalibration/commit/37e237cd1385321647ba1e5e588e9f6c581b4235
The real question is: why did you have NA estimates for all your negative controls?
So sometimes the logRr and seLogRr would be vectors of numeric values, but sometimes they would be vectors of NA. Could this be due to no records existing for all the negative controls? Would you typically handle this by explicitly removing negative controls that have 0 counts?
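To make the question concrete, here is a minimal sketch of the kind of pre-filtering I mean, using base R only. The helper name dropInvalidEstimates is hypothetical and not part of EmpiricalCalibration, and treating seLogRr values of 0 as invalid is my own assumption rather than documented package behaviour.

```r
# Hypothetical helper (not part of EmpiricalCalibration): keep only the
# negative control estimates that have a finite logRr and a finite,
# strictly positive seLogRr. Dropping seLogRr == 0 is an assumption.
dropInvalidEstimates <- function(logRr, seLogRr) {
  stopifnot(length(logRr) == length(seLogRr))
  logRr <- as.numeric(logRr)
  seLogRr <- as.numeric(seLogRr)
  keep <- is.finite(logRr) & is.finite(seLogRr) & seLogRr > 0
  list(
    logRr = logRr[keep],
    seLogRr = seLogRr[keep],
    nRemoved = sum(!keep)
  )
}

filtered <- dropInvalidEstimates(
  logRr = c(1, 2, NA, -18),
  seLogRr = c(4, 5, NA, 0)
)
```

If filtered$logRr ends up with length 0, there is nothing left to fit, which is the case discussed above.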
So I tried the above. There are values in both logRr and seLogRr now. It crashes when EmpiricalCalibration:::logLikelihoodNullMcmc() returns Inf. See the reprex:
logRr <- c(
-0.040514268, 0.481692514, 0.036482466, -0.095145701, 0.314433784,
-1.497334478, 0.183195428, 0.289136733, 0.402694353, 0.367573632,
0.463256874, -0.054422481, -0.060860638, 0.013167817, -18.000212194,
1.052980223, -0.008283057, -0.031046398, -1.321463935, 0.559708684,
0.768082068, 0.116301485, 0.164942591, 0.746996019
)
seLogRr <- c(
0.1483240473, 0.1497285553, 0.1020657210, 0.1609180653, 0.0937424754,
0.7320934116, 0.1442173789, 0.0959738294, 0.1462481972, 0.3990881643,
0.3010253775, 0.4164489931, 0.2158912359, 0.0003913046, 0.0000000000,
0.6348588999, 0.5327961873, 0.3360717777, 1.1168928052, 0.2971940110,
0.7763132355, 0.6266069969, 0.2506407727, 1.3517969687
)
optim(c(0, 100), EmpiricalCalibration:::logLikelihoodNullMcmc, logRr = logRr, seLogRr = seLogRr)
#> Error in optim(c(0, 100), EmpiricalCalibration:::logLikelihoodNullMcmc, : function cannot be evaluated at initial parameters
EmpiricalCalibration:::logLikelihoodNullMcmc(theta = c(0, 100), logRr = logRr, seLogRr = seLogRr)
#> [1] Inf
If I remove the value 0 from seLogRr and the corresponding value from logRr, it no longer seems to return Inf.
# Removing the entry where seLogRr is 0 (index 15)
logRrNoZero <- logRr[c(1:14, 16:length(logRr))]
seLogRrNoZero <- seLogRr[c(1:14, 16:length(seLogRr))]
EmpiricalCalibration:::logLikelihoodNullMcmc(theta = c(0, 100), logRr = logRrNoZero, seLogRr = seLogRrNoZero)
#> [1] 28.44091
optim(c(0, 100), EmpiricalCalibration:::logLikelihoodNullMcmc, logRr = logRrNoZero, seLogRr = seLogRrNoZero)
#> $par
#> [1] 0.1692758 34.8485282
#>
#> $value
#> [1] 21.13314
#>
#> $counts
#> function gradient
#> 79 NA
#>
#> $convergence
#> [1] 0
#>
#> $message
#> NULL
Finally, I checked the delta between each logRr and seLogRr value, and the 15th value has a delta of ~18.
abs(logRr - seLogRr)
#> [1] 0.188838315 0.331963959 0.065583255 0.256063766 0.220691309 2.229427890 0.038978049
#> [8] 0.193162904 0.256446156 0.031514532 0.162231496 0.470871474 0.276751874 0.012776512
#> [15] 18.000212194 0.418121323 0.541079244 0.367118176 2.438356740 0.262514673 0.008231167
#> [22] 0.510305512 0.085698182 0.604800950
Created on 2024-08-26 with reprex v2.1.1
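One possible explanation for the Inf, sketched below with base R only. This assumes the likelihood treats each logRr as normally distributed with standard deviation sqrt(1 / precision + seLogRr^2); the function negLogLik is my own stand-in and is not copied from the package internals. Under that assumption, the 15th estimate sits about 180 standard deviations from the mean at theta = c(0, 100), so the density underflows to 0 and its log becomes -Inf.

```r
# Stand-in for the null likelihood (an assumption, not the package's code):
# logRr[i] ~ Normal(mean = theta[1], sd = sqrt(1 / theta[2] + seLogRr[i]^2))
negLogLik <- function(theta, logRr, seLogRr, useLogDensity = FALSE) {
  sd <- sqrt(1 / theta[2] + seLogRr^2)
  if (useLogDensity) {
    # Compute the log density directly, which avoids the underflow
    -sum(dnorm(logRr, mean = theta[1], sd = sd, log = TRUE))
  } else {
    # Take the log of a density that may already have underflowed to 0
    -sum(log(dnorm(logRr, mean = theta[1], sd = sd)))
  }
}

theta <- c(0, 100)
naive <- negLogLik(theta, logRr = -18.000212194, seLogRr = 0)
stable <- negLogLik(theta, logRr = -18.000212194, seLogRr = 0, useLogDensity = TRUE)
```

Here naive is Inf while stable is large but finite. If that assumption holds, the relevant quantity would be the standardized distance from the mean rather than abs(logRr - seLogRr), and the zero standard error matters because it leaves the total standard deviation very small at the initial parameters.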
I can reproduce it in a toy example as well:
logRr <- c(rnorm(9), -19)
seLogRr <- c(rnorm(9), 0)
theta <- c(0, 100)
optim(c(0, 100), EmpiricalCalibration:::logLikelihoodNullMcmc, logRr = logRr, seLogRr = seLogRr)
#> Error in optim(c(0, 100), EmpiricalCalibration:::logLikelihoodNullMcmc, : function cannot be evaluated at initial parameters
EmpiricalCalibration:::logLikelihoodNullMcmc(theta = c(0, 100), logRr = logRr, seLogRr = seLogRr)
#> [1] Inf
Created on 2024-08-26 with reprex v2.1.1
I'm out of my depth as to what the implications are or how to solve it, but hopefully this is helpful.
This should now be handled in this commit: 37e237c
The real question is: why did you have NA estimates for all your negative controls?
Regarding this build, I run into a different error:
#> Error in quantile.default(dist, c(0.5, alpha/2, 1 - (alpha/2))) :
#> missing values and NaN's not allowed if 'na.rm' is FALSE
#> In addition: Warning messages:
#> 1: In EmpiricalCalibration::fitMcmcNull(logRr = ncs$logRr, seLogRr = ncs$seLogRr) :
#> Estimate(s) with NA standard error detected. Removing before fitting null distribution
#> 2: In EmpiricalCalibration::fitMcmcNull(logRr = ncs$logRr, seLogRr = ncs$seLogRr) :
#> No valid estimates left. Returning undefined null distribution
I'm assuming it is this quantile() call, which should also have na.rm = TRUE.
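A small base R sketch of what I mean. The dist vector here is made up to stand in for what I assume an undefined null distribution produces (all NA); it is not taken from the package.

```r
# Made-up stand-in for the values passed to quantile() when the null
# distribution is undefined (assumed to be all NA)
dist <- rep(NA_real_, 10)
alpha <- 0.05
probs <- c(0.5, alpha / 2, 1 - (alpha / 2))

# Without na.rm, quantile() stops with an error on missing values
failed <- inherits(try(quantile(dist, probs), silent = TRUE), "try-error")

# With na.rm = TRUE, the NAs are dropped and the quantiles of the
# resulting empty vector come back as NA instead of raising an error
withNaRm <- quantile(dist, probs, na.rm = TRUE)
```

With na.rm = TRUE the call returns three NA values, which seems in line with returning NA when there are no valid estimates.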
While running an SCCS analysis I run into the following error during the calibration of the estimates:
What seems to be the case is that the supplied logRr and seLogRr are both vectors containing NA's, so they all get removed here. This results in a vector of numeric(0) (length = 0) for both logRr and seLogRr.
After removal, should there be a check that asserts the length of both and, if they are of length 0, sets the value to 0, or to a vector of 0's of the original length? I'm not sure what 'standard' value you'd use in this case. I.e.: