Closed nhejazi closed 7 years ago
update -- as of 5428884, we have the following errors:
using the above example, when method = "hazard"
:
R > fit <- survtmle(ftime = ftime, ftype = ftype,
... trt = A, adjustVars = W,
... glm.trt = "1",
... glm.ftime = "I(W1*W2) + trt + t",
... glm.ctime = "W1 + t", method = "hazard",
... t0 = t_0)
'speedglm' ran into an error...reverting to use of 'glm'.
Warning message:
In hazard_tmle(ftime = clean$ftime, ftype = clean$ftype, trt = clean$trt, :
TMLE fluctuations did not converge. Check that meanIC is adequately
small and proceed with caution.
...and when method = "mean"
:
R > fit <- survtmle(ftime = ftime, ftype = ftype,
... trt = A, adjustVars = W,
... glm.trt = "1",
... glm.ftime = "I(W1*W2) + trt + t",
... glm.ctime = "W1 + t", method = "mean",
... t0 = t_0)
'speedglm' ran into an error...reverting to use of 'glm'.
Error in model.frame.default(formula = stats::as.formula(reg_form), data = data, :
invalid type (closure) for variable 't'
This error arises from mis-treatment of the t
argument to estimateIteratedMean
The latter error is fixed by removing "t" from glm.ftime (iterated mean-based TMLE can't smooth over time).
The first one still runs, correct? Just throws a warning about lack of convergence in targeting step? What is fit$meanIC?
Ok, so I was just using the method incorrectly in the method = "mean"
case. Removing "t" from the formula corrects the breaking error, but speedglm
still fails within the tryCatch
in fast_glm
.
for the method = "hazard"
case
R > fit$meanIC
D.j1.z0 D.j1.z1
0.21780467 0.06539523
I'm not sure what to make of this? I think there might still be an underlying problem, since speedglm
fails under repeated simulations...
What happens if you increase sample size?
Tried it with up to n = 1000 (actually the output above is from that case). Error appears to be consistent.
And it does not error on the same example from a previous build that calls glm
directly (not through the new wrapper)?
Just tried it out, apparently this example also causes the same warning on the current stable version on master
(which calls glm
without the wrapper):
R > fit <- survtmle(ftime = ftime, ftype = ftype,
... trt = A, adjustVars = W,
... glm.trt = "1",
... glm.ftime = "I(W1*W2) + trt + t",
... glm.ctime = "W1 + t", method = "hazard",
... t0 = t_0)
Warning message:
In hazard_tmle(ftime = clean$ftime, ftype = clean$ftype, trt = clean$trt, :
TMLE fluctuations did not converge. Check that meanIC is adequately
small and proceed with caution.
R > fit
$est
[,1]
0 1 0.21806753
1 1 0.07448559
$var
0 1 1 1
0 1 0.0034125286 0.0001608466
1 1 0.0001608466 0.0011460925
The example producing this differs slightly from the one used in the results above. Having tried it, the exact code that appears directly above produces the same exact output whether using the master
or speedglm
builds of the package.
...and here's the version for method = "mean"
(using the speedglm
build):
R > fit <- survtmle(ftime = ftime, ftype = ftype,
... trt = A, adjustVars = W,
... glm.trt = "1",
... glm.ftime = "I(W1*W2) + trt",
... glm.ctime = "W1 + t", method = "mean",
... t0 = t_0)
R > fit
$est
[,1]
0 1 0.17298737
1 1 0.05878642
$var
0 1 1 1
0 1 3.000291e-03 7.381232e-06
1 1 7.381232e-06 1.082092e-03
...and using GitHub master
:
* DONE (survtmle)
Reloading installed survtmle
survtmle: Targeted Learning for Survival Analysis
Version: 1.0.0
R > library(survtmle)
R > fit <- survtmle(ftime = ftime, ftype = ftype,
... trt = A, adjustVars = W,
... glm.trt = "1",
... glm.ftime = "I(W1*W2) + trt",
... glm.ctime = "W1 + t", method = "mean",
... t0 = t_0)
R > fit
$est
[,1]
0 1 0.17298737
1 1 0.05878642
$var
0 1 1 1
0 1 3.000291e-03 7.381232e-06
1 1 7.381232e-06 1.082092e-03
Note that the error is now gone and the wrapper code for speedglm
appears to "just run." No idea wtf is happening...
...maybe it's the data?
R > dat <- as_tibble(cbind(ftime,ftype,A,W))
R > dat
# A tibble: 100 x 5
ftime ftype A W1 W2
<dbl> <dbl> <int> <dbl> <int>
1 173 1 1 0.26895781 0
2 40 1 0 0.02268431 1
3 38 1 0 0.36009201 0
4 44 0 1 0.64454120 0
5 15 0 0 0.44678829 0
6 17 0 1 0.53298659 1
7 148 1 1 0.64690471 1
8 7 1 1 0.21324807 0
9 26 1 1 0.78985955 1
10 70 0 1 0.30236718 0
# ... with 90 more rows
R > skimr::skim(dat)
Numeric Variables
# A tibble: 5 x 13
var type missing complete n mean sd min
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 A integer 0 100 100 0.5200000 0.5021167 0.000000000
2 ftime numeric 0 100 100 66.6200000 78.9049780 1.000000000
3 ftype numeric 0 100 100 0.7500000 0.4351941 0.000000000
4 W1 numeric 0 100 100 0.4509021 0.2788190 0.002459463
5 W2 integer 0 100 100 0.4700000 0.5016136 0.000000000
# ... with 5 more variables: `25% quantile` <dbl>, median <dbl>, `75%
# quantile` <dbl>, max <dbl>, hist <chr>
I'm looking into it.
Yeah, something is broken, but I can't tell what yet.
Did these warnings crop up when we compiled the vignette for CRAN submission? Just trying to understand when they were introduced so I know where to start looking. I don't recall such warnings popping up immediately before CRAN submission.
These definitely didn't occur during the CRAN submission process. The errors/warnings arose after ace4c0a and are upper bounded by the latest commit on branch speedglm
. In case you haven't used it before, this is a perfect time to try out git bisect
(apologies if you're already familiar with this). I'll give it a go on my end in the morning.
As of ad656bc, there has been a substantial degree of tweaking of the core code base to allow speedglm
to work properly. Still, a few errors remain
* DONE
Status: 2 ERRORs
R CMD check results
2 errors | 0 warnings | 0 notes
checking examples ... ERROR
Running examples in ‘survtmle-Ex.R’ failed
The error most likely occurred in:
> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: mean_tmle
> ### Title: TMLE for G-Computation of Cumulative Incidence
> ### Aliases: mean_tmle
>
> ### ** Examples
... 14 lines ...
> fit1 <- mean_tmle(ftime = ftime, ftype = ftype,
+ trt = trt, adjustVars = adjustVars,
+ glm.trt = "W1 + W2",
+ glm.ftime = "trt + W1 + W2",
+ glm.ctime = "trt + W1 + W2")
Warning in rep(no, length.out = length(ans)) :
'x' is NULL so the result will be NULL
Error in ans[!test & ok] <- rep(no, length.out = length(ans))[!test & :
replacement has length zero
Calls: mean_tmle -> estimateTreatment -> ifelse
Execution halted
checking tests ... ERROR
Running ‘testthat.R’ [107s/142s]
Running the tests in ‘tests/testthat.R’ failed.
Last 13 lines of output:
ftypeOfInterest = ftypeOfInterest, trtOfInterest = trtOfInterest, bounds = bounds,
verbose = verbose, tol = tol, Gcomp = Gcomp, method = method)
7: paste0("trt + ", paste0("I(t==", seq_len(max(ftime)), ")", collapse = "+"), "+",
paste0("I(trt*t==", seq_len(max(ftime)), ")", collapse = "+"))
8: paste0("I(t==", seq_len(max(ftime)), ")", collapse = "+")
testthat results ================================================================
OK: 45 SKIPPED: 0 FAILED: 4
1. Error: hazard_tmle with no censoring works as expected (@test-hazard_tmle.R#47)
2. Error: hazard_tmle with glm and super learner with only glm give same answers (two failure types) (@test-hazard_tmle.R#197)
3. Error: hazard_tmle and mean_tmle timepoints equal Kaplan-Meier without covariates (@test-timepoints.R#34)
4. Error: hazard_tmle and mean_tmle equal aalen-johansen with no covariates (@test-timepoints.R#85)
Error: testthat unit tests failed
Execution halted
After resolving an error introduced in master
(probably best to avoid re-organizing the code base needlessly in the future), the remaining diffs between master
and speedglm
should all be contributing to getting the wrapper fast_glm
working...
The remaining errors produced by devtools::check()
are as follows:
R CMD check results
2 errors | 0 warnings | 0 notes
checking examples ... ERROR
Running examples in ‘survtmle-Ex.R’ failed
The error most likely occurred in:
> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: mean_tmle
> ### Title: TMLE for G-Computation of Cumulative Incidence
> ### Aliases: mean_tmle
>
> ### ** Examples
... 14 lines ...
> fit1 <- mean_tmle(ftime = ftime, ftype = ftype,
+ trt = trt, adjustVars = adjustVars,
+ glm.trt = "W1 + W2",
+ glm.ftime = "trt + W1 + W2",
+ glm.ctime = "trt + W1 + W2")
Warning in rep(no, length.out = length(ans)) :
'x' is NULL so the result will be NULL
Error in ans[!test & ok] <- rep(no, length.out = length(ans))[!test & :
replacement has length zero
Calls: mean_tmle -> estimateTreatment -> ifelse
Execution halted
checking tests ... ERROR
Running ‘testthat.R’ [23s/39s]
Running the tests in ‘tests/testthat.R’ failed.
Last 13 lines of output:
ftypeOfInterest = ftypeOfInterest, trtOfInterest = trtOfInterest, bounds = bounds,
verbose = verbose, tol = tol, Gcomp = Gcomp, method = method)
7: paste0("trt + ", paste0("I(t==", seq_len(max(ftime)), ")", collapse = "+"), "+",
paste0("I(trt*t==", seq_len(max(ftime)), ")", collapse = "+"))
8: paste0("I(t==", seq_len(max(ftime)), ")", collapse = "+")
testthat results ================================================================
OK: 38 SKIPPED: 0 FAILED: 4
1. Error: hazard_tmle with no censoring works as expected (@test-hazard_tmle.R#47)
2. Error: hazard_tmle with glm and super learner with only glm give same answers (two failure types) (@test-hazard_tmle.R#197)
3. Error: hazard_tmle and mean_tmle timepoints equal Kaplan-Meier without covariates (@test-timepoints.R#34)
4. Error: hazard_tmle and mean_tmle equal aalen-johansen with no covariates (@test-timepoints.R#85)
Error: testthat unit tests failed
Execution halted
76a1a767dc9d7afcdc172a4540719e2409c1d0cb seems to fix the timepoints
bug
Indeed, it seems the fitted model objects were getting passed to speedglm
in the do.call
part of timepoints
-- I had implemented a similar fix in a commit c89f859 that remained un-pushed before your fix in 76a1a76.
Our various fixes have been merged as of 4185d1a. All checks and tests on branch speedglm
are now passing.
Closing this since use of speedglm
is now functional on the eponymous branch.
631c0d8 implemented fixes in the wrapper function
fast_glm
, which now allowsspeedglm
to be invoked in the variousestimate...
functions as well as influctuateIteratedMean
.Unfortunately, this causes an error in
timepoints
-- in particular,outList[[ct]] <- do.call("survtmle", args = funOpts)
[line 112 in branchspeedglm
] contributes to the failure.The exact version of the package I've been using is here: survtmle_1.0.1.tar.gz -- probably best to
install.packages(..., type = "source")
to reproduce this error.Example taken from
README
.and
sessionInfo()
:@benkeser, I think you might be able to come up with thoughts on where this bug is originating since you're a bit more familiar with the core code base -- any thoughts?