vdorie / bartCause

Causal Inference using Bayesian Additive Regression Trees
73 stars 10 forks source link

TMLE Errors #21

Open alexrai93 opened 1 month ago

alexrai93 commented 1 month ago

Setting method.resp = "tmle" results in the following error:

calculating TMLE adjustment
Error in getTMLEEstimates(fit$y[commonSup.sub], trt[commonSup.sub], weights,  : 
  object 'results.list' not found

In addition, trying posteriorOfTMLE = F results in:

Error in if (deriv > 0) { : missing value where TRUE/FALSE needed
Error in if (deriv > 0) { : missing value where TRUE/FALSE needed
Error in tmle(Y = y, A = z, W = matrix(0, length(y), 1L), Q = cbind(Q0W = mu.hat.0,  : 
  object 'IC.ATC' not found

Otherwise it works great, e.g. if using "p.weight" or "bart."

R 4.1. tmle = 2.0.1.1, Windows 11.

vdorie commented 1 month ago

Unfortunately it looks as if tmle didn't return and I'll need more information about the data/model to figure out why, since I can't easily reproduce the problem. It looks as if it is a problem with the tmle package itself. I modified the error message slightly in1ba4193 to hopefully return some useful information.

On Mon, Aug 5, 2024 at 6:54 PM alexrai93 @.***> wrote:

Setting method.resp = "tmle" results in the following error:

calculating TMLE adjustment Error in getTMLEEstimates(fit$y[commonSup.sub], trt[commonSup.sub], weights, : object 'results.list' not found

In addition, trying posteriorOfTMLE = F results in:

Error in if (deriv > 0) { : missing value where TRUE/FALSE needed Error in if (deriv > 0) { : missing value where TRUE/FALSE needed Error in tmle(Y = y, A = z, W = matrix(0, length(y), 1L), Q = cbind(Q0W = mu.hat.0, : object 'IC.ATC' not found

Otherwise it works great, e.g. if using "p.weight" or "bart."

R 4.1. tmle = 2.0.1.1, Windows 11.

— Reply to this email directly, view it on GitHub https://github.com/vdorie/bartCause/issues/21, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAE5P4ADJE5JOHVZK6K2IZDZQAUEDAVCNFSM6AAAAABMBK32RGVHI2DSMVQWIX3LMV43ASLTON2WKOZSGQ2DSNZWG4YTKNQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

alexrai93 commented 1 month ago

Thanks for checking into it, this was the code and data I am using to test out the package:

confounders <- c("RetailPrice", "WeekDay")

mybart <-bartc(
  response = SalesData$Sales,
  treatment = SalesData$Promo,
  confounders = as.matrix(SalesData[, confounders]),
  method.rsp = "tmle",
  method.trt = "bart",
  commonSup.rule = "none",
  keepTrees = T,
  n.samples = 1000,
)
SalesData <- structure(list(TrsDate = 1:143, Sales = c(247.05, 278.99, 362.34, 
384.3, 477.63, 345.87, 274.5, 296.46, 334.89, 400.77, 521.55, 
351.36, 203.13, 197.64, 230.58, 263.52, 362.34, 334.89, 466.65, 
219.6, 247.05, 258.03, 290.97, 290.97, 527.04, 389.79, 334.89, 
241.56, 201.15, 299.97, 312.93, 329.4, 296.46, 274.5, 367.83, 
206.62, 247.05, 340.38, 461.16, 329.4, 285.48, 274.5, 225.09, 
170.19, 225.09, 340.38, 340.38, 186.66, 285.48, 340.38, 296.46, 
323.91, 428.22, 192.15, 334.89, 334.89, 269.01, 284.98, 351.36, 
400.77, 318.42, 219.6, 263.52, 257.05, 323.91, 312.93, 488.61, 
271.51, 329.4, 263.52, 269.01, 318.42, 274.5, 307.44, 378.81, 
318.42, 269.01, 251.04, 258.03, 323.91, 422.73, 296.46, 263.52, 
285.48, 334.89, 356.85, 356.85, 505.08, 312.93, 266.26, 279.99, 
274.5, 345.87, 323.91, 378.7, 384.3, 307.44, 285.48, 285.48, 
345.87, 329.4, 502.08, 389.79, 312.93, 230.58, 306.94, 312.93, 
285.48, 285.48, 345.37, 312.93, 258.03, 236.07, 236.07, 263.52, 
285.48, 299.46, 76.86, 98.82, 263.87, 424.55, 474.45, 524.35, 
454.49, 374.65, 451.48, 458.98, 405.49, 510.95, 428.94, 447.45, 
403.45, 251.05, 274.5, 323.91, 340.38, 296.46, 323.91, 269.01, 
258.03, 274.5, 181.17, 219.6), RetailPrice = c(5.48, 5.6, 5.58, 
5.56, 5.64, 5.62, 5.45, 5.23, 5.36, 5.28, 5.49, 5.47, 5.52, 5.4, 
5.45, 5.56, 5.45, 5.43, 5.58, 5.52, 5.49, 5.56, 5.44, 5.56, 5.45, 
5.51, 5.41, 5.31, 5.71, 5.48, 5.59, 5.56, 5.46, 5.39, 5.55, 5.31, 
5.73, 5.37, 5.41, 5.41, 5.48, 5.5, 5.56, 5.32, 5.57, 5.55, 5.44, 
5.35, 5.43, 5.39, 5.5, 5.41, 5.41, 5.54, 5.4, 5.54, 5.34, 5.55, 
5.31, 5.37, 5.58, 5.52, 5.66, 5.53, 5.44, 5.61, 5.39, 5.49, 5.56, 
5.5, 5.38, 5.48, 5.64, 5.55, 5.39, 5.43, 5.64, 5.67, 5.23, 5.64, 
5.16, 5.58, 5.35, 5.63, 5.27, 5.49, 5.35, 5.64, 5.44, 5.45, 5.48, 
5.37, 5.53, 5.58, 5.51, 5.57, 5.47, 5.44, 5.36, 5.27, 5.48, 5.33, 
5.46, 5.42, 5.52, 5.41, 5.66, 5.44, 5.42, 5.55, 5.4, 5.62, 5.3, 
5.41, 5.42, 5.58, 5.61, 5.43, 5.43, 5.01, 4.97, 5.02, 5.03, 4.91, 
4.85, 3.95, 4.04, 4.13, 3.91, 4.04, 3.96, 3.97, 5.54, 5.55, 5.44, 
5.77, 5.58, 5.44, 5.47, 5.41, 5.34, 5.59, 5.38), WeekDay = c(5L, 
6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 
3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 
5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 
7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 
4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 
6L, 7L, 1L, 2L, 3L, 4L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L), Promo = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, 
-143L), class = "data.frame")
vdorie commented 1 month ago

It looks as if the issue is coming from tmle, as I'm able to replicate with the example data by executing:

lm_fit <- lm(Sales ~ Promo + RetailPrice + WeekDay, data = SalesData)

tmle::tmle(
  Y = SalesData$Sales,
  A = SalesData$Promo,
  W = matrix(0, nrow(SalesData), 1),
  Q = cbind(
    predict(lm_fit, within(SalesData, Promo <- 0)),
    predict(lm_fit, within(SalesData, Promo <- 1))
  ),
  g1W = fitted(glm(Promo ~ RetailPrice + WeekDay, data = SalesData, family = binomial))
)

Unfortunately I'm not that familiar with how tmle is implemented, but it seems as if the oneStepATT function is failing because somehow its arguments get replaced with NAs.

alexrai93 commented 1 month ago

That was helpful, the optional g1W parameter in your code above shows my test data did not converge, best I can tell this result below gets passed to oneStepATT with NA values because b.ATT.rows is out of range, which triggers an error for NA in if(Deriv > 0) in the function.

g1W = g.ATT[b.ATT.rows]

If I remove g1W then tmle runs fine with the above - probably this is not an issue with real data, thanks again for checking into it.

alexrai93 commented 1 month ago

Confirmed, I changed a few values in the Promo variable so the glm converged - and that function above and the bartc both worked fine after. So just a convergence issue.