Closed ymzhang-neo closed 3 years ago
Here is the code for my example:
set.seed(1)
library(bartCause)
## fit a simple linear model
n <- 100L
beta.z <- c(.75, -0.5, 0.25)
beta.y <- c(.5, 1.0, -1.5)
sigma <- 2
# set.seed(725)
x <- matrix(rnorm(3 * n), n, 3)
tau <- rgamma(1L, 0.25 * 16 * rgamma(1L, 1 * 32, 32), 16)
p.score <- pnorm(x %*% beta.z)
z <- rbinom(n, 1, p.score)
mu.0 <- x %*% beta.y
mu.1 <- x %*% beta.y + tau
y <- mu.0 * (1 - z) + mu.1 * z + rnorm(n, 0, sigma)
# low parameters only for example
fit <- bartc(y, z, x, n.samples = 100L, n.burn = 15L, n.chains = 2L)
summary(fit)
all.equal(fit$seed, .Random.seed) # These two are identical
Here are the two chunks of output from summary(fit)
. I run the same code in different R sessions, with the same R version (session info included in the end) on the same device.
Output from Round 1:
Call: bartc(response = y, treatment = z, confounders = x, n.samples = 100L,
n.burn = 15L, n.chains = 2L)
Causal inference model fit by:
model.rsp: bart
model.trt: bart
Treatment effect (pate):
estimate sd ci.lower ci.upper
ate -0.2827 0.5957 -1.45 0.8848
Estimates fit from 100 total observations
95% credible interval calculated by: normal approximation
population TE approximated by: posterior predictive distribution
Result based on 100 posterior samples times 2 chains
Session Info from Round 1:
R version 4.0.5 (2021-03-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] bartCause_1.0-4
loaded via a namespace (and not attached):
[1] compiler_4.0.5 parallel_4.0.5 tools_4.0.5 dbarts_0.9-19
Output from Round 2:
Call: bartc(response = y, treatment = z, confounders = x, n.samples = 100L,
n.burn = 15L, n.chains = 2L)
Causal inference model fit by:
model.rsp: bart
model.trt: bart
Treatment effect (pate):
estimate sd ci.lower ci.upper
ate -0.3017 0.6115 -1.5 0.8969
Estimates fit from 100 total observations
95% credible interval calculated by: normal approximation
population TE approximated by: posterior predictive distribution
Result based on 100 posterior samples times 2 chains
Session Info from Round 2 (same from Round 1, but I am including it for completeness):
R version 4.0.5 (2021-03-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] bartCause_1.0-4
loaded via a namespace (and not attached):
[1] compiler_4.0.5 parallel_4.0.5 tools_4.0.5 dbarts_0.9-19
Thanks for reporting this, and for providing a reproducible example! I'll look into how the seeds are set, but the main issue is with parallelization and the fact that each thread receives its own pRNG. If you want reproducible results before then, you can run with n.threads = 1L
. In the context of the example:
set.seed(1)
fit.0 <- bartc(y, z, x, n.samples = 100L, n.burn = 15L, n.chains = 2L, n.threads = 1L)
ate.0 <- summary(fit1)$estimates$estimate[1]
for (i in 1:1000) {
set.seed(1)
fit <- bartc(y, z, x, n.samples = 100L, n.burn = 15L, n.chains = 2L, n.threads = 1L, verbose = FALSE)
stopifnot(summary(fit)$estimates$estimate[1] == ate.0)
}
The seed that is saved with the object is primarily used to generate posterior predictive samples in a reproducible way.
Added in 31200c1aba549186e417fa6c300aa7b727d5f2d6. bartc
now accepts a seed
argument that will yield reproducible results when running with multiple threads. Requires installing dbarts from github.
Example:
set.seed(1)
library(bartCause)
## fit a simple linear model
n <- 100L
beta.z <- c(.75, -0.5, 0.25)
beta.y <- c(.5, 1.0, -1.5)
sigma <- 2
# set.seed(725)
x <- matrix(rnorm(3 * n), n, 3)
tau <- rgamma(1L, 0.25 * 16 * rgamma(1L, 1 * 32, 32), 16)
p.score <- pnorm(x %*% beta.z)
z <- rbinom(n, 1, p.score)
mu.0 <- x %*% beta.y
mu.1 <- x %*% beta.y + tau
y <- mu.0 * (1 - z) + mu.1 * z + rnorm(n, 0, sigma)
# low parameters only for example
fit <- bartc(y, z, x, n.samples = 100L, n.burn = 15L, n.chains = 2L, seed = 12345)
print(summary(fit))
fitting treatment model via method 'bart'
fitting response model via method 'bart'
Call: bartc(response = y, treatment = z, confounders = x, seed = 12345,
n.samples = 100L, n.burn = 15L, n.chains = 2L)
Causal inference model fit by:
model.rsp: bart
model.trt: bart
Treatment effect (pate):
estimate sd ci.lower ci.upper
ate -0.4052 0.636 -1.652 0.8414
Estimates fit from 100 total observations
95% credible interval calculated by: normal approximation
population TE approximated by: posterior predictive distribution
Result based on 100 posterior samples times 2 chains
fit <- bartc(y, z, x, n.samples = 100L, n.burn = 15L, n.chains = 2L, seed = 12345)
print(summary(fit))
fitting treatment model via method 'bart'
fitting response model via method 'bart'
Call: bartc(response = y, treatment = z, confounders = x, seed = 12345,
n.samples = 100L, n.burn = 15L, n.chains = 2L)
Causal inference model fit by:
model.rsp: bart
model.trt: bart
Treatment effect (pate):
estimate sd ci.lower ci.upper
ate -0.4052 0.636 -1.652 0.8414
Estimates fit from 100 total observations
95% credible interval calculated by: normal approximation
population TE approximated by: posterior predictive distribution
Result based on 100 posterior samples times 2 chains
Dear @vdorie , I am using
bartc
and notice that the results would differ slightly even with the sameset.seed()
specified in the beginning. Please refer to a short example below that I adopted from the documentation example, with the output ofsummary.bartcFit
.Meanwhile, according to the package documentation, the fitted object will contain a component
seed
, which seems to be.Random.seed
after the code execution. May I ask how I can use that seed for reproducible code?Thank you for the consideration!