Hopefully I'm not making some silly mistake here-- assuming that's not the case...
It seems like 'dynamic factor' style models (like the simplified kalman-filter below) won't compile: when I call model(), R just hangs. In contrast, everything seems to go well in Stan.
Hi @jwdink,
Thanks for letting me know!
That's annoying. It runs fine for me though; it takes less than a second to compile and converges quite nicely.
I'm using v0.2.2 (current release at greta-dev/master), though I suspect this is a strange system issue, rather than a version issue.
Presumably simple code compiles for you? E.g.:
m <- model(normal(0, 1))
What's the output of sessionInfo()?
BTW, this model would be much more efficient (for larger data, and in greta) if we could replace the for loop with cumsum(). Only I haven't provided a cumsum method for greta. D'oh! I'll open that as a separate issue.
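For the record, here's a minimal base-R sketch of that equivalence (an illustration, not the model's actual loop): a for loop that accumulates a running total is exactly cumsum().
x <- rnorm(5)
pos <- numeric(5)
pos[1] <- x[1]
for (i in 2:5)
  pos[i] <- pos[i - 1] + x[i]  # each step adds onto the previous total
stopifnot(all.equal(pos, cumsum(x)))  # identical result, one vectorised op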
Not fixed after upgrading to dev-version. :(
Here's my session info (from right before upgrading):
R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] greta_0.2.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.12 lattice_0.20-35 prettyunits_1.0.2 assertthat_0.2.0 digest_0.6.12 withr_2.0.0 grid_3.4.1
[8] R6_2.2.2 jsonlite_1.5 magrittr_1.5 coda_0.19-1 tfruns_0.9 progress_1.1.2 reticulate_1.0
[15] devtools_1.13.3 tools_3.4.1 parallel_3.4.1 compiler_3.4.1 tensorflow_1.3 memoise_1.1.0
I very recently installed greta (~1 week ago) and along with it tensorflow, so I doubt it's an issue with an old tensorflow version-- but let me know how to check the version if you think that's relevant.
EDIT: And yes, I can get other models to compile.
hmmmm...
Did the simple example run? Did you restart the R session before re-trying?
You can get the tensorflow version greta is looking at with:
library(tensorflow)
tf$`__version__`
I have 1.2.0
Does the following code (which requires greta >= v0.2.1) run without hanging?
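# construct greta's internal dag object for the model (the step before graph definition)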
dag <- .internals$inference$dag_class$new(list(latent_positions))
dag
The code ran without hanging.
Hmmm, I just upgraded to 1.3.0 and that works fine for me too. You could try 1.2.0, but I suspect that isn't it.
You could try passing the following arguments to model():
compiled_model <- model(latent_positions, compile = FALSE, n_cores = 1L)
but I'm grasping at straws really
I have the same versions of reticulate, tensorflow (R package) and R6.
I expect something's bugging out whilst translating the greta model to tensorflow code, so the following code will probably hang too, but it would be worth checking:
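# build the dag, then run the tensorflow graph-definition step that model() performs internally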
dag <- .internals$inference$dag_class$new(list(latent_positions))
dag$define_tf()
Is there any CPU activity when the model is compiling, or is it just sitting there?
I occasionally get the following message after aborting when it hangs:
Error in py_call_impl(callable, dots$args, dots$keywords) :
No such namespace: reticulate.
And yep, dag$define_tf() hangs too. There is definitely CPU activity.
Hmm, yeah I get those error messages when aborting too. Part of reticulate's error handling so I don't think it's connected.
How long have you tried leaving it for? I'm just wondering if it's running really slow, rather than stopping altogether.
Defining the tensorflow graph can be slow when there is recursion (for loops particularly), though this is a small one.
I left it for a good 30 minutes-- I had assumed this was definitely long enough, just because everything else I'd ever tried compiled in seconds.
Sounds like we're stumped for now! But let me know if you have anything else you think I should try.
Yeah, that's definitely too long to be normal behaviour.
One last-ditch attempt. Here's a version that removes the for loop by locally defining cumsum() for greta arrays. It reduces the model from 48 to 33 nodes. There's a chance it may have removed whatever was the problem node:
library (greta)
library (tensorflow)
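# grab greta's internal op() constructor, and use it to give greta arrays
# a cumsum method backed by TensorFlow's tf$cumsum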
op <- .internals$nodes$constructors$op
cumsum.greta_array <- function (x)
op("cumsum", x, tf_operation = tf$cumsum)
times <- 1:5
Y <- c(-3.12368290898237, -2.37268902597084, -3.31757075118491, -2.0882385972953, -2.23854863652599)
nobs <- length(Y)
sigma_measure <- lognormal(0, 1)
sigma_delta <- lognormal(0, 1);
sigma_position <- lognormal(0, 1);
latent_delta_sd <- c(ones(1), rep(sigma_delta, nobs - 1))
latent_delta <- normal(0, latent_delta_sd)
latent_delta <- cumsum(latent_delta)
latent_positions_sd <- c(ones(1), rep(sigma_position, nobs - 1))
latent_positions_mean <- c(-2, rep(0, nobs - 1))
latent_positions <- normal(latent_positions_mean, latent_positions_sd)
latent_positions <- cumsum(latent_positions) + latent_delta
# observed position:
distribution(Y) <- normal(latent_positions, sigma_measure);
compiled_model <- model(latent_positions)
This runs fine for me (and should be much quicker for big N).
It works!!!
Aha! Brilliant! Now I wonder why that is...
Since I can't replicate the bug, I'll close this issue. I'll put cumsum and cumprod in the next github release.
I know this issue is closed, but just to clarify...
On the third to last line, you wrote:
latent_positions <- cumsum(latent_positions) + latent_delta
...but this should be...
latent_positions <- cumsum(latent_positions + latent_delta)
...right?
(Not trying to be pedantic, just wanting to make sure I'm specifying these kinds of models correctly.)
Right, latent_delta had already been cumsum-ed, so my code was doing cumsum(a) + cumsum(b), but the Stan code is equivalent to cumsum(a + cumsum(b)), which is what you have above. 👍
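A quick numeric check of the difference, with made-up values:
a <- c(1, 2, 3)
b <- c(10, 20, 30)
cumsum(a) + cumsum(b)    # 11 33 66
cumsum(a + cumsum(b))    # 11 43 106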
Also, thanks for trying greta out and reporting bugs! Feel free to post feature requests too :)
For sure-- thanks for creating such a helpful package and being such an active maintainer!
BTW, it's worth noting that, with the cumsum modification above, the model does not sample well, and doesn't seem to converge. I've tried messing with the priors and setting inits, to no avail. The Stan equivalent does seem to sample well. I know HMC and NUTS aren't identical, but wasn't sure if there was an obvious change to the model you might recommend that would improve things.
Hmmm, I'd be happy to take a look. Could you repost the poorly-converging greta version to make sure we're on the same page?
This version converges pretty well for me:
rm(list = ls())
library (greta)
library (tensorflow)
op <- .internals$nodes$constructors$op
cumsum.greta_array <- function (x)
op("cumsum", x, tf_operation = tf$cumsum)
times <- 1:5
Y <- c(-3.12368290898237, -2.37268902597084, -3.31757075118491, -2.0882385972953, -2.23854863652599)
nobs <- length(Y)
sigma_measure <- lognormal(0, 1)
sigma_delta <- lognormal(0, 1);
sigma_position <- lognormal(0, 1);
latent_delta_sd <- c(ones(1), rep(sigma_delta, nobs - 1))
latent_delta <- normal(0, latent_delta_sd)
latent_delta <- cumsum(latent_delta)
latent_positions_sd <- c(ones(1), rep(sigma_position, nobs - 1))
latent_positions_mean <- c(-2, rep(0, nobs - 1))
latent_positions <- normal(latent_positions_mean, latent_positions_sd)
latent_positions <- cumsum(latent_positions + latent_delta)
# observed position:
distribution(Y) <- normal(latent_positions, sigma_measure);
compiled_model <- model(latent_positions)
draws_list <- replicate(3, mcmc(compiled_model))
draws <- coda::mcmc.list(draws_list)
bayesplot::mcmc_trace(draws)
sirt::mcmc.list.descriptives(draws)[, "Rhat"]
[1] 0.9981314 1.0045945 0.9994606 1.0115534 1.0017439
I apologize, I'd forgotten I'd posted such a small subset of the data (just to demonstrate the model-compilation issue).
The sampling issues come with the full dataset:
library (greta)
library (tensorflow)
op <- .internals$nodes$constructors$op
cumsum.greta_array <- function (x)
op("cumsum", x, tf_operation = tf$cumsum)
times <- 1:92
Y <- c(-3.12, -2.37, -3.32, -2.09, -2.24, -1.7, -1.84, -2.24, -1.14,
-1.7, -1.22, -1.17, -1, -1.1, -1.32, -1.59, 0.73, -1.18, -0.58,
-1.01, 0.01, -1.28, -0.4, -0.15, -0.2, 0.69, -0.26, -1.13, 0.43,
0.62, 0.88, -0.25, -0.2, 0.61, 0.73, 0.99, -0.46, -1.23, -0.19,
-0.64, 0.1, -0.03, -0.72, -0.5, -0.2, -0.72, -0.18, -0.3, -0.16,
-1.17, -0.58, 0.81, -0.29, -0.52, -0.55, -0.25, 1.06, 0.09, 1.13,
0.72, 0.13, -0.36, 1.02, 0.74, 0.47, 0.6, -0.22, 1.25, 1.56,
0.01, 0.15, 1.28, -0.03, 0.32, 0.22, 0.86, 1.44, -0.5, 1.36,
0.77, 1.6, 0.16, 0.52, -0.52, 0.41, 1.77, 1.2, 0.86, 1.67, 1.2,
0.68, 1.95, 1.55, 1.96, 2.42, 1.83)
#times <- times[1:60]; Y <- Y[1:60] # as ts-length decreases, sampling improves
nobs <- length(Y)
sigma_measure <- lognormal(0, 1)
sigma_delta <- lognormal(0, 1);
sigma_position <- lognormal(0, 1);
latent_delta_sd <- c(ones(1), rep(sigma_delta, nobs - 1))
latent_delta <- normal(0, latent_delta_sd)
latent_delta <- cumsum(latent_delta)
# latent_delta <- zeros(nobs) # uncomment to make sampling work
latent_positions_sd <- c(ones(1), rep(sigma_position, nobs - 1))
latent_positions_mean <- c(-2, rep(0, nobs - 1))
latent_positions <- normal(latent_positions_mean, latent_positions_sd)
latent_positions <- cumsum(latent_positions + latent_delta)
# observed position:
distribution(Y) <- normal(latent_positions, sigma_measure);
compiled_model <- model(latent_positions)
draws_list <- replicate(3, mcmc(compiled_model, warmup = 250))
draws_subset <- coda::mcmc.list(purrr::map(draws_list, ~.x[,1:5]))
bayesplot::mcmc_trace(draws_subset)
I noticed that the problems increase as the length of the time series increases from the original subset of 5 to the full length, with no sharp point where sampling performance suddenly drops, but instead a gradual degradation.
I also noticed that removing the 'delta' component makes the problems go away. However, sharpening the prior on sigma_delta does not help.
I didn't have any luck increasing warmup or specifying manual inits.
Hi @jwdink, sorry I dropped the ball on this!
There are a few things going on here:
1. Stan defines variables on the scale of the final versions of latent_delta and latent_positions, i.e. after doing the cumulative sum operation, whereas greta defines variables on the scale of the first versions of latent_delta and latent_positions, then does the cumulative sum operation to modify them. That doesn't affect the model, but it does affect the sampler, so it could be part of the difference in sampling efficiency for this particular model.
2. Issue 1 might be interacting with a funnel-shaped posterior effect that you sometimes get in hierarchical models. That can be ameliorated somewhat by using a non-centred reparameterisation of the model (see the sketch after this list).
3. Stan (by default) runs HMC on a 'diagonal euclidean metric', which it estimates during the warmup period. Sampling can be slow if the marginal posterior variances are higher for some parameters than for others, so Stan empirically estimates the marginal variance of each of the variables and adjusts the sampler so that steps are longer in the direction of variables with greater variances.
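As a minimal sketch of the non-centred trick in issue 2 (the same pattern as in the full script below), assuming a single vector of hierarchical effects:
library(greta)
n <- 10
sigma <- lognormal(0, 1)
theta_raw <- normal(0, 1, dim = n)  # variable defined on a fixed standard-normal scale
theta <- theta_raw * sigma          # deterministic scaling; distributed as normal(0, sigma)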
On the dev branch of the goldingn fork, I have now implemented HMC on a diagonal Euclidean metric, with estimation of the metric during warmup. The fiddly bit was actually making it play nicely with the stepsize adaptation.
On that branch, I have also added a chains argument to mcmc(), and exported the cumsum function. So here's a script that resolves issues 2 and 3 for your model:
# devtools::install_github("goldingn/greta@dev")
library (greta)
set.seed(2017-09-13)
times <- 1:92
Y <- c(-3.12, -2.37, -3.32, -2.09, -2.24, -1.7, -1.84, -2.24, -1.14,
-1.7, -1.22, -1.17, -1, -1.1, -1.32, -1.59, 0.73, -1.18, -0.58,
-1.01, 0.01, -1.28, -0.4, -0.15, -0.2, 0.69, -0.26, -1.13, 0.43,
0.62, 0.88, -0.25, -0.2, 0.61, 0.73, 0.99, -0.46, -1.23, -0.19,
-0.64, 0.1, -0.03, -0.72, -0.5, -0.2, -0.72, -0.18, -0.3, -0.16,
-1.17, -0.58, 0.81, -0.29, -0.52, -0.55, -0.25, 1.06, 0.09, 1.13,
0.72, 0.13, -0.36, 1.02, 0.74, 0.47, 0.6, -0.22, 1.25, 1.56,
0.01, 0.15, 1.28, -0.03, 0.32, 0.22, 0.86, 1.44, -0.5, 1.36,
0.77, 1.6, 0.16, 0.52, -0.52, 0.41, 1.77, 1.2, 0.86, 1.67, 1.2,
0.68, 1.95, 1.55, 1.96, 2.42, 1.83)
nobs <- length(Y)
sigma_measure <- lognormal(0, 1)
sigma_delta <- lognormal(0, 1);
sigma_position <- lognormal(0, 1);
latent_delta_sd <- c(ones(1), rep(sigma_delta, nobs - 1))
latent_delta_raw <- normal(0, 1, dim = nobs)
latent_delta <- latent_delta_raw * latent_delta_sd
latent_delta <- cumsum(latent_delta)
latent_positions_sd <- c(ones(1), rep(sigma_position, nobs - 1))
latent_positions_mean <- c(-2, rep(0, nobs - 1))
latent_positions_raw <- normal(0, 1, dim = nobs)
latent_positions <- latent_positions_mean + latent_positions_raw * latent_positions_sd
latent_positions <- cumsum(latent_positions + latent_delta)
# observed position:
distribution(Y) <- normal(latent_positions, sigma_measure);
compiled_model <- model(latent_positions)
draws <- mcmc(compiled_model, warmup = 1000, n_samples = 1000, chains = 2)
bayesplot::mcmc_trace(draws, pars = paste0("latent_positions", 1:5))
It's better, but not great. I'll see if I can optimise the sampler some more. Note the increased number of warmup iterations that are needed to get good estimates of the metric; Stan uses 2000 by default.
Thanks for following up! Really appreciate how much you've looked into this.
Oddly enough, though, when I run your exact code, both chains are flat (i.e., give the same value for all samples). I set the seed in the same way too...
This is a new computer, so freshly installed tensorflow etc.
> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] greta_0.2.2
loaded via a namespace (and not attached):
[1] Rcpp_0.12.12 bindr_0.1 magrittr_1.5 progress_1.1.2 munsell_0.4.3 colorspace_1.3-2 lattice_0.20-35 R6_2.2.2 rlang_0.1.2
[10] stringr_1.2.0 plyr_1.8.4 dplyr_0.7.3 tools_3.4.1 bayesplot_1.3.0 parallel_3.4.1 grid_3.4.1 gtable_0.2.0 coda_0.19-1
[19] tfruns_0.9.1 digest_0.6.12 lazyeval_0.2.0 assertthat_0.2.0 tibble_1.3.4 tensorflow_1.3.1 bindrcpp_0.2 reshape2_1.4.2 ggplot2_2.2.1
[28] glue_1.1.1.9000 labeling_0.3 stringi_1.1.5 compiler_3.4.1 scales_0.5.0 prettyunits_1.0.2 reticulate_1.1 jsonlite_1.5 pkgconfig_2.0.1
> tensorflow::tf_version()
[1] ‘1.3’
Ouch, yeah it looks like I broke the sampler adaptation on the dev branch since I posted that - I'm working on running multiple chains in parallel.
This commit is what I would have been referring to, and it runs for this model: devtools::install_github("goldingn/greta@ddd3a54")
I'll get this all patched up and merged into a working release. Thanks for spotting this and letting me know!
Bug now fixed on devtools::install_github("goldingn/greta@dev"), in case you want to play with the new evaluate() feature :)