Closed njtierney closed 2 years ago
So far, tracing this back to
valid_parameters = function(parameters) {
browser()
dag <- self$model$dag
# tfe <- dag$tf_environment
#
# if (!live_pointer("joint_density_adj", envir = tfe)) {
# # dag$on_graph(
# # dag$define_joint_density()
# # )
# }
#
# dag$send_parameters(parameters)
# ld <- self$model$dag$log_density()
tf_parameters <- fl(array(
data = parameters,
dim = c(1, length(parameters))
))
ld <- lapply(
dag$tf_log_prob_function(tf_parameters),
as.numeric
)
is.finite(ld$adjusted) && is.finite(ld$unadjusted)
},
in inference_class.R
where tf_parameters
is
tf.Tensor([[ 0.04595352 -0.12635538 -0.02149962]], shape=(1, 3), dtype=float64)
And then running tf_log_prob_function(tf_parameters)
returns
tf.Tensor(
[[[-inf]
[-inf]
[-inf]]], shape=(1, 3, 1), dtype=float64)
Similarly, in test_joint.R
we get these errors:
## related to the above - test_joint.R ========================================
devtools::load_all(".")
#> ℹ Loading greta
#> ℹ Initialising python and checking dependencies, this may take a moment.
#>
#> ✔ Initialising python and checking dependencies ... done!
#> Loaded Tensorflow version 2.9.2
source("tests/testthat/helpers.R")
## error 1 - Error: Could not find reasonable starting values after 20 attempts.
obs <- matrix(rbinom(300, 1, 0.5), 100, 3)
probs <- variable(0, 1, dim = 3)
distribution(obs) <- joint(
bernoulli(probs[1]),
bernoulli(probs[2]),
bernoulli(probs[3]),
dim = 100
)
sample_distribution(probs)
#> Error: Could not find reasonable starting values after 20 attempts.
#> Please specify initial values manually via the `initial_values` argument
## error 2 - Error: all(above_lower & below_upper) is not TRUE
x <- joint(
normal(0, 1, truncation = c(0, Inf)),
normal(0, 2, truncation = c(-Inf, 0)),
normal(-1, 1, truncation = c(1, 2))
)
sample_distribution(x, lower = c(0, -Inf, 1), upper = c(Inf, 0, 2))
#> Error: all(above_lower & below_upper) is not TRUE
#>
#> `actual`: FALSE
#> `expected`: TRUE
## error 3 - Error: Could not find reasonable starting values after 20 attempts.
x <- joint(
uniform(0, 1),
uniform(0, 2),
uniform(-1, 0)
)
sample_distribution(x, lower = c(0, 0, -1), upper = c(1, 2, 0))
#> Error: Could not find reasonable starting values after 20 attempts.
#> Please specify initial values manually via the `initial_values` argument
# intriguingly, this also fails:
sample_distribution(uniform(0,1))
#> Error: Could not find reasonable starting values after 20 attempts.
#> Please specify initial values manually via the `initial_values` argument
Created on 2022-10-04 by the reprex package (v2.0.1)
There's an interesting clue here, in test_joint.R, I was noticing a pattern and here's a small example of some distribution code that fails.
sample_distribution(uniform(0,1))
which seems crazy, given that uniform should be easy to sample? can we run mcmc on it?
uniform_samples <- mcmc(model(uniform(0,1)))
This errors - what?
But we can sample a normal, like below, which is good
normal_samples <- mcmc(model(normal(0,1)))
So this is happening at a pretty deep place I guess. Looking at the trace:
# 1: mcmc(model(uniform(0, 1)))
# 2: inference.R#201: with(tf$device(compute_options), {
# trace_batch_size
# 3: with.python.builtin.object(tf$device(compute_options), {
# trace_batch
# 4: tryCatch(force(expr), finally = {
# data$`__exit__`(NULL, NULL, NULL)
# }
# 5: tryCatchList(expr, classes, parentenv, handlers)
# 6: force(expr)
# 7: inference.R#258: lapply(initial_values_split, build_sampler, sampler, m
# 8: FUN(X[[i]], ...)
# 9: utils.R#554: sampler$class$new(initial_values, model, sampler$parameter
# 10: initialize(...)
# 11: inference_class.R#302: super$initialize(initial_values = initial_values
# 12: inference_class.R#51: self$set_initial_values(initial_values)
# 13: inference_class.R#154: lapply(init_list, self$check_initial_values)
# 14: FUN(X[[i]], ...)
So this is happening at self$check_initial_values
self$check_initial_values
does the following
checks the values are intialised/set - for those that aren't
it samples a small value from rnorm
then it checks that the values are valid with self$valid_parameters
self$valid_parameters
does the following:
valid_parameters = function(parameters) {
dag <- self$model$dag
tf_parameters <- fl(array(
data = parameters,
dim = c(1, length(parameters))
))
ld <- lapply(
dag$tf_log_prob_function(tf_parameters),
as.numeric
)
is.finite(ld$adjusted) && is.finite(ld$unadjusted)
my understanding of this (tf_log_prob_function
) does
self$generate_log_prob_function()
which has a lot of code to fit in here.The issue is that the values returned by tf_log_prob_function
are -Inf or Inf for when sampling from uniform, but not for normal so when we run
uniform_samples <- mcmc(model(uniform(0,1)))
and trace this through to the valid_parameters
function, here tf_parameters
contains:
tf.Tensor([[-0.06387527]], shape=(1, 1), dtype=float64)
and then dag$tf_log_prob_function(tf_parameters)
returns
$adjusted
tf.Tensor([-inf], shape=(1), dtype=float64)
$unadjusted
tf.Tensor([-inf], shape=(1), dtype=float64)
But this is what happens with normal
normal_samples <- mcmc(model(normal(0,1)))
tf_parameters
tf.Tensor([[-0.11163191]], shape=(1, 1), dtype=float64)
lapply(
dag$tf_log_prob_function(tf_parameters),
as.numeric
)
gives
# $adjusted
# [1] -0.9251694
#
# $unadjusted
# [1] -0.9251694
So this tells us that something is going on with tf_log_prob_function
Need to investigate more there
Is tf_parameters
supposed to be the free state? Or the actual values for the parameters? I think from the context it's supposed to be the free state.
-0.06387527 would be an invalid suggestion for the actual parameter of U(0, 1), so it should return Inf
when checking the density. But that should never happen; the rnorm()
bit should be generating proposals on the free state, and then inside tf_log_prob_function()
those should be transformed into appropriate parameter values (so converted to the unit interval via a logit transform in the case of uniform). It's worth double checking that that is happening. there was some funny stuff with the chained bijectors (which do this transformation) before, so maybe that's where the bug is for this one?
Thanks, Nick!
Tracing the function through the debugger I believe tf_parameters
is the free state.
-0.06387527 would be an invalid suggestion for the actual parameter of U(0, 1), so it should return Inf when checking the density.
Yup that makes sense
But that should never happen; the
rnorm()
bit should be generating proposals on the free state, and then insidetf_log_prob_function()
those should be transformed into appropriate parameter values (so converted to the unit interval via a logit transform in the case of uniform). It's worth double checking that that is happening. there was some funny stuff with the chained bijectors (which do this transformation) before, so maybe that's where the bug is for this one?
Hmmm! So I wonder if that is supposed to happen inside of evaluate_density
?
Basically the parts inside of tf_log_prob_function
are:
# temporarily define a new environment
tfe_old <- self$tf_environment
on.exit(self$tf_environment <- tfe_old)
tfe <- self$tf_environment <- new.env()
# put the free state in the environment, and build out the tf graph
tfe$free_state <- free_state
# we now make all of the operations define themselves now
self$define_tf()
# define the densities
self$define_joint_density()
And then define_joint_density()
is
define_joint_density = function() {
# browser()
tfe <- self$tf_environment
# get all distribution nodes that have a target
distribution_nodes <- self$node_list[self$node_types == "distribution"]
target_nodes <- lapply(distribution_nodes, member, "get_tf_target_node()")
has_target <- !vapply(target_nodes, is.null, FUN.VALUE = TRUE)
distribution_nodes <- distribution_nodes[has_target]
target_nodes <- target_nodes[has_target]
# get the densities, evaluated at these targets
densities <- mapply(self$evaluate_density,
distribution_nodes,
target_nodes,
SIMPLIFY = FALSE
)
Then stepping into evaluate_density
- it looks like:
evaluate_density = function(distribution_node, target_node) {
tfe <- self$tf_environment
parameter_nodes <- distribution_node$parameters
# get the tensorflow objects for these
distrib_constructor <- self$get_tf_object(distribution_node)
tf_target <- self$get_tf_object(target_node)
tf_parameter_list <- lapply(parameter_nodes, self$get_tf_object)
# execute the distribution constructor functions to return a tfp
# distribution object
tfp_distribution <- distrib_constructor(tf_parameter_list, dag = self)
# browser()
self$tf_evaluate_density(tfp_distribution,
tf_target,
truncation = distribution_node$truncation,
bounds = distribution_node$bounds
)
},
debugging this, and running
self$tf_evaluate_density(tfp_distribution,
tf_target,
truncation = distribution_node$truncation,
bounds = distribution_node$bounds
)
Gives
tf.Tensor([[[-inf]]], shape=(1, 1, 1), dtype=float64)
So, then, stepping into self$tf_evaluate_density
We get:
tf_evaluate_density = function(tfp_distribution,
tf_target,
truncation = NULL,
bounds = NULL) {
# get the uncorrected log density
ld <- tfp_distribution$log_prob(tf_target)
# if required, calculate the log-adjustment to the truncation term of
# the density function i.e. the density of a distribution, truncated
# between a and b, is the non truncated density, divided by the integral
# of the density function between the truncation bounds. This can be
# calculated from the distribution's CDF
if (!is.null(truncation)) {
lower <- truncation[[1]]
upper <- truncation[[2]]
if (all(lower == bounds[1])) {
# if only upper is constrained, just need the cdf at the upper
offset <- tfp_distribution$log_cdf(fl(upper))
} else if (all(upper == bounds[2])) {
# if only lower is constrained, get the log of the integral above it
offset <- tf$math$log(fl(1) - tfp_distribution$cdf(fl(lower)))
} else {
# if both are constrained, get the log of the integral between them
offset <- tf$math$log(tfp_distribution$cdf(fl(upper)) -
tfp_distribution$cdf(fl(lower)))
}
ld <- ld - offset
}
ld
},
So debugging this, tf_target
is
Browse[5]> tf_target
tf.Tensor([[[1.47212097]]], shape=(1, 1, 1), dtype=float64)
And then we're at the point where we calculate the log_prob
ld <- tfp_distribution$log_prob(tf_target)
And it seems that perhaps tf_target
hasn't been transformed yet? Because then we get
Browse[5]> tfp_distribution$log_prob(tf_target)
tf.Tensor([[[-inf]]], shape=(1, 1, 1), dtype=float64)
And having a bit of a poke around https://github.com/greta-dev/greta/blob/master/R/dag_class.R
It looks like this hasn't changed much from the TF2 branch.
So I'm not quite sure how to solve this issue of ensuring that the values are appropriately transformed? I'm most likely missing something!
OK so I think we might be onto the right path regarding the issue with the chained bijector.
I'm just not sure how to get the TF code to evaluate so I can interactively debug the bijectors, which would make debugging it a lot easier.
OK so the bug was that
tf_scalar_neg_pos_bijector <- function(dim, lower, upper) {
tf_scalar_biject(
# tfp$bijectors$AffineScalar(shift = fl(lower), scale = fl(upper - lower)),
tfb_shift_scale(shift = fl(lower), scale = fl(upper - lower)),
tfp$bijectors$Sigmoid(),
dim = dim
)
}
had the wrong value for shift
- it was upper
. So we were getting a shift and scale of 1 and 1 for uniform...and other cases.
test_distributions.R now works
devtools::load_all(".")
#> ℹ Loading greta
#> ℹ Initialising python and checking dependencies, this may take a moment.
#>
#> ✔ Initialising python and checking dependencies ... done!
#> Loaded Tensorflow version 2.9.2
source("tests/testthat/helpers.R")
## error 1
# multivariate discrete
y <- extraDistr::rmnom(5, size = 4, prob = runif(3))
p <- uniform(0, 1, dim = 3)
distribution(y) <- multinomial(4, t(p), n_realisations = 5)
sample_distribution(p)
## error 2
alpha <- uniform(0, 10, dim = c(1, 5))
x <- dirichlet(alpha)
m <- model(x)
draws <- mcmc(m, n_samples = 100, warmup = 100, verbose = FALSE)
## error 3
n <- 10
k <- 3
# multinomial
size <- 5
x <- t(rmultinom(n, size, runif(k)))
p <- uniform(0, 1, dim = c(n, k))
distribution(x) <- multinomial(size, p)
m <- model(p)
expect_ok(draws <- mcmc(m, warmup = 0, n_samples = 5, verbose = FALSE))
## error 4
n <- 10
k <- 3
# categorical
x <- t(rmultinom(n, 1, runif(k)))
p <- uniform(0, 1, dim = c(n, k))
distribution(x) <- categorical(p)
m <- model(p)
expect_ok(draws <- mcmc(m, warmup = 0, n_samples = 5, verbose = FALSE))
Created on 2022-10-11 by the reprex package (v2.0.1)
and the tests for test_joint.R
now work
devtools::load_all(".")
#> ℹ Loading greta
#> ℹ Initialising python and checking dependencies, this may take a moment.
#>
#> ✔ Initialising python and checking dependencies ... done!
#> Loaded Tensorflow version 2.9.2
source("tests/testthat/helpers.R")
## error 1 - Error: Could not find reasonable starting values after 20 attempts.
obs <- matrix(rbinom(300, 1, 0.5), 100, 3)
probs <- variable(0, 1, dim = 3)
distribution(obs) <- joint(
bernoulli(probs[1]),
bernoulli(probs[2]),
bernoulli(probs[3]),
dim = 100
)
sample_distribution(probs)
## error 2 - Error: all(above_lower & below_upper) is not TRUE
x <- joint(
normal(0, 1, truncation = c(0, Inf)),
normal(0, 2, truncation = c(-Inf, 0)),
normal(-1, 1, truncation = c(1, 2))
)
sample_distribution(x, lower = c(0, -Inf, 1), upper = c(Inf, 0, 2))
## error 3 - Error: Could not find reasonable starting values after 20 attempts.
x <- joint(
uniform(0, 1),
uniform(0, 2),
uniform(-1, 0)
)
sample_distribution(x, lower = c(0, 0, -1), upper = c(1, 2, 0))
# intriguingly, this also fails:
sample_distribution(uniform(0,1))
Created on 2022-10-11 by the reprex package (v2.0.1)
NOTE: This is in the Tensorflow 2 development branch (https://github.com/greta-dev/greta/pull/534)
4 errors for the distributions,
multinomial
,dirichlet
, andcategorical
:Created on 2022-09-30 by the reprex package (v2.0.1)
Session info
``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.2.0 (2022-04-22) #> os macOS Monterey 12.3.1 #> system aarch64, darwin20 #> ui X11 #> language (EN) #> collate en_AU.UTF-8 #> ctype en_AU.UTF-8 #> tz Australia/Brisbane #> date 2022-09-30 #> pandoc 2.19.2 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> ! package * version date (UTC) lib source #> abind 1.4-5 2016-07-21 [1] CRAN (R 4.2.0) #> backports 1.4.1 2021-12-13 [1] CRAN (R 4.2.0) #> base64enc 0.1-3 2015-07-28 [1] CRAN (R 4.2.0) #> brio 1.1.3 2021-11-30 [1] CRAN (R 4.2.0) #> cachem 1.0.6 2021-08-19 [1] CRAN (R 4.2.0) #> callr 3.7.2 2022-08-22 [1] CRAN (R 4.2.0) #> cli 3.3.0.9000 2022-06-15 [1] Github (r-lib/cli@31a5db5) #> coda 0.19-4 2020-09-30 [1] CRAN (R 4.2.0) #> codetools 0.2-18 2020-11-04 [1] CRAN (R 4.2.0) #> crayon 1.5.1 2022-03-26 [1] CRAN (R 4.2.0) #> desc 1.4.2 2022-09-08 [1] CRAN (R 4.2.0) #> devtools 2.4.4 2022-07-20 [1] CRAN (R 4.2.0) #> digest 0.6.29 2021-12-01 [1] CRAN (R 4.2.0) #> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0) #> evaluate 0.16 2022-08-09 [1] CRAN (R 4.2.0) #> extraDistr 1.9.1 2020-09-07 [1] CRAN (R 4.2.0) #> fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.0) #> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.0) #> fs 1.5.2 2021-12-08 [1] CRAN (R 4.2.0) #> future 1.27.0 2022-07-22 [1] CRAN (R 4.2.0) #> globals 0.16.0 2022-08-05 [1] CRAN (R 4.2.0) #> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0) #> P greta * 0.4.2.9000 2022-09-19 [?] load_all() #> here 1.0.1 2020-12-13 [1] CRAN (R 4.2.0) #> highr 0.9 2021-04-16 [1] CRAN (R 4.2.0) #> hms 1.1.1 2021-09-26 [1] CRAN (R 4.2.0) #> htmltools 0.5.3 2022-07-18 [1] CRAN (R 4.2.0) #> htmlwidgets 1.5.4 2021-09-08 [1] CRAN (R 4.2.0) #> httpuv 1.6.5 2022-01-05 [1] CRAN (R 4.2.0) #> jsonlite 1.8.0 2022-02-22 [1] CRAN (R 4.2.0) #> knitr 1.40 2022-08-24 [1] CRAN (R 4.2.0) #> later 1.3.0 2021-08-18 [1] CRAN (R 4.2.0) #> lattice 0.20-45 2021-09-22 [1] CRAN (R 4.2.0) #> lifecycle 1.0.2 2022-09-09 [1] CRAN (R 4.2.0) #> listenv 0.8.0 2019-12-05 [1] CRAN (R 4.2.0) #> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0) #> Matrix 1.4-1 2022-03-23 [1] CRAN (R 4.2.0) #> memoise 2.0.1 2021-11-26 [1] CRAN (R 4.2.0) #> mime 0.12 2021-09-28 [1] CRAN (R 4.2.0) #> miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.2.0) #> parallelly 1.32.1 2022-07-21 [1] CRAN (R 4.2.0) #> pillar 1.8.1 2022-08-19 [1] CRAN (R 4.2.0) #> pkgbuild 1.3.1 2021-12-20 [1] CRAN (R 4.2.0) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0) #> pkgload 1.3.0 2022-06-27 [1] CRAN (R 4.2.0) #> png 0.1-7 2013-12-03 [1] CRAN (R 4.2.0) #> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.2.0) #> processx 3.7.0 2022-07-07 [1] CRAN (R 4.2.0) #> profvis 0.3.7 2020-11-02 [1] CRAN (R 4.2.0) #> progress 1.2.2 2019-05-16 [1] CRAN (R 4.2.0) #> promises 1.2.0.1 2021-02-11 [1] CRAN (R 4.2.0) #> ps 1.7.1 2022-06-18 [1] CRAN (R 4.2.0) #> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.2.0) #> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.2.0) #> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.2.0) #> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.2.0) #> R.utils 2.12.0 2022-06-28 [1] CRAN (R 4.2.0) #> R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.0) #> Rcpp 1.0.9 2022-07-08 [1] CRAN (R 4.2.0) #> remotes 2.4.2 2021-11-30 [1] CRAN (R 4.2.0) #> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.2.0) #> reticulate 1.25 2022-05-11 [1] CRAN (R 4.2.0) #> rlang 1.0.5 2022-08-31 [1] CRAN (R 4.2.0) #> rmarkdown 2.16 2022-08-24 [1] CRAN (R 4.2.0) #> rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.2.0) #> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.2.0) #> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0) #> shiny 1.7.2 2022-07-19 [1] CRAN (R 4.2.0) #> stringi 1.7.8 2022-07-11 [1] CRAN (R 4.2.0) #> stringr 1.4.1 2022-08-20 [1] CRAN (R 4.2.0) #> styler 1.7.0 2022-03-13 [1] CRAN (R 4.2.0) #> tensorflow 2.9.0 2022-05-21 [1] CRAN (R 4.2.0) #> testthat * 3.1.4 2022-04-26 [1] CRAN (R 4.2.0) #> tfautograph 0.3.2 2021-09-17 [1] CRAN (R 4.2.0) #> tfruns 1.5.0 2021-02-26 [1] CRAN (R 4.2.0) #> tibble 3.1.8 2022-07-22 [1] CRAN (R 4.2.0) #> urlchecker 1.0.1 2021-11-30 [1] CRAN (R 4.2.0) #> usethis 2.1.6 2022-05-25 [1] CRAN (R 4.2.0) #> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.0) #> vctrs 0.4.1 2022-04-13 [1] CRAN (R 4.2.0) #> whisker 0.4 2019-08-28 [1] CRAN (R 4.2.0) #> withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0) #> xfun 0.33 2022-09-12 [1] CRAN (R 4.2.0) #> xtable 1.8-4 2019-04-21 [1] CRAN (R 4.2.0) #> yaml 2.3.5 2022-02-21 [1] CRAN (R 4.2.0) #> yesno 0.1.2 2020-07-10 [1] CRAN (R 4.2.0) #> #> [1] /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library #> #> P ── Loaded and on-disk path mismatch. #> #> ─ Python configuration ─────────────────────────────────────────────────────── #> python: /Users/nick/Library/r-miniconda-arm64/envs/greta-env-tf2/bin/python #> libpython: /Users/nick/Library/r-miniconda-arm64/envs/greta-env-tf2/lib/libpython3.8.dylib #> pythonhome: /Users/nick/Library/r-miniconda-arm64/envs/greta-env-tf2:/Users/nick/Library/r-miniconda-arm64/envs/greta-env-tf2 #> version: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:05:16) [Clang 12.0.1 ] #> numpy: /Users/nick/Library/r-miniconda-arm64/envs/greta-env-tf2/lib/python3.8/site-packages/numpy #> numpy_version: 1.22.4 #> tensorflow: /Users/nick/Library/r-miniconda-arm64/envs/greta-env-tf2/lib/python3.8/site-packages/tensorflow #> #> NOTE: Python version was forced by use_python function #> #> ────────────────────────────────────────────────────────────────────────────── ```Most likely some issue with how the distributions are specified
This will be resolved in the large PR, #534