mdscheuerell / gretaDFA

Using the greta package for R to fit Dynamic Factor Models
https://mdscheuerell.github.io/gretaDFA/

problem with `model()` and `mcmc()` #6

Open mdscheuerell opened 6 years ago

mdscheuerell commented 6 years ago

Hey @goldingn, I finally found some time to return to the DFA model in greta, and I ran into a bunch of errors associated with the code changes in the new version. I think I've got most of them ironed out now, but I can't seem to figure out the following problem with model().

Specifically, this line causes the following error (with traceback):

Error in parse(text = self$operation) : cannot coerce type 'closure' to vector of type 'character'
18. parse(text = self$operation)
17. eval(parse(text = self$operation), envir = self$tf_function_env)
16. self$tf(dag)
15. x$define_tf(dag)
14. FUN(X[[i]], ...)
13. lapply(self$children[which(!children_defined)], function(x) x$define_tf(dag))
12. x$define_tf(self)
11. FUN(X[[i]], ...)
10. lapply(target_nodes, function(x) x$define_tf(self))
9. force(expr)
8. tryCatchList(expr, classes, parentenv, handlers)
7. tryCatch(force(expr), finally = { data$`__exit__`(NULL, NULL, NULL) })
6. with.python.builtin.object(self$tf_graph$as_default(), expr)
5. with(self$tf_graph$as_default(), expr)
4. self$on_graph(lapply(target_nodes, function(x) x$define_tf(self)))
3. self$define_tf_body()
2. dag$define_tf()
1. model(xx_est, ZZ_est, RR_est, sigma_est)

Simplifying the call to

mod_fit <- model(sigma_est)

seems to work (i.e., it doesn't return an error), but then the following call to mcmc() causes the following error (with traceback):

Error in py_get_attr_impl(x, name, silent) : AttributeError: 'module' object has no attribute 'distributions'
29. stop(structure(list(message = "AttributeError: 'module' object has no attribute 'distributions'", call = py_get_attr_impl(x, name, silent), cppstack = structure(list( file = "", line = -1L, stack = c("1 reticulate.so 0x0000000108b0af9b _ZN4Rcpp9exceptionC2EPKcb + 219", "2 reticulate.so 0x0000000108b11a35 _ZN4Rcpp4stopERKNSt3__112basic_stringIcNS0_11char_traitsIcEENS0_9allocatorIcEEEE + 53", ...
28. py_get_attr_impl(x, name, silent)
27. py_get_attr(x, name)
26. py_get_attr_or_item(x, name, TRUE)
25. `$.python.builtin.object`(x, name)
24. `$.python.builtin.module`(tfp, distributions)
23. tfp$distributions
22. self$tf_distrib(parameters, dag)
21. self$tf_log_density_function(tf_target, tf_parameters, dag)
20. (function (tf_target) { tf_parameters <- self$tf_fetch_parameters(dag) target_params <- match_batches(c(list(tf_target), tf_parameters)) ...
19. (function (what, args, quote = FALSE, envir = parent.frame()) { if (!is.list(args)) stop("second argument must be a list") ...
18. mapply(do.call, density_functions, target_lists, MoreArgs = list(envir = tfe), SIMPLIFY = FALSE)
17. dag$define_joint_density()
16. force(expr)
15. tryCatchList(expr, classes, parentenv, handlers)
14. tryCatch(force(expr), finally = { data$`__exit__`(NULL, NULL, NULL) })
13. with.python.builtin.object(self$tf_graph$as_default(), expr)
12. with(self$tf_graph$as_default(), expr)
11. dag$on_graph(dag$define_joint_density())
10. self$valid_parameters(inits)
9. FUN(X[[i]], ...)
8. lapply(init_list, self$check_initial_values)
7. self$set_initial_values(initial_values)
6. super$initialize(initial_values = initial_values, model = model, parameters = parameters, seed = seed)
5. .subset2(public_bind_env, "initialize")(...)
4. sampler$class$new(initial_values, model, sampler$parameters, seed = seed)
3. FUN(X[[i]], ...)
2. lapply(initial_values_split, build_sampler, sampler, model)
1. mcmc(mod_fit, sampler = hmc(Lmin = 5, Lmax = 10, epsilon = 0.1, diag_sd = 1), warmup = 2000, n_samples = 5000, thin = 10, chains = 1, verbose = FALSE)

Ideas?

goldingn commented 6 years ago

Right, it's actually due to this line: you now need to pass in a string for the TensorFlow function (for annoying reasons to do with parallel processing and pointers to Python modules). So you can just change it to "tf$cumsum".
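
The first error can be reproduced in base R without greta, since `parse(text = ...)` needs a character string and cannot work with a function object. A minimal sketch, where `op_as_string` and `op_as_function` are hypothetical stand-ins for the stored operation (not greta's actual internals):

```r
# Hypothetical stand-ins for the stored operation; base R only, no greta needed.
op_as_string <- "cumsum"
op_as_function <- function(x) cumsum(x)

# A string parses to an expression that can be evaluated into a function.
parsed <- parse(text = op_as_string)
fun <- eval(parsed[[1]])
fun(1:4)

# A function (closure) cannot be coerced to character, so parse() errors.
result <- tryCatch(parse(text = op_as_function), error = function(e) e)
inherits(result, "error")
```

Storing the operation as a string also means no live reference to a Python module has to travel with the object, which fits the parallel-processing reason given above.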

Though actually you don't need to define that operation anymore, since the dev version now has an apply() function for greta arrays, and you can use "cumsum" as the function to apply. It'll be something like: t(apply(x, 2, "cumsum"))
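
For reference, this is how the same pattern behaves on an ordinary base R matrix. It is only a sketch of the shapes involved; whether greta's apply() on greta arrays matches base R in every detail is an assumption here.

```r
x <- matrix(1:6, nrow = 2)  # rows: (1, 3, 5) and (2, 4, 6)

# Margin 2: cumulative sums down each column, same shape as x (2 x 3)
col_cumsum <- apply(x, 2, "cumsum")

# The pattern suggested above: transpose of the column-wise result (3 x 2)
t_col_cumsum <- t(apply(x, 2, "cumsum"))

# Margin 1: apply() returns each row's result as a column (3 x 2),
# so t() restores the original orientation (2 x 3)
row_cumsum <- t(apply(x, 1, "cumsum"))
```

Which margin is right depends on whether time runs along the rows or the columns of the array, so it is worth checking dim() on the result.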

mdscheuerell commented 6 years ago

@goldingn: Ah, OK. That line was originally the character string "tf$cumsum", but then I had to change it to get things to work with an intermediate version of greta.

Anyway, I'll just simplify things to use apply().

goldingn commented 6 years ago

Yeah, that's right. I converted it back to passing the function for a while before I realised that precluded using the future package for parallelism!

mdscheuerell commented 6 years ago

@goldingn: OK, making the above change to apply() works for this line:

mod_fit <- model(xx_est, ZZ_est, RR_est, sigma_est)

but the next call to mcmc() still fails with the same error

Error in py_get_attr_impl(x, name, silent)...

mentioned above.

goldingn commented 6 years ago

Looks like a TensorFlow API change 🙄

What's the output of:

tensorflow::tf_version()

and

pkg <- reticulate::import("pkg_resources")
pkg$get_distribution("tensorflow_probability")$version

?

mdscheuerell commented 6 years ago

@goldingn

> tensorflow::tf_version()
[1] ‘1.8’

and

> pkg <- reticulate::import("pkg_resources")
> pkg$get_distribution("tensorflow_probability")$version
[1] "0.0.1"

So perhaps I need to update tensorflow? If so, I may have to wait until I can do this from my laptop.

goldingn commented 6 years ago

Yeah, it's worth updating TensorFlow, but you definitely need to update TensorFlow Probability. I think that's where the problem lies.

goldingn commented 6 years ago

I'll try to add explicit checking of the TFP version.
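
One way such a check could look, as a rough base R sketch rather than what greta actually does. The helper name `check_min_version` is hypothetical, and the minimum version "0.3.0" is a placeholder, not a documented requirement:

```r
# Hypothetical helper: stop early if an installed version is too old.
check_min_version <- function(found, required, what) {
  if (numeric_version(found) < numeric_version(required)) {
    stop(sprintf("%s version %s found, but >= %s is required",
                 what, found, required), call. = FALSE)
  }
  invisible(TRUE)
}

# "0.0.1" is the TFP version reported earlier in this thread;
# "0.3.0" is a placeholder minimum (an assumption for illustration).
outcome <- tryCatch(
  check_min_version("0.0.1", "0.3.0", "tensorflow_probability"),
  error = function(e) conditionMessage(e)
)
outcome
```

In practice the `found` value would come from something like the `pkg_resources` query shown earlier in the thread, so that a clear message appears before mcmc() reaches the missing `tfp$distributions` attribute.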