greta-dev / greta

simple and scalable statistical modelling in R
https://greta-stats.org

Some optimisers not initialized correctly #244

Open stephensrmmartin opened 6 years ago

stephensrmmartin commented 6 years ago

Using either gradient_descent or momentum (and possibly others; these are the two I noticed failing) causes an error:

Error in while (self$it < self$max_iterations & self$diff > self$tolerance) { : 
  missing value where TRUE/FALSE needed

This seems to be due to self$diff being NaN, which R counts as missing. Manually altering self$diff to be non-missing appears to correct the problem. This implies that self$diff is not initialized correctly.

However, when using momentum, self$diff is still set to NaN at some point during the loop, which causes the loop to fail.
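
For reference, the failure mode described above can be reproduced in base R without greta. This is a minimal sketch, not greta's actual optimiser code: `it`, `max_iterations`, `tolerance` and `diff` are plain variables named after the fields in the error message, and the values are made up.

```r
# Stand-ins for the fields named in the error message (not greta's real object)
it <- 0
max_iterations <- 10
tolerance <- 1e-6
diff <- NaN

# A comparison involving NaN yields NA, and TRUE & NA is still NA
cond <- it < max_iterations & diff > tolerance
print(cond)
#> [1] NA

# `while` cannot branch on NA, so the loop errors before its first iteration
result <- tryCatch(
  {
    while (it < max_iterations & diff > tolerance) it <- it + 1
    "loop ran"
  },
  error = function(e) conditionMessage(e)
)
print(result)
```

The same happens if `diff` starts out finite and only becomes NaN part-way through the loop, which would match the behaviour reported for momentum.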

goldingn commented 6 years ago

Thanks for letting me know!

These all pass their tests though, so I'm not sure how to recreate the issue. Could you please post a reproducible example here so I know where to start?

stephensrmmartin commented 6 years ago

library(lavaan)
library(greta)
data(package='psych','bfi')
ds.a <- bfi[,1:5]
ds.a <- ds.a[complete.cases(ds.a),]
ds.a$A1 <- 7 - ds.a$A1
#ds.a <- ds.a[1:100,]
ds.a <- scale(ds.a)
ds.g <- as_data(ds.a)

N <- nrow(ds.a)
J <- ncol(ds.a)

# Latent CFA
theta <- normal(0,1,c(N,1))
nu <- normal(0,2,c(J,1))
lambda <- normal(0,1,c(J,1),truncation=c(0,Inf))
resid <- normal(0,2,c(J,1),truncation=c(0,Inf))

mu <- ones(N)%*%t(nu) + theta%*%t(lambda)
Sigma <- zeros(J,J)
diag(Sigma) <- resid

distribution(ds.g) <- multivariate_normal(mu,Sigma)
#distribution(ds.a[,1]) <- normal(mu[,1],resid[1])
#distribution(ds.a[,2]) <- normal(mu[,2],resid[2])
#distribution(ds.a[,3]) <- normal(mu[,3],resid[3])
#distribution(ds.a[,4]) <- normal(mu[,4],resid[4])
#distribution(ds.a[,5]) <- normal(mu[,5],resid[5])

gretaMod <- model(lambda,nu,resid,theta)

gretaOptOut <- greta::opt(gretaMod,optimiser=gradient_descent(),max_iterations = 2000)

I was using the above, I believe. My intuition is that some constraint makes TF spit out NaN, which is then saved into the object, but breaks the R loop. Why some optimisers do that, I'm not sure; perhaps some optimisers just don't get into those problematic regions.
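
To illustrate one way a NaN can end up in a convergence difference, here is a toy base R sketch. It is an assumption-laden illustration only: the objective, the learning rate and the variable names are made up, and nothing in this thread establishes that this is what happens inside greta or TensorFlow for the CFA model above.

```r
# Toy objective and gradient (illustrative only, unrelated to the CFA model)
f <- function(x) x^4
grad <- function(x) 4 * x^3

x <- 2
learning_rate <- 1  # deliberately far too large, so the iterates diverge
old_value <- f(x)
diff <- Inf

for (i in 1:50) {
  x <- x - learning_rate * grad(x)
  new_value <- f(x)
  # Once both objective values have overflowed to Inf, Inf - Inf gives NaN
  diff <- abs(new_value - old_value)
  old_value <- new_value
  if (is.nan(diff)) break
}

print(is.nan(diff))
```

An optimiser that takes smaller or adaptive steps may never reach the overflow, which would be consistent with only some optimisers showing the problem.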

goldingn commented 5 years ago

OK great, yes I think you're right about the cause. I'll see if I can catch those on the TensorFlow side.
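
Wherever the fix ends up, a guard of roughly this shape on the R side would stop the loop cleanly instead of erroring. This is a sketch under assumptions: `run_guarded()` and `fake_step()` are hypothetical helpers written for this example and are not part of greta.

```r
# Hypothetical helper, not part of greta: calls `step()` until convergence,
# stopping early with a status flag if the change in the objective is not finite.
run_guarded <- function(step, max_iterations = 100, tolerance = 1e-6) {
  it <- 0
  diff <- Inf
  status <- "max_iterations"
  while (it < max_iterations) {
    diff <- step()
    it <- it + 1
    if (!is.finite(diff)) {  # catches NaN, NA, Inf and -Inf
      status <- "non_finite"
      break
    }
    if (diff <= tolerance) {
      status <- "converged"
      break
    }
  }
  list(iterations = it, diff = diff, status = status)
}

# Usage with a fake step function that returns NaN on its third call
calls <- 0
fake_step <- function() {
  calls <<- calls + 1
  if (calls >= 3) NaN else 1 / calls
}
out <- run_guarded(fake_step)
print(out$status)
```

Checking `is.finite()` before comparing against the tolerance means the loop condition itself never has to evaluate a comparison with NaN.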

njtierney commented 3 years ago

I get this error when running the current version of {greta}:

library(lavaan)
#> This is lavaan 0.6-9
#> lavaan is FREE software! Please report any bugs.
library(greta)
#> 
#> Attaching package: 'greta'
#> The following objects are masked from 'package:stats':
#> 
#>     binomial, cov2cor, poisson
#> The following objects are masked from 'package:base':
#> 
#>     %*%, apply, backsolve, beta, chol2inv, colMeans, colSums, diag,
#>     eigen, forwardsolve, gamma, identity, rowMeans, rowSums, sweep,
#>     tapply
data(package='psych','bfi')
ds.a <- bfi[,1:5]
ds.a <- ds.a[complete.cases(ds.a),]
ds.a$A1 <- 7 - ds.a$A1
#ds.a <- ds.a[1:100,]
ds.a <- scale(ds.a)
ds.g <- as_data(ds.a)
#> ℹ Initialising python and checking dependencies
#> ✓ Initialising python and checking dependencies
#> 

N <- nrow(ds.a)
J <- ncol(ds.a)

# Latent CFA
theta <- normal(0,1,c(N,1))
nu <- normal(0,2,c(J,1))
lambda <- normal(0,1,c(J,1),truncation=c(0,Inf))
resid <- normal(0,2,c(J,1),truncation=c(0,Inf))

mu <- ones(N)%*%t(nu) + theta%*%t(lambda)
Sigma <- zeros(J,J)
diag(Sigma) <- resid

distribution(ds.g) <- multivariate_normal(mu,Sigma)
#distribution(ds.a[,1]) <- normal(mu[,1],resid[1])
#distribution(ds.a[,2]) <- normal(mu[,2],resid[2])
#distribution(ds.a[,3]) <- normal(mu[,3],resid[3])
#distribution(ds.a[,4]) <- normal(mu[,4],resid[4])
#distribution(ds.a[,5]) <- normal(mu[,5],resid[5])

gretaMod <- model(lambda,nu,resid,theta)

gretaOptOut <- greta::opt(gretaMod,optimiser=gradient_descent(),max_iterations = 2000)
#> Error in py_call_impl(callable, dots$args, dots$keywords): InvalidArgumentError: Input matrix is not invertible.
#>   [[node MultivariateNormalTriL_1/log_prob/affine_linear_operator/inverse/LinearOperatorLowerTriangular/solve/LinearOperatorLowerTriangular/solve/MatrixTriangularSolve/MatrixTriangularSolve (defined at /tensorflow_probability/python/bijectors/affine_linear_operator.py:160) ]]
#> 
#> Original stack trace for 'MultivariateNormalTriL_1/log_prob/affine_linear_operator/inverse/LinearOperatorLowerTriangular/solve/LinearOperatorLowerTriangular/solve/MatrixTriangularSolve/MatrixTriangularSolve':
#>   File "/tensorflow_probability/python/distributions/distribution.py", line 866, in log_prob
#>     return self._call_log_prob(value, name, **kwargs)
#>   File "/tensorflow_probability/python/distributions/distribution.py", line 848, in _call_log_prob
#>     return self._log_prob(value, **kwargs)
#>   File "/tensorflow_probability/python/internal/distribution_util.py", line 2094, in _fn
#>     return fn(*args, **kwargs)
#>   File "/tensorflow_probability/python/distributions/mvn_linear_operator.py", line 210, in _log_prob
#>     return super(MultivariateNormalLinearOperator, self)._log_prob(x)
#>   File "/tensorflow_probability/python/distributions/transformed_distribution.py", line 401, in _log_prob
#>     x = self.bijector.inverse(y, **bijector_kwargs)
#>   File "/tensorflow_probability/python/bijectors/bijector.py", line 977, in inverse
#>     return self._call_inverse(y, name, **kwargs)
#>   File "/tensorflow_probability/python/bijectors/bijector.py", line 949, in _call_inverse
#>     mapping = mapping.merge(x=self._inverse(y, **kwargs))
#>   File "/tensorflow_probability/python/bijectors/affine_linear_operator.py", line 160, in _inverse
#>     x = self.scale.solvevec(x, adjoint=self.adjoint)
#>   File "/tensorflow/python/ops/linalg/linear_operator.py", line 866, in solvevec
#>     return self._solvevec(rhs, adjoint=adjoint)
#>   File "/tensorflow/python/ops/linalg/linear_operator.py", line 816, in _solvevec
#>     solution_mat = self.solve(rhs_mat, adjoint=adjoint)
#>   File "/tensorflow/python/ops/linalg/linear_operator.py", line 811, in solve
#>     return self._solve(rhs, adjoint=adjoint, adjoint_arg=adjoint_arg)
#>   File "/tensorflow/python/ops/linalg/linear_operator_lower_triangular.py", line 207, in _solve
#>     self._tril, rhs, lower=True, adjoint=adjoint)
#>   File "/tensorflow/python/ops/linalg/linear_operator_util.py", line 290, in matrix_triangular_solve_with_broadcast
#>     adjoint=adjoint and still_need_to_transpose)
#>   File "/tensorflow/python/ops/gen_linalg_ops.py", line 1878, in matrix_triangular_solve
#>     adjoint=adjoint, name=name)
#>   File "/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
#>     op_def=op_def)
#>   File "/tensorflow/python/util/deprecation.py", line 507, in new_func
#>     return func(*args, **kwargs)
#>   File "/tensorflow/python/framework/ops.py", line 3616, in create_op
#>     op_def=op_def)
#>   File "/tensorflow/python/framework/ops.py", line 2005, in __init__
#>     self._traceback = tf_stack.extract_stack()
#> 
#> 
#> Detailed traceback:
#>   File "/Users/njtierney/Library/r-miniconda/envs/greta-env/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 950, in run
#>     run_metadata_ptr)
#>   File "/Users/njtierney/Library/r-miniconda/envs/greta-env/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1173, in _run
#>     feed_dict_tensor, options, run_metadata)
#>   File "/Users/njtierney/Library/r-miniconda/envs/greta-env/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1350, in _do_run
#>     run_metadata)
#>   File "/Users/njtierney/Library/r-miniconda/envs/greta-env/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1370, in _do_call
#>     raise type(e)(node_def, op, message)

Created on 2021-07-02 by the reprex package (v2.0.0)

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.1.0 (2021-05-18)
#>  os       macOS Big Sur 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  en_AU.UTF-8
#>  ctype    en_AU.UTF-8
#>  tz       Australia/Perth
#>  date     2021-07-02
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source
#>  abind         1.4-5      2016-07-21 [1] CRAN (R 4.1.0)
#>  backports     1.2.1      2020-12-09 [1] CRAN (R 4.1.0)
#>  base64enc     0.1-3      2015-07-28 [1] CRAN (R 4.1.0)
#>  callr         3.7.0      2021-04-20 [1] CRAN (R 4.1.0)
#>  cli           2.5.0.9000 2021-06-14 [1] Github (r-lib/cli@571fea6)
#>  coda          0.19-4     2020-09-30 [1] CRAN (R 4.1.0)
#>  codetools     0.2-18     2020-11-04 [1] CRAN (R 4.1.0)
#>  crayon        1.4.1      2021-02-08 [1] CRAN (R 4.1.0)
#>  digest        0.6.27     2020-10-24 [1] CRAN (R 4.1.0)
#>  ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.1.0)
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 4.1.0)
#>  fansi         0.5.0      2021-05-25 [1] CRAN (R 4.1.0)
#>  fs            1.5.0      2020-07-31 [1] CRAN (R 4.1.0)
#>  future        1.21.0     2020-12-10 [1] CRAN (R 4.1.0)
#>  globals       0.14.0     2020-11-22 [1] CRAN (R 4.1.0)
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 4.1.0)
#>  greta       * 0.3.1.9012 2021-06-05 [1] local
#>  highr         0.9        2021-04-16 [1] CRAN (R 4.1.0)
#>  hms           1.1.0      2021-05-17 [1] CRAN (R 4.1.0)
#>  htmltools     0.5.1.1    2021-01-22 [1] CRAN (R 4.1.0)
#>  jsonlite      1.7.2      2020-12-09 [1] CRAN (R 4.1.0)
#>  knitr         1.33       2021-04-24 [1] CRAN (R 4.1.0)
#>  lattice       0.20-44    2021-05-02 [1] CRAN (R 4.1.0)
#>  lavaan      * 0.6-9      2021-06-27 [1] CRAN (R 4.1.0)
#>  lifecycle     1.0.0      2021-02-15 [1] CRAN (R 4.1.0)
#>  listenv       0.8.0      2019-12-05 [1] CRAN (R 4.1.0)
#>  magrittr      2.0.1      2020-11-17 [1] CRAN (R 4.1.0)
#>  Matrix        1.3-4      2021-06-01 [1] CRAN (R 4.1.0)
#>  mnormt        2.0.2      2020-09-01 [1] CRAN (R 4.1.0)
#>  parallelly    1.25.0     2021-04-30 [1] CRAN (R 4.1.0)
#>  pbivnorm      0.6.0      2015-01-23 [1] CRAN (R 4.1.0)
#>  pillar        1.6.1      2021-05-16 [1] CRAN (R 4.1.0)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.1.0)
#>  png           0.1-7      2013-12-03 [1] CRAN (R 4.1.0)
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 4.1.0)
#>  processx      3.5.2      2021-04-30 [1] CRAN (R 4.1.0)
#>  progress      1.2.2      2019-05-16 [1] CRAN (R 4.1.0)
#>  ps            1.6.0      2021-02-28 [1] CRAN (R 4.1.0)
#>  purrr         0.3.4      2020-04-17 [1] CRAN (R 4.1.0)
#>  R6            2.5.0      2020-10-28 [1] CRAN (R 4.1.0)
#>  Rcpp          1.0.6      2021-01-15 [1] CRAN (R 4.1.0)
#>  reprex        2.0.0      2021-04-02 [1] CRAN (R 4.1.0)
#>  reticulate    1.20       2021-05-03 [1] CRAN (R 4.1.0)
#>  rlang         0.4.11     2021-04-30 [1] CRAN (R 4.1.0)
#>  rmarkdown     2.8        2021-05-07 [1] CRAN (R 4.1.0)
#>  rstudioapi    0.13       2020-11-12 [1] CRAN (R 4.1.0)
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 4.1.0)
#>  stringi       1.6.2      2021-05-17 [1] CRAN (R 4.1.0)
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 4.1.0)
#>  styler        1.4.1      2021-03-30 [1] CRAN (R 4.1.0)
#>  tensorflow    2.4.0      2021-03-23 [1] CRAN (R 4.1.0)
#>  tfruns        1.5.0      2021-02-26 [1] CRAN (R 4.1.0)
#>  tibble        3.1.2      2021-05-16 [1] CRAN (R 4.1.0)
#>  tmvnsim       1.0-2      2016-12-15 [1] CRAN (R 4.1.0)
#>  utf8          1.2.1      2021-03-12 [1] CRAN (R 4.1.0)
#>  vctrs         0.3.8      2021-04-29 [1] CRAN (R 4.1.0)
#>  whisker       0.4        2019-08-28 [1] CRAN (R 4.1.0)
#>  withr         2.4.2      2021-04-18 [1] CRAN (R 4.1.0)
#>  xfun          0.23       2021-05-15 [1] CRAN (R 4.1.0)
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 4.1.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library
```
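
The "Input matrix is not invertible" message is at least consistent with a diagonal entry of Sigma reaching zero, since the model places the zero-truncated `resid` values directly on the diagonal. The base R sketch below only illustrates that mechanism with made-up values. It does not run greta or TensorFlow, and it is an assumption that this is the actual trigger in the reprex above.

```r
# A diagonal covariance with one variance at exactly zero (illustrative values)
Sigma <- diag(c(1, 0, 1))

# A positive-definite diagonal matrix factorises without trouble
ok <- tryCatch(
  {
    chol(diag(c(1, 0.5, 1)))
    TRUE
  },
  error = function(e) FALSE
)

# The matrix with a zero on the diagonal is singular, so inverting it fails
singular <- tryCatch(
  {
    solve(Sigma)
    "inverted"
  },
  error = function(e) "not invertible"
)

print(ok)
print(singular)
```

If this is the cause, bounding the residual terms away from zero (for example with a small positive lower truncation) might be worth trying, though that is untested here.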