JenniNiku / gllvm

Generalized Linear Latent Variable Models
https://jenniniku.github.io/gllvm/

General updates, and for ordination with predictors #62

Closed BertvanderVeen closed 2 years ago

BertvanderVeen commented 2 years ago

I will add updates to this branch soon: new identifiability constraints for the fixed-effects slopes and some other changes. As an aside: currently, the prediction intervals for the site scores when the slopes are treated as random effects are constructed in ordiplot().

This is a bit ad hoc and messy, and should in the future be moved to CMSEPf(.) and sdrandom(.) for cleaner and more consistent code. The current way of coding makes things difficult and messy when num.RR or num.lv.c is combined with num.lv in a single model (even though I am not sure why someone would want to do that).
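A minimal sketch of drawing those prediction regions, using the spider data from mvabund; the `predict.region` argument name and the `randomB = "LV"` value are my assumptions of the interface, not confirmed in this thread:

```r
library(gllvm)
data("spider", package = "mvabund")
y <- spider$abund
X <- scale(spider$x)

# constrained ordination with the slopes as random effects
fit <- gllvm(y, X = X, num.RR = 2, randomB = "LV", family = "poisson")

# prediction regions for the site scores are currently constructed here
ordiplot(fit, predict.region = TRUE)
```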

BertvanderVeen commented 2 years ago

-removed comment-

BertvanderVeen commented 2 years ago

I have now removed the heuristic penalty that was used to induce orthogonality for ordination with predictors. It is replaced by alternative optimization routines when num.RR>0 and when num.lv.c>0. Some of these new optimization routines could also be used for unconstrained optimization in other parts of the package, but I have only explored that to a limited extent. Generally, the BFGS algorithm in nloptr seems fussy with respect to constraints (it does not seem to allow Inf as bounds for the parameters), so in my opinion base optim with BFGS is preferable, though there are other unconstrained optimization routines in nloptr that might work better (which I have not explored).
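For concreteness, a standalone sketch (unrelated to gllvm internals) contrasting how the two BFGS variants are called; the Rosenbrock function is just a stand-in objective:

```r
library(nloptr)

# objective and gradient of the Rosenbrock function
rosenbrock <- function(x) {
  list(objective = 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2,
       gradient  = c(-400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
                     200 * (x[2] - x[1]^2)))
}

# base R BFGS: unbounded by construction, no bounds arguments needed
fit_optim <- optim(c(-1.2, 1),
                   fn = function(x) rosenbrock(x)$objective,
                   gr = function(x) rosenbrock(x)$gradient,
                   method = "BFGS")

# nloptr's L-BFGS: lb/ub bounds are optional, but per the observation
# above they cannot be set to Inf, so they are simply omitted here
fit_nloptr <- nloptr(x0 = c(-1.2, 1), eval_f = rosenbrock,
                     opts = list(algorithm = "NLOPT_LD_LBFGS",
                                 xtol_rel = 1e-8, maxeval = 1000))
```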

Options for the optimization in ordination with predictors are:

1. `nloptr(agl)` (default): constrained optimization with equality constraints via an augmented Lagrangian method from the nloptr library. It requires a local solver, which can be specified through optim.method and defaults to NLOPT_LD_TNEWTON_PRECON; a few alternatives are supported.
2. `nloptr(sqp)`: a sequential quadratic programming algorithm, also from the nloptr library.
3. `alabama`: an alternative augmented Lagrangian algorithm where base R's optim or nlminb serve as local solvers (also specified through optim.method).

These alternatives matter because all three algorithms have pros and cons and can be a bit fussy about convergence. Generally, 1) seems reasonably robust (unless a model is overparameterized or fits very poorly), but 2) usually provides better convergence of the constraints. 3) seems more robust to issues with starting values and will run regardless of their quality, whereas 1) and 2) will simply not start the optimization at all. All three algorithms seem to do well with starting.val = "zero".
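A hypothetical call-level sketch of these three options on the spider data from mvabund (argument names and values are taken from the text above; exact defaults may differ):

```r
library(gllvm)
data("spider", package = "mvabund")
y <- spider$abund
X <- scale(spider$x)

# 1) default: augmented Lagrangian from nloptr, with a local solver
fit1 <- gllvm(y, X = X, num.RR = 2, family = "poisson",
              optimizer = "nloptr(agl)",
              optim.method = "NLOPT_LD_TNEWTON_PRECON",
              starting.val = "zero")

# 2) sequential quadratic programming, often better constraint convergence
fit2 <- gllvm(y, X = X, num.RR = 2, family = "poisson",
              optimizer = "nloptr(sqp)", starting.val = "zero")

# 3) augmented Lagrangian from alabama, with a base R local solver
fit3 <- gllvm(y, X = X, num.RR = 2, family = "poisson",
              optimizer = "alabama", optim.method = "nlminb",
              starting.val = "zero")
```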

Auxiliary functions for the implementation of the optimizers are stored in gllvm.auxiliary.R.

When the canonical coefficients are treated as random effects with the randomB argument, optim or nlminb are used as usual. I have built checks into gllvm(.) to warn the user when the canonical coefficients are not 'sufficiently' orthogonal, where 'sufficiently' is arbitrarily determined as a tolerance of 1e-2 for the off-diagonals of B'B (i.e., the constraints). If this happens, users should try 1) alternative starting values, 2) increasing the convergence tolerances and maximum number of iterations, or 3) changing the optimizer.
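A sketch of the kind of check described (check_orthogonality is a hypothetical helper for illustration, not the actual gllvm internal):

```r
# warn when the off-diagonals of B'B exceed the tolerance
check_orthogonality <- function(B, tol = 1e-2) {
  BtB <- crossprod(B)              # B'B, d x d
  offdiag <- BtB[upper.tri(BtB)]   # the constrained entries
  if (any(abs(offdiag) > tol))
    warning("Canonical coefficients not sufficiently orthogonal; try other ",
            "starting values, looser tolerances/more iterations, ",
            "or another optimizer.")
  invisible(max(abs(offdiag)))
}
```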

In total, orthogonality of the canonical coefficients adds d(d-1)/2 constraints (where d is the number of latent variables and d < K < m, with K the number of predictors and m the number of species), so that the number of constraints should now be the same as in e.g. rrvglm() in the VGAM R package. However, there Thomas Yee has implemented corner constraints on the species loadings (alternatively an SVD-based parameterization), which was difficult to do here, hence the alternative formulation (details will be in the publication). At d = p, this generalized reduced-rank regression has the same number of parameters as a multivariate GLM, but the matrix of predictor slopes is instead estimated via its QR decomposition: the canonical coefficients are B = Q diag(diag(R)) and the species loadings are gamma = R diag(diag(R))^{-1}, where diag(R) contains the column-wise (Frobenius) norms of the canonical coefficients.
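A quick numerical check of this parameterization (plain linear algebra, independent of gllvm; the loadings appear below as solve(D) %*% R, which matches the gamma above up to the transposition convention):

```r
set.seed(1)
K <- 4; m <- 10
C <- matrix(rnorm(K * m), K, m)  # full-rank matrix of predictor slopes (d = p)

qrC <- qr(C)
Q <- qr.Q(qrC)      # K x K with orthonormal columns
R <- qr.R(qrC)      # K x m upper triangular
D <- diag(diag(R))  # diagonal matrix holding diag(R)

B     <- Q %*% D         # canonical coefficients
Gamma <- solve(D) %*% R  # species loadings, unit diagonal

max(abs(B %*% Gamma - C))                    # ~0: recovers the slope matrix
max(abs(crossprod(B) - D^2))                 # ~0: B'B is diagonal (orthogonality)
all.equal(sqrt(colSums(B^2)), abs(diag(R)))  # column norms of B are |diag(R)|
```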

With d = p and using num.lv.c, the model is a multivariate GLM with p constrained and p unconstrained latent variables. Though num.RR and num.lv.c can be mixed, I caution against doing so, except for scenarios where num.RR = p (a warning is built in to inform the user that combining num.lv with num.RR or num.lv.c might not always be a smart choice).
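A hypothetical sketch of the full-rank case on the spider data (that num.lv = 0 yields a plain multivariate GLM here is my assumption of the interface):

```r
library(gllvm)
data("spider", package = "mvabund")
y <- spider$abund
X <- scale(spider$x)
p <- ncol(X)

# full-rank reduced-rank regression: slopes in QR-based parameterization
fit_rr  <- gllvm(y, X = X, num.RR = p, family = "poisson")

# multivariate GLM: the same number of slope parameters, estimated directly
fit_glm <- gllvm(y, X = X, num.lv = 0, family = "poisson")
```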

I would not rule out that these implementations will need some bug fixes in the near future.

JenniNiku commented 2 years ago

Hi, I already merged this, but could you make a short description/list of the new features and main bug fixes you made in this update, to be added to the NEWS file?

JenniNiku commented 2 years ago

So, as in the NEWS file, please put them under version 1.3.2, like this:

```
Version 1.3.2

New Features

  *

Bug Fixes

  *
```

BertvanderVeen commented 2 years ago

Of course, I will take a moment right now to work it out.