Question on warning for Newton steps

huiqingyeooo commented 2 years ago

Dear Dr. Pope,

I encountered a warning message when I tried to run radish with my dataset: Warning message: In BoxConstrainedNewton(phi$phi, function(par, gradient, hessian) g(E = E, : maxit reached for Newton steps.

I am testing with one raster layer (continuous leaf area index), and I have scaled and created a parameterised conductance surface. The raster is not very big 191, 302, 57682 (nrow, ncol, ncell), and the pixel size is 200, covering a geographical area of ~700km². I have a large snp dataset of ~35,000 snps (105 individuals), and I calculated genetic dissimilarity matrix reflecting the number of absolute allelic differences in pairwise comparisons. Several individuals are aggregated in the same pixel, but overall the majority of the points are quite spread out.

I have tried with mlpe as the measurement model, but did not encounter any warning messages, and the run finishes in 6 steps. The warning message only showed up when I specified generalized_wishart as the measurement, but I can still view the results using summary(). Would this mean that the model is unable to converge for the wishart measurement, hence I can’t trust the results? Is there a way to increase the number of Newton steps, and is that advisable?

Thank you very much for your time and help!

Best wishes, Huiqing PhD student National University of Singapore

nspope commented 2 years ago

Hi Huiqing, sorry for the slow response. The short answer is, you can change the number of Newton-Raphson steps by doing something like the following:

radish(..., control=radish::NewtonRaphsonControl(maxit=1000))

see ?NewtonRaphsonControl. I'd also set verbose=TRUE in the call to radish so you get a trace of the loglikelihood as the optimizer does its thing.

The longer answer is that if the optimizer is not converging within 100 iterations (the default) for a simple problem, that's suspicious. It could be one of:

An implementation issue with the generalized Wishart loss function -- I haven't tested this thoroughly. This loss function also doesn't use resistance distances per se, so it behaves a little differently from MLPE or simple linear regression.
Numerical issues with the resistance distance calculation (underflow/overflow) -- several people have reported issues with optimization that stem from numerical underflow, although this seems to happen mainly with large rasters. And, optimization should outright fail if this happens.
A likelihood surface that is ill-behaved -- for example that has regions of parameter space where the resistance distances between samples don't change at all (this would manifest as a flat region on the likelihood surface); or where resistance distances change very rapidly (this would manifest as a discontinuity). Both of these can mess up Newton and quasi-Newton optimization methods.

In general, whatever loss function you are using, I think it is wise to use ?radish_grid to calculate log-likelihoods across a grid of parameter values around the maximum likelihood estimates. Then you can visualize to see if the likelihood surface is approximately quadratic around the MLE (in which case optimization should work well) or if it has discontinuities or other nasty features. If you have lots of parameters, then you can calculate likelihood profiles (e.g. vary one parameter while fixing the others at the MLE) rather than evaluating across a full grid. It's also useful to take the resistance distances across parameter values and plot against your genetic distances (so you can visually assess the fit) -- you can get these via ?radish_distance.

Hopefully that's a clear enough explanation that you can make some visualizations/troubleshoot your data? If not, feel free to follow up with more questions.

huiqingyeooo commented 2 years ago

Dear Dr. Pope,

Thank you very much for the detailed response! I am still in the midst of optimising and will follow up if I have more questions.

nspope / radish

Question on warning for Newton steps #3