hmorlon / PANDA

Phylogenetic ANalyses of DiversificAtion

Error in while (sum(abs(OV)) > epsnormv & end < 1000) { : missing value where TRUE/FALSE needed #49

Open madzafv opened 1 year ago

madzafv commented 1 year ago

Hi, I'm running

library(Rcpp)
library(ape)
library(terra)
library(raster)
library(RPANDA)

phy <- read.tree('.../treeDim.tre')

# fit ClaDS with a proportion of 153/350 as sampling fraction (153 spp. in the tree vs ~350 total species)
setwd('...')
sample_fraction <- 153/350

# Option 2: run by 25k iterations each time
# first run
sampler <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler25k',model_id="ClaDS2",nCPU = 3)

# start second run using the result from first run
sampler2 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler50k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler25k)

# start third run
sampler3 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler75k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler50k)

# next run
sampler4 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler100k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler75k)

# next run
sampler5 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler125k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler100k)

# next run
sampler6 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler150k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler125k)

# next run
sampler7 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler175k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler150k)

# next run
sampler8 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler200k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler175k)

# next run
sampler9 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler225k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler200k)

# next run
sampler10 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler250k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler225k)

# next run
sampler11 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler275k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler250k)

# next run
sampler12 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler300k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler275k)

And got a series of these errors and warnings along the way:

Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In daspk(y, times, func, parms, ...) :
  repeated convergence test failures on a step - inaccurate Jacobian or preconditioner?
2: In daspk(y, times, func, parms, ...) :
  Returning early. Results are accurate, as far as they go
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed

Is this warning something I should worry about? Thank you, -madza
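The chain of calls in the post above can be sketched as a loop. This is only a sketch under stated assumptions: `run_in_chunks` is a hypothetical helper (not part of RPANDA), and it assumes that `fit_ClaDS` returns the sampler object and accepts `mcmcSampler = NULL` for a fresh run. Note that the posted calls store their results as `sampler`, `sampler2`, ... but pass `sampler25k`, `sampler50k`, ... as `mcmcSampler`; the sketch assumes the intent was to pass the previous call's return value.

```r
# Hypothetical helper: run an MCMC fit in fixed-size chunks, feeding the
# result of each chunk into the next call through `mcmcSampler`.
# `fit_fun` stands in for a fitting function such as RPANDA::fit_ClaDS.
run_in_chunks <- function(fit_fun, n_chunks, chunk_iterations, ...) {
  sampler <- NULL  # assumption: NULL means "start a new chain"
  for (i in seq_len(n_chunks)) {
    # Cumulative iteration count in thousands: sampler25k, sampler50k, ...
    file_name <- sprintf("sampler%gk", i * chunk_iterations / 1000)
    sampler <- fit_fun(..., iterations = chunk_iterations,
                       file_name = file_name, mcmcSampler = sampler)
  }
  sampler
}

# Usage with the objects from the post (needs RPANDA, phy and sample_fraction):
# sampler <- run_in_chunks(fit_ClaDS, n_chunks = 12, chunk_iterations = 25000,
#                          phy, sample_fraction, thin = 250,
#                          model_id = "ClaDS2", nCPU = 3)
```

Keeping the previous result in a single variable avoids referring to an object name that was never assigned.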

hmorlon commented 1 year ago

Hello Madza,

The warning is fine, but the error is not. It seems that sum(abs(OV)) cannot be computed, but I can't tell you why without looking into it in more depth. There are probably NAs in OV because some steps of the MCMC cannot be computed.
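As a minimal illustration of why an NA in OV produces this exact error, the snippet below uses made-up values; `OV`, `epsnormv` and `end` only mirror the names in the error message and are not taken from the RPANDA source.

```r
# Made-up stand-ins for the variables named in the error message.
OV <- c(0.2, NA, 0.1)
epsnormv <- 1e-6
end <- 0

# A single NA propagates through abs() and sum(); NA & TRUE is still NA.
cond <- sum(abs(OV)) > epsnormv & end < 1000
print(cond)
print(anyNA(OV))

# `while` needs TRUE or FALSE, so an NA condition raises the error.
msg <- tryCatch(
  {
    while (cond) break
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

Checking `anyNA()` on the intermediate values is one way to find the step at which the missing values first appear.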

I don't remember whether I already suggested that you use the Julia version of ClaDS rather than the R version, the one described in Maliet & Morlon, Syst Bio 2022, "Fast and Accurate Estimation of Species-Specific Diversification Rates Using Data Augmentation". It is faster, has some added functionality, and may not have these issues: https://hmorlon.github.io/PANDA.jl/dev/

I'd suggest you try this instead, and let us know if it does not avoid the issues you encountered with the R version.

Good luck!

Hélène

hmorlon commented 1 year ago

Hi again, Sophia Lambert (former PhD student in the lab) suggests that you try reducing the thinning (the thin argument), unless this takes too much space. She also says that she runs the same kind of short chained runs in the Julia version of ClaDS and does not have any issues.

Hope this helps!

Best,

Hélène
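To make the storage trade-off concrete, here is a rough calculation, assuming `thin` keeps one sample every `thin` iterations; the alternative values 100 and 50 are arbitrary examples, not recommendations from the thread.

```r
# Samples retained per 25,000-iteration run for a few thinning intervals.
iterations <- 25000
thin_values <- c(250, 100, 50)
kept <- iterations %/% thin_values
print(setNames(kept, paste0("thin=", thin_values)))
```

Under that assumption, a smaller `thin` keeps more samples per run, which is why it costs more disk space and memory.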