There may be cases where the observed test statistic cannot be calculated (i.e. it returns NA) because the model is so poor (for example, fitting `lm` to the abs(contrasts) ~ node height relationship chokes on some datasets).

Currently, this results in a p-value that goes towards 2 as more simulations are added.
MWE:

```r
library(arbutus)
library(diversitree)

t <- tree.bd(pars=c(1,0), max.taxa=100)
d <- sim.character(t, 1)

## build unit tree
uu <- make_unit_tree(t, data=d)

## define a simple test statistic (doesn't matter what it is)
foo <- function(x) mean(x$pics[,"contrasts"])

## calculate the test statistic on the observed data
stat_obs <- calculate_pic_stat(uu, stats=list(mean=foo))

## simulate new datasets
sims <- simulate_char_unit(uu, nsim=1000)

## calculate the test statistic on the simulated data
stat_sim <- calculate_pic_stat(sims, stats=list(mean=foo))

## compare the two
compare_pic_stat(stat_obs, stat_sim)
## everything works fine

## make the value of the observed test stat equal to NA
stat_obs[1] <- NA

## now compare
compare_pic_stat(stat_obs, stat_sim)
```
If the test statistic cannot even be calculated on the observed data, the data are clearly outside the bounds of what we would expect if our model were adequate. I.e., the p-value should automatically be 0 in this case.

What is the safest way to handle this case in `compare_pic_stat`?
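One defensive option, sketched below as a standalone helper: guard the p-value calculation with an explicit NA check on the observed statistic. Note this is a hypothetical function (`pval_safe`), not part of arbutus, and the two-tailed formula shown is just one common Monte Carlo convention; the exact computation inside `compare_pic_stat` may differ. The `p = 0` convention for an NA observed statistic is this issue's proposal.

```r
## Hypothetical NA-safe p-value helper.
## obs: a single observed test statistic; sim: a vector of the same
## statistic calculated on simulated datasets.
pval_safe <- function(obs, sim) {
  ## Proposed convention: an observed statistic that cannot even be
  ## computed is treated as maximal evidence of inadequacy, so p = 0.
  if (is.na(obs))
    return(0)
  ## Otherwise, a standard two-tailed Monte Carlo p-value.
  2 * min(mean(sim >= obs), mean(sim <= obs))
}

set.seed(1)
sim <- rnorm(1000)
pval_safe(NA, sim)  ## 0, rather than drifting towards 2
pval_safe(0, sim)   ## close to 1 for a well-centred statistic
```

This keeps the NA handling in one place, so the rest of the comparison code never has to reason about missing observed values.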