dwarton / mvabund

mvabund updates

unexpected output during re-sampling with method="pit-trap" and offset #68

Closed dleopold closed 5 years ago

dleopold commented 5 years ago

Hello - I am getting many lines of output printed to the console when running anova.manyglm on a model fit as follows:

mv.full <- manyglm(mvDat ~ Treatment*Genotype, offset=effort, family="negative_binomial")

where effort = log(rowSums(mvDat)). When I run a test with:

anova(mv.full, nBoot=199, cor.type="R", test="wald")

the expected test results are returned, but only after printing many lines to the console that look like this:

l=nan, theta=1000000.0000, yi=44.0000, mu=-nan
l=nan, theta=1000000.0000, yi=24.0000, mu=-nan
l=nan, theta=1000000.0000, yi=43.0000, mu=-nan
l=nan, theta=1000000.0000, yi=41.0000, mu=-nan
l=nan, theta=1000000.0000, yi=45.0000, mu=-nan
l=nan, theta=1000000.0000, yi=8.0000, mu=-nan

There is one line for every sample*nBoot, and occasionally yi=inf. This output does not appear if I remove the offset or use a different resampling technique. Plotting the model diagnostics does not reveal any obvious problems, and I get nearly identical statistics when setting resamp = "perm.resid". However, this behavior does not occur when I add a similar offset to any of the example data provided with the package, so I am not sure how to provide a simple reproducible example.

I am currently using the newest mvabund, v4.1.1. Could this be some type of minor bug? Or perhaps my model is set up incorrectly in some way that is not obvious to me?

dwarton commented 5 years ago

Sorry for the slowness on this. Yes, I have seen this before: there is a bug in the error handling here. What is happening is that the model is not converging, which has led to predicted values that are undefined. The times I have seen this before have been when the connection between model and data was pretty bad, e.g. the wrong family was used, densities were analysed as if they were counts, or there were undefined data values (are any of your row sums zero? Your effort variable would then have some infinite values in it). I can't say more without seeing the data, but I am happy to have a look if you e-mail it through with minimal code reproducing the error.
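
The zero-row-sum question above can be checked with a few lines of base R. This is only a rough sketch: the small `mvDat` matrix is made up to stand in for the real abundance data, and `check_offset` is a hypothetical helper, not part of mvabund.

```r
# Made-up stand-in for the abundance matrix (rows = samples, columns = species).
mvDat <- matrix(c(3, 0, 5,
                  0, 0, 0,   # an empty sample: its row sum is zero
                  2, 7, 1),
                nrow = 3, byrow = TRUE,
                dimnames = list(NULL, c("sp1", "sp2", "sp3")))

# Hypothetical helper: report which samples give a non-finite offset.
check_offset <- function(Y) {
  effort <- log(rowSums(Y))   # log(0) is -Inf; a row containing NA gives NA
  which(!is.finite(effort))   # indices of samples with an unusable offset
}

bad_rows <- check_offset(mvDat)
print(bad_rows)
```

Any sample reported here would need to be dropped or investigated before its log row sum is passed as `offset` to `manyglm`.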

dleopold commented 5 years ago

David - Thank you for following up. I tracked the issue down to a single observation of one species that was obviously erroneous (a few orders of magnitude too big). After dropping the problematic sample the model appears to converge and no errors are thrown. It would be helpful to have a more informative error message in this situation, but I will mark the issue as closed since this was clearly a data quality control issue on my part.
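
For anyone hitting the same output, a quick screen for this kind of erroneous observation might look something like the following base R sketch. The data are invented, `flag_extremes` is a hypothetical helper (not an mvabund function), and the cutoff of 100 times the species median is an arbitrary assumption that would need tuning for real data.

```r
# Made-up counts; the 44000 in sp2 mimics a data-entry error.
mvDat <- matrix(c(44, 24, 43, 41, 45, 8,
                  12, 9, 44000, 15, 11, 10),
                ncol = 2,
                dimnames = list(NULL, c("sp1", "sp2")))

# Hypothetical helper: flag cells far above the typical count for their species.
flag_extremes <- function(Y, ratio = 100) {
  # Median of the positive counts in each species column.
  med <- apply(Y, 2, function(y) median(y[y > 0], na.rm = TRUE))
  # Express every cell relative to its column's median.
  rel <- sweep(Y, 2, med, "/")
  # Row/column positions of cells more than `ratio` times the median.
  which(rel > ratio, arr.ind = TRUE)
}

suspects <- flag_extremes(mvDat)
print(suspects)
```

A flagged cell is only a candidate for checking against the raw records, not proof of an error; genuinely overdispersed counts can also exceed a crude threshold like this.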