DistanceDevelopment / Distance

Simple distance sampling analysis
GNU General Public License v3.0

`summarize_ds_models()` seems to do nothing as soon as one of the considered models has numerical problems #148

Open wlangera opened 1 year ago

wlangera commented 1 year ago

I have fitted multiple distance models in for loops, iterating over different formulas and key functions. For a few models I get the following message:

Some variance-covariance matrix elements were NA, possible numerical problems; only estimating detection function.

I was not able to reproduce this message with a small reproducible example.

When I pass all fitted models to summarize_ds_models(), R runs but nothing is output: no summary table of the fitted detection functions, and not even a warning or a message. As soon as I remove the problematic models, the function works as expected.

With a lot of models fitted (for example within a for loop), it is not easy to identify which exact models have numerical problems and which can be used with summarize_ds_models().

Would it be possible to make sure that summarize_ds_models() still outputs a table when only a few of the models have numerical problems? These could be skipped, with a warning stating which models could not be summarised (AIC, C-vM p-value, ...).

I believe this would help automate model selection, because at the moment I have to identify and exclude every problematic model manually.

erex commented 1 year ago

@wlangera Thanks for reporting your experience with summarise_ds_models when provided with some models with convergence problems. It is probably not a bad thing that you need to scrutinise your models before handing them off to summarise_ds_models. This reduces the probability that inference will be made from a rubbish model, e.g. a model where the estimated parameters are not maximum likelihood estimates.

I'm guessing that if you pass the set of model objects to AIC() you will perhaps get a more informative error message. Could you check? Or might the same problem bothering summarise_ds_models also cause AIC() to fail? Either way, we might learn something. If AIC produces something intelligible, perhaps that could be used as an early screening tool before calling summarise_ds_models.

wlangera commented 1 year ago

@erex Thanks for the fast reply. The AIC.dsmodel() function seems to work just fine, without any errors or warnings. I tried several things and noticed that nothing happens when I use summary() on my model object. summarise_ds_models() uses summary() to extract the average detection probability $\hat{P}_a$ and its standard error. That is why summarise_ds_models() also outputs nothing.

I feel that there should at least be a (warning) message somewhere. Right now nothing happens when you use summary() or summarise_ds_models() on problematic models, which is not very transparent.

wlangera commented 1 year ago

As far as I understand, the problem with some of my models is that the Hessian could not be calculated. Therefore summary() cannot be used on them, and in turn neither can summarise_ds_models().
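To make the affected models easier to find when many are fitted in a loop, something along these lines might help. This is only a rough sketch: it assumes a problematic fit can be recognised by a missing Hessian or NA entries in model$ddf$hessian, the helper names has_bad_hessian() and split_by_hessian() are hypothetical (not part of Distance), and the model objects below are mock lists standing in for fitted ds() models.

```r
# Hypothetical check. Assumption: a fit with numerical problems has a
# missing Hessian or NA entries in model$ddf$hessian.
has_bad_hessian <- function(model) {
  h <- model$ddf$hessian
  is.null(h) || anyNA(h)
}

# Hypothetical helper: split a named list of models into usable ones and
# the names of skipped ones, warning about the latter.
split_by_hessian <- function(models) {
  bad <- vapply(models, has_bad_hessian, logical(1))
  if (any(bad)) {
    warning("Skipping models with numerical problems: ",
            paste(names(models)[bad], collapse = ", "),
            call. = FALSE)
  }
  list(ok = models[!bad], skipped = names(models)[bad])
}

# Mock objects for illustration only (not real ds() fits)
models <- list(
  hn = list(ddf = list(hessian = matrix(c(2, 0.1, 0.1, 3), nrow = 2))),
  hr = list(ddf = list(hessian = matrix(c(1, NA, NA, 1), nrow = 2)))
)

res <- suppressWarnings(split_by_hessian(models))
names(res$ok)  # models that remain for the summary table
res$skipped    # models that were flagged
```

Without suppressWarnings(), the call reports the names of the skipped models in a warning, which is the kind of message I was missing.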

I created a workaround for myself in the summarise_ds_models() function:

```r
# ...

  # this function extracts the model data for a single model (row)
  extract_model_data <- function(model){
    if (any(is.na(model$ddf$hessian))) {
      average.p <- NA
      average.p.se <- NA
    } else {
      summ <- summary(model)
      average.p <- summ$ds$average.p
      average.p.se <- summ$ds$average.p.se
    }

    # ...

# ...
```

I understand this is not very elegant and that it perhaps should not be incorporated in the package. Either way, together with my adjustments provided in #149, these changes make it possible to automatically run multiple models in a for loop and perform subsequent model selection even when some models had numerical problems. The problematic models remain detectable because their average.p and average.p.se are NA. I guess rubbish models can further be identified through data exploration and by checking model fit (e.g. with the plot() function).
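For illustration, the flagged rows could then be separated from the rest before ranking. This is a sketch under assumptions: the data frame below is made up (invented model names and values, not output from my analysis), and its column names only roughly mimic a summary table with the NA placeholders from the workaround.

```r
# Made-up table for illustration only; values are not real results
tab <- data.frame(
  model        = c("hn", "hr", "hn_cov"),
  average.p    = c(0.42, NA, 0.39),
  average.p.se = c(0.05, NA, 0.04),
  AIC          = c(310.2, 305.0, 308.7),
  stringsAsFactors = FALSE
)

# Models whose summary could not be computed (NA placeholders)
flagged <- tab$model[is.na(tab$average.p)]

# Rank only the remaining models by AIC difference
usable <- tab[!is.na(tab$average.p), ]
usable$delta_AIC <- usable$AIC - min(usable$AIC)
usable <- usable[order(usable$delta_AIC), ]

flagged
usable
```

Note that the flagged model is deliberately excluded from the ranking even though an AIC value is available for it, since its parameter estimates may not be maximum likelihood estimates.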