PLN-team / PLNmodels

A collection of Poisson lognormal models for multivariate count data analysis
https://pln-team.github.io/PLNmodels
GNU General Public License v3.0
54 stars 18 forks source link

Bug in `PLNmixture()` when `clusters` is not a complete sequence #95

Closed mahendra-mariadassou closed 1 year ago

mahendra-mariadassou commented 1 year ago

PLNmixture() with backward (and potentially forward) smoothing fails when the clusters argument is not a complete sequence starting at 1.

library(PLNmodels)
#> This is packages 'PLNmodels' version 0.11.7-9720
#> Use future::plan(multicore/multisession) to speed up PLNPCA/PLNmixture/stability_selection.
data(trichoptera)
trichoptera <- prepare_data(trichoptera$Abundance, trichoptera$Covariate)
myMixtures <- PLNmixture(Abundance ~ 1 + offset(log(Offset)), clusters = 2:4, data = trichoptera)
#> 
#>  Initialization...
#> 
#>  Adjusting 3 PLN mixture models.
#>  number of cluster = 2   number of cluster = 3   number of cluster = 4 
#> 
#>  Smoothing PLN mixture models.
#>    Going backward +
#> Error in self$models[[k]]: indice hors limites

Created on 2023-02-04 with reprex v2.0.2

The culprit are here and here as k refers both to the number of clusters and the position in the family of models.

mahendra-mariadassou commented 1 year ago

I realized while working on the fix that smoothing implicitly assumes that models in the family are adjacent (e.g. we have all models between 2 clusters and 10 clusters, not only the ones with 2, 5 and 10 clusters). I added a warning and easy fix but it might be useful to smooth with more spread out models. @jchiquet I'll work on a fix in a separate and let you decide whether it's worth including in the package.

jchiquet commented 1 year ago

Could you please an entry in NEWS.md?

thanks!

mahendra-mariadassou commented 1 year ago

Done. And you can have a look at branch mixture_smoothing to see how we can deal with incomplete sequences (e.g. c(1, 3, 5)).

jchiquet commented 1 year ago

This sounds good, and will match what the user has requested. Ok to merge in main/master for me. Do you keep all the model fitted in memory or just use them for the browsing during the slit/merge strategy ?

Thanks!

mahendra-mariadassou commented 1 year ago

I'll merge it. I keep only the requested models in memory, the others are only used during browsing.