PLN-team / PLNmodels

A collection of Poisson lognormal models for multivariate count data analysis
https://pln-team.github.io/PLNmodels
GNU General Public License v3.0
54 stars 18 forks source link

matrix is singular or not positive definite #98

Closed mjeanbille closed 1 year ago

mjeanbille commented 1 year ago

Hi,

When running PLNnetwork with the last package version 1.0.1, I get the following error :

Erreur dans (function (data, params, config) : inv_sympd(): matrix is singular or not positive definite

This happens when subsetting the dataset, with or without an offset term, and whatever the offset method. When the dataset is larger and not subsetted, the error doesn't happen when the offset term is off (ie. ~0+covariate), but happen when the offset is added to the model (~0+covariate+offset(log(Offset))).

I've tried with another dataset, for which PLNnetwork gave no error with a previous version of the package (offset on or off), and I had the error again.

Thanks in advance for your time and let me know if you need anything else.

Mathilde

sessionInfo() R version 4.2.1 (2022-06-23) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 20.04.6 LTS

Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale: [1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
[3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
[5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
[7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] stats graphics grDevices utils datasets methods
[7] base

other attached packages: [1] future_1.32.0 PLNmodels_1.0.1 phyloseq_1.37.0

loaded via a namespace (and not attached): [1] colorspace_2.0-3 ggsignif_0.6.3
[3] seqinr_4.2-16 ellipsis_0.3.2
[5] rio_0.5.27 glassoFast_1.0
[7] XVector_0.33.0 dichromat_2.0-0
[9] rstudioapi_0.13 ggpubr_0.4.0
[11] listenv_0.8.0 Deriv_4.1.3
[13] bit64_4.0.5 fansi_0.5.0
[15] codetools_0.2-18 splines_4.2.1
[17] doBy_4.6.11 knitr_1.34
[19] ade4_1.7-17 jsonlite_1.7.2
[21] nloptr_2.0.3 broom_0.7.9
[23] cluster_2.1.4 curry_0.1.1
[25] mapproj_1.2.7 compiler_4.2.1
[27] backports_1.2.1 Matrix_1.5-1
[29] cli_3.6.1 tools_4.2.1
[31] igraph_1.4.2 gtable_0.3.0
[33] glue_1.6.2 GenomeInfoDbData_1.2.6 [35] reshape2_1.4.4 dplyr_1.1.2
[37] maps_3.3.0 phylotools_0.2.4
[39] Rcpp_1.0.8.3 carData_3.0-4
[41] Biobase_2.53.0 cellranger_1.1.0
[43] coro_1.0.3 vctrs_0.6.2
[45] Biostrings_2.61.2 rhdf5filters_1.5.0
[47] multtest_2.49.0 ape_5.6-2
[49] nlme_3.1-159 iterators_1.0.14
[51] xfun_0.26 stringr_1.5.0
[53] ps_1.6.0 globals_0.16.2
[55] openxlsx_4.2.4 lifecycle_1.0.3
[57] rstatix_0.7.0 zlibbioc_1.39.0
[59] MASS_7.3-58 scales_1.1.1
[61] microbenchmark_1.4-7 hms_1.1.0
[63] parallel_4.2.1 biomformat_1.21.0
[65] rhdf5_2.37.3 rematch2_2.1.2
[67] torch_0.10.0 curl_4.3.2
[69] gridExtra_2.3 ggplot2_3.3.5
[71] UpSetR_1.4.0 stringi_1.7.4
[73] paletteer_1.4.0 S4Vectors_0.31.3
[75] corrplot_0.92 foreach_1.5.2
[77] permute_0.9-5 BiocGenerics_0.39.2
[79] zip_2.2.0 GenomeInfoDb_1.29.8
[81] EcolUtils_0.1 pals_1.7
[83] rlang_1.1.1 pkgconfig_2.0.3
[85] bitops_1.0-7 evaluate_0.14
[87] lattice_0.20-45 purrr_1.0.1
[89] Rhdf5lib_1.15.2 processx_3.5.2
[91] bit_4.0.4 tidyselect_1.2.0
[93] parallelly_1.35.0 plyr_1.8.6
[95] magrittr_2.0.1 R6_2.5.1
[97] IRanges_2.27.2 generics_0.1.0
[99] pillar_1.9.0 haven_2.4.3
[101] foreign_0.8-82 mgcv_1.8-40
[103] survival_3.2-13 abind_1.4-5
[105] RCurl_1.98-1.4 future.apply_1.11.0
[107] tibble_3.2.1 crayon_1.5.0
[109] car_3.0-11 utf8_1.2.2
[111] grid_4.2.1 readxl_1.3.1
[113] data.table_1.14.0 callr_3.7.0
[115] vegan_2.5-7 forcats_0.5.1
[117] digest_0.6.29 tidyr_1.3.0
[119] stats4_4.2.1 munsell_0.5.0
[121] usedist_0.4.0

mahendra-mariadassou commented 1 year ago

Hi,

I've tried with another dataset, for which PLNnetwork gave no error with a previous version of the package (offset on or off), and I had the error again.

Just to confirm, the errors occurs on the other dataset both both with and without errors?

This error occurs sometimes when inv_sympd() fails to inverse the precision matrix at some point during the iterative optimization. From experience, it occurs during the initialization step. One possible fix (assuming you have torch installed on your computer) is to bypass the automatic initialization by fitting an initial inception model with the torch backend and passing it as starting point to PLNnetwork().

## Manually fit initial model using torch backend
inception_model <- PLN(Abundance ~ 0 + covariate + offset(log(Offset)),
                       control = PLN_param(trace = 0, backend = "torch"),
                       data = data_PLN) 
## Use the inception as a starting point for PLNnetwork
network_models <- PLNnetwork(Abundance ~ 0 + covariate + offset(log(Offset)),
                             control = PLNnetwork_param(trace = 0, inception = inception_model),
                             data = data_PLN)

There was a small bug that we fixed in PLNnetwork(). For the above chunck to work, you need to either install the dev version

remotes::install_github("pln-team/PLNmodels", ref = "dev")

or wait for the next CRAN release which includes the fix (@jchiquet is currently working on it).

jchiquet commented 1 year ago

On this topic, I also changed the value for initializing the variance of the variational parameters, which caused an error for some tests (1 -> 0.1). This seems to correct the problem on my side. Hopefully it will also correct it on yours !

This is now in the master/main branch, from version 1.0.1-9020, thank you in advance for trying.

mahendra-mariadassou commented 1 year ago

If Julien's fix works, it's definitely easier than mine ! And both are available in version 1.0.1-9020

mjeanbille commented 1 year ago

Julien's fix is working, thanks a lot !