stefpeschel / NetCoMi

Network construction, analysis, and comparison for microbial compositional data
GNU General Public License v3.0

Error using spring as measure #6

Closed FilipeMatteoli closed 3 years ago

FilipeMatteoli commented 3 years ago

This error occurred with my data so I tried using the tutorial data as well and got the same error.

library(NetCoMi)
library(phyloseq)  # provides the soilrep data set

data("soilrep")
# Split the phyloseq object by the "warmed" sample variable
soil_warm <- metagMisc::phyloseq_sep_variable(soilrep, "warmed")
net_seas_p <- netConstruct(soil_warm$yes, soil_warm$no,
                            filtTax = "highestVar",
                            filtTaxPar = list(highestVar = 500),
                            verbose = 3, measure = "spring",
                            zeroMethod = "pseudo",
                            normMethod = "clr")

Infos regarding changed arguments:
Zero handling included in 'spring'.
Normalization ignored for measure 'spring'.
Sparsification included in 'spring'.

Data filtering ...
Intersection of taxa selected.
0 samples removed in data set 1.
0 samples removed in data set 2.
7812 taxa removed in each data set.
229 taxa and 28 samples remaining in data set 1.
229 taxa and 28 samples remaining in data set 2.

Calculate 'spring' associations ...
The input is identified as the covariance matrix.
Conducting Meinshausen & Buhlmann graph estimation (mb)....done
The input is identified as the covariance matrix.
Conducting Meinshausen & Buhlmann graph estimation (mb)....done
The input is identified as the covariance matrix.
Conducting Meinshausen & Buhlmann graph estimation (mb)....done
The input is identified as the covariance matrix.
Conducting Meinshausen & Buhlmann graph estimation (mb)....done
The input is identified as the covariance matrix.
Conducting Meinshausen & Buhlmann graph estimation (mb)....done
Error in mixedCCA::estimateR(data, type = type, method = Rmethod, tol = tol, :
  There are variables in the data that have only zeros. Filter those variables before continuing.
In addition: Warning message:
In Matrix::nearPD(R, corr = TRUE) :
  'nearPD()' did not converge in 100 iterations

Also, I would like to know how to obtain the number of nodes and edges in the network generated by netConstruct, and the number of clusters from the netAnalyze output.

stefpeschel commented 3 years ago

Hello,

please excuse my late answer. I couldn't find a solution for this issue, so I finally asked Grace Yoon, the SPRING package owner, for help.

The issue is caused by taxa with an extremely small proportion of non-zero values. SPRING subsamples the data and computes the rank-based correlations for each subsample. After subsampling (80% by default), some taxa can end up with a total count of zero. Until now, such taxa caused an error in mixedCCA (the package that SPRING calls for rank-based correlation estimation).
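
To illustrate the mechanism, here is a minimal base R sketch with made-up counts. It does not call SPRING or mixedCCA; the matrix, the taxon names, and the fixed choice of subsample rows are all assumptions made for the illustration (SPRING draws its subsamples at random).

```r
# Toy count matrix: 10 samples (rows) x 3 taxa (columns).
# taxon_C is non-zero in only one sample (sample 10).
counts <- cbind(
  taxon_A = c(5, 3, 0, 8, 2, 7, 1, 4, 6, 9),
  taxon_B = c(0, 2, 4, 0, 1, 3, 0, 5, 2, 1),
  taxon_C = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 12)
)

# One subsample containing 80% of the samples that happens to miss sample 10.
# The rows are fixed here (instead of drawn at random) so the result is reproducible.
sub_rows <- 1:8
subsample <- counts[sub_rows, , drop = FALSE]

# Taxa whose counts are all zero within this subsample
zero_taxa <- colnames(subsample)[colSums(subsample) == 0]
print(zero_taxa)
```

Although taxon_C is not all-zero in the full data, it is all-zero in this subsample, which is the situation that triggered the error message above.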

Fortunately, the mixedCCA package has been updated recently so that these rare taxa don't cause an error anymore. So, please update mixedCCA to the latest version in order to avoid the error.

Nevertheless, we would recommend removing extremely rare taxa from the data, because they are not very meaningful.
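
One possible way to do such a filtering step is sketched below in base R. The helper `filter_rare_taxa` and its `min_prev` threshold are hypothetical names introduced only for this example (they are not part of NetCoMi, SPRING, or mixedCCA), and the 20% threshold is an arbitrary assumption, not a recommended value.

```r
# Hypothetical helper (not part of NetCoMi, SPRING, or mixedCCA):
# keep only taxa that are non-zero in at least `min_prev` of the samples.
# `counts` is a samples x taxa count matrix.
filter_rare_taxa <- function(counts, min_prev = 0.2) {
  stopifnot(is.matrix(counts), min_prev >= 0, min_prev <= 1)
  # Prevalence = fraction of samples in which a taxon has a non-zero count
  prevalence <- colMeans(counts > 0)
  counts[, prevalence >= min_prev, drop = FALSE]
}

# Made-up example data: taxon_C occurs in only 1 of 10 samples
counts <- cbind(
  taxon_A = c(5, 3, 0, 8, 2, 7, 1, 4, 6, 9),
  taxon_B = c(0, 2, 4, 0, 1, 3, 0, 5, 2, 1),
  taxon_C = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 12)
)

filtered <- filter_rare_taxa(counts, min_prev = 0.2)
print(colnames(filtered))
```

With real data, the same idea would be applied to the count table before calling netConstruct. netConstruct's own filtTax argument may also cover this case (for example a filter on the number of samples in which a taxon occurs), so it is worth checking the package documentation first.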

Best regards,
Stefanie