zdk123 / SpiecEasi

Sparse InversE Covariance estimation for Ecological Association and Statistical Inference
GNU General Public License v3.0
196 stars 68 forks source link

qqdplot_comm() function does not provide expected results #233

Open SarithaKodikara opened 1 year ago

SarithaKodikara commented 1 year ago

Hi Zachary,

I have noticed that the results generated by the qqdplot_comm() function are not as expected. When I use the function to plot data generated from a zero-inflated negative binomial distribution (zinegbin), the resulting QQ plot does not appear to fit the expected zinegbin distribution.

To provide more context, I have included an example below.

library(SpiecEasi)
data(amgut1.filt)
depths <- rowSums(amgut1.filt)
amgut1.filt.n  <- t(apply(amgut1.filt, 1, norm_to_total))
amgut1.filt.cs <- round(amgut1.filt.n * min(depths))

d <- ncol(amgut1.filt.cs)
n <- nrow(amgut1.filt.cs)
e <- d

set.seed(10010)
graph <- make_graph('cluster', d, e)
Prec  <- graph2prec(graph)
Cor   <- cov2cor(prec2cov(Prec))

X <- synth_comm_from_counts(amgut1.filt.cs, mar=2, distr='zinegbin', Sigma=Cor, n=n)
qqdplot_comm(data.frame(X), 'zinegbin', plot=TRUE , main="QQ plot")

image

It might be due to the fact that the function is fitting distributions on rows rather than columns. However, in synth_comm_from_counts the samples were in rows and OTUs were in columns.

qqdplot_comm <- function(comm, distr, param, plot=TRUE, ...) {
  if (class(comm) != 'qqdcomm') {
    if (!missing(param)) {
      fit <- t(sapply( **1:nrow(comm)**, function(i) qqdplot(comm[i,], distr=distr, param=param[[i]], plot=FALSE)))
    } else {
      fit <- t(apply(comm, **1**, qqdplot, distr=distr, param=NULL, plot=FALSE))
    }
    x   <- as.vector(fit)
    y   <- as.vector(as.matrix(comm))
  } else {
    x <- comm$x
    y <- comm$y
  }

  if (plot) {
    xrange <- range(y)
    if (!('main' %in% names(list(...)))) main = 'QQ-plot'
    else main <- list(...)$main
    plot(y, x, main=main, ylim=xrange, xlab='Observed Quantiles', ylab='Theoretical Quantiles')
    abline(0, 1, lty=2)
    legend('topleft', paste('r^2 =', format(summary(lm(x ~ y))$adj.r.squared, digits=4), sep=""), box.lwd=0)
  } else {
    return(structure(list(x=x, y=y), class="qqdcomm"))
  }
}

Regards Saritha