yancychy / DiffChIPL

Apache License 2.0

Can DiffChIPL produce the normalized log2CPM matrix without experimental noise and batch effects #3

Open xflicsu opened 1 year ago

xflicsu commented 1 year ago

Hi, I wonder whether DiffChIPL can produce the normalized log2CPM matrix from steps 2-7 (from CPM normalization to differential analysis by limma)? And how can I generate the normalized matrix with experimental noise and batch effects removed?

yancychy commented 1 year ago

Hi, we used getReadCount() with removeBackground=TRUE to remove the background noise, and cpmNorm() to calculate the log2CPM values.

cpmNorm <- function(rawcount, libsize=NULL){
  cX = rawcount
  for(i in 1:ncol(cX)){
    if(is.null(libsize)){
      cX[,i] = log2(10^6 * cX[,i] / sum(cX[,i]) +1)
    }else{
      cX[,i] = log2(10^6 * cX[,i] / libsize[i] + 1)
    }
  }
  cX
}
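
As a minimal sketch of how cpmNorm() behaves, the snippet below applies it to a small made-up count matrix. The peak and sample names, the counts, and the user-supplied library sizes are all assumptions for illustration; the function body is equivalent to the one above and is repeated only so the example runs on its own.

```r
# Same behavior as cpmNorm() above, repeated so the example is self-contained.
cpmNorm <- function(rawcount, libsize = NULL) {
  cX <- rawcount
  for (i in seq_len(ncol(cX))) {
    # Use the column sum as library size unless one is supplied.
    total <- if (is.null(libsize)) sum(cX[, i]) else libsize[i]
    cX[, i] <- log2(10^6 * cX[, i] / total + 1)
  }
  cX
}

# Hypothetical raw counts: 3 peaks (rows) x 2 samples (columns).
rawcount <- matrix(c(10, 0, 90,
                     20, 30, 50),
                   nrow = 3,
                   dimnames = list(paste0("peak", 1:3), c("s1", "s2")))

# Library size taken from the column sums.
logCPM <- cpmNorm(rawcount)

# Library sizes supplied explicitly (hypothetical total mapped reads).
logCPM_fixed <- cpmNorm(rawcount, libsize = c(1e6, 2e6))

print(round(logCPM, 3))
print(round(logCPM_fixed, 3))
```

Because of the +1 pseudocount, a peak with zero reads maps to a log2CPM of exactly 0. Passing libsize is useful when the count matrix covers only peaks and the column sums would underestimate sequencing depth.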

If you want to remove the batch effects, you can check steps 1-3 of the DiffChIPL() function.

DiffChIPL <- function(cpmD, design0, group0 = group ){
  #1.limma  linear regression
  fit1 <- lmFit(cpmD, design0, weights=NULL, method = "ls")
  #2.Fit the residual variance for low read counts
  fitRlimmN = fit1
  filtL = filtLowCounts(cpmD, design0,
                         sx = fitRlimmN$Amean, sy = fitRlimmN$sigma )
  fitRlimmN$sigma[filtL$fid] =  filtL$sigma[filtL$fid]
  #3.Remove bias by LOESS regression
  resV = loessNormOffSet(d0 = cpmD, group= design0[,2], smean=fitRlimmN$Amean, 
                      sfold= fitRlimmN$coefficients[,2], offSets = T)
  fitRlimmN$coefficients[,2] = resV$dnormV
  #4.limma-trend
  fitRlimmR = fitRlimmN
  fitRlimmR <- eBayes(fitRlimmR, trend = TRUE, robust=TRUE)
  rtRlimmR = topTable(fitRlimmR, n = nrow(cpmD), coef=2)
  rtRlimmR = rtRlimmR[rownames(cpmD),]
  list(fitlimma = fit1, fitDiffL = fitRlimmR,  resDE = rtRlimmR)
}
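
The function above reads the group indicator from design0[,2] and reports coef=2, so it appears to expect a two-column design matrix: an intercept plus one group column. A minimal sketch of building such a matrix with base R follows; the sample layout and the column names are assumptions for illustration.

```r
# Hypothetical layout: 2 control and 2 treatment samples,
# listed in the same order as the columns of cpmD.
group <- factor(c("ctrl", "ctrl", "treat", "treat"),
                levels = c("ctrl", "treat"))

# Intercept + group indicator, as used by lmFit(cpmD, design0, ...).
design0 <- model.matrix(~ group)
colnames(design0) <- c("Intercept", "treat_vs_ctrl")
print(design0)

# Column 1 is the intercept (control mean). Column 2 is 0 for control and
# 1 for treatment, so coefficient 2 is the treatment-vs-control difference
# in mean log2CPM.
```

With three or more groups this two-column layout no longer covers every comparison, so each pairwise contrast would presumably need its own two-group design and its own run.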

xflicsu commented 1 year ago

Thanks for your quick response! I will try to figure this out.

xflicsu commented 1 year ago

Hello, I see that you normalize the mean value with LOESS. But how can I get the normalized matrix for each sample? Also, when I have three or more groups, how can I get the per-sample matrix without noise and batch effects?

yancychy commented 1 year ago

Sorry, we currently do not provide a method to recover the normalized matrix for each sample. I suggest running MAnorm2, which obtains the normalized matrix by scaling.
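
For readers who want a rough per-sample matrix in the meantime, one generic possibility is an MA-style LOESS adjustment of each sample against a reference profile. The sketch below is neither DiffChIPL's internal procedure nor MAnorm2's algorithm; it uses only stats::loess from base R, and the helper name loessNormToRef, the span value, and the simulated data are assumptions for illustration.

```r
# Hypothetical helper: remove each sample's intensity-dependent deviation
# from a reference profile (the row means of the log2CPM matrix).
loessNormToRef <- function(logCPM, span = 0.75) {
  ref <- rowMeans(logCPM)
  out <- logCPM
  for (j in seq_len(ncol(logCPM))) {
    M <- logCPM[, j] - ref          # deviation from the reference
    A <- (logCPM[, j] + ref) / 2    # average intensity
    fit <- loess(M ~ A, span = span, degree = 1)
    out[, j] <- logCPM[, j] - fit$fitted   # subtract the fitted trend
  }
  out
}

# Simulated log2CPM values: 300 peaks, 3 samples with different biases.
set.seed(1)
base <- runif(300, 2, 10)
logCPM <- cbind(s1 = base + rnorm(300, sd = 0.2),
                s2 = base + 0.1 * base + rnorm(300, sd = 0.2),  # intensity-dependent bias
                s3 = base + 0.5 + rnorm(300, sd = 0.2))         # constant offset

normM <- loessNormToRef(logCPM)
print(round(colMeans(normM) - colMeans(logCPM), 3))
```

This kind of adjustment assumes most peaks are unchanged between samples, so it can also dampen genuine global differences, and it does not use known batch labels.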

xflicsu commented 1 year ago

Thanks for your suggestion.