MarioniLab / scran

Clone of the Bioconductor repository for the scran package.
https://bioconductor.org/packages/devel/bioc/html/scran.html

scran normalization single cell allele count #62

Closed Hemantcnaik closed 4 years ago

Hemantcnaik commented 4 years ago

Hi, really powerful package! I am currently working with single-cell allelic count files (alleleA and alleleB) and would like to use the package to normalize the read counts, which have no spike-ins. Is this possible? Do you have any suggestions or a tutorial for this? Can you please help me?

LTLA commented 4 years ago

I'm going to guess that this is related to https://support.bioconductor.org/p/132240.

I don't have any particular opinions or advice for using scran to do allele-specific expression analysis. If you must compute allele-specific cell-specific size factors, I would try the following:

  1. Add the two allele matrices together and use, e.g., scran::calculateSumFactors() on the summed counts. This can be achieved in the same manner as described in various documentation sources and yields a per-cell size factor.
  2. Take the row sums of each matrix and normalize the summed profiles against each other with edgeR::calcNormFactors(). This will give you a per-allele normalization factor. Note: the distinction between the size factor and the normalization factor is important; do not get them mixed up.
  3. Obtain per-allele per-cell size factors by multiplying the size factor for each cell with the normalization factor for that allele. This gives you one size factor per cell in each of the two allele matrices.

The multiplication in 3 assumes that any allele-specific biases are constant across all cells, which seems reasonable to me but YMMV.
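
As a rough illustration of what the multiplication in step 3 does, here is a minimal base-R sketch. The numbers are made up for the example and do not come from scran or edgeR; only the arithmetic is the point.

```r
# Toy values, invented purely for illustration.
# Step 1 would give one size factor per cell:
per.cell.sf <- c(cell1 = 0.5, cell2 = 1.0, cell3 = 1.5)
# Step 2 would give one normalization factor per allele:
adj <- c(alleleA = 1.25, alleleB = 0.8)

# Step 3: scale every cell's size factor by the allele-level factor.
per.cell.sf.A <- per.cell.sf * adj[["alleleA"]]
per.cell.sf.B <- per.cell.sf * adj[["alleleB"]]

# The A/B ratio is the same in every cell, which is exactly the
# "allelic bias is constant across all cells" assumption.
ratio <- per.cell.sf.A / per.cell.sf.B
```

If the allelic bias were expected to differ from cell to cell, a single per-allele factor like this would not capture it.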

Hemantcnaik commented 4 years ago

Thanks for your response. I am really not experienced with single-cell allele-specific analysis, and the steps mentioned above confused me; I was not able to understand them. Can you please explain further with an example? Is this related to the code mentioned in https://support.bioconductor.org/p/132240?

Below is my code, based on what I understood from the above comment. Can you please correct me?

This normalization is very important to me, so please advise. Example with alleleA and alleleB:

norm.factorsA <- calcNormFactors(alleleA, method="TMM")
eff.libA <- norm.factorsA * colSums(alleleA+alleleB)
eff.libA <- eff.libA/mean(eff.libA)
normA <- t(t(alleleA)/eff.libA)

norm.factorsB <- calcNormFactors(alleleB, method="TMM")
eff.libB <- norm.factorsB * colSums(alleleA+alleleB)
eff.libB <- eff.libB/mean(eff.libB)
normB <- t(t(alleleB)/eff.libB)

write.csv(normA,"norm.dataA.csv")
write.csv(normB,"norm.dataB.csv")

LTLA commented 4 years ago

I would have thought I was pretty clear.

# Compute the per-cell bias, under the assumption that
# any bias affects both allelic profiles equally.
summed <- alleleA + alleleB
per.cell.sf <- scran::calculateSumFactors(summed)

# Determine if any systematic allelic bias exists.
allA <- rowSums(alleleA)
allB <- rowSums(alleleB)
adj <- edgeR::calcNormFactors(cbind(allA, allB))

# Adjust for the allelic bias, assuming that it is 
# independent of the per-cell bias.
per.cell.sf.A <- per.cell.sf * adj[1]
per.cell.sf.B <- per.cell.sf * adj[2]
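
One possible way to use these size factors afterwards (not something stated in this thread) is to divide each cell's column by its size factor, much like the `t(t(x)/sf)` idiom in the question. The sketch below is base R only; the count matrices and the size factors are hard-coded stand-ins rather than real scran or edgeR output.

```r
# Toy allele count matrices (genes x cells); values are made up.
alleleA <- matrix(c(10, 0, 4, 20, 2, 8), nrow = 3,
                  dimnames = list(paste0("gene", 1:3), c("cell1", "cell2")))
alleleB <- matrix(c(5, 1, 6, 10, 4, 12), nrow = 3,
                  dimnames = dimnames(alleleA))

# Hard-coded stand-ins for per.cell.sf.A and per.cell.sf.B.
per.cell.sf.A <- c(cell1 = 0.5, cell2 = 2)
per.cell.sf.B <- c(cell1 = 0.4, cell2 = 1.6)

# Divide each column (cell) by that cell's size factor.
normA <- sweep(alleleA, 2, per.cell.sf.A, "/")
normB <- sweep(alleleB, 2, per.cell.sf.B, "/")

# Optional log2-transformation with a pseudo-count of 1.
lognormA <- log2(normA + 1)
lognormB <- log2(normB + 1)
```

The normalized matrices could then be written out with write.csv() as in the question, if that is the desired end point.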

Hemantcnaik commented 4 years ago

@LTLA thanks for your suggestion, very helpful.