constantAmateur / SoupX

R package to quantify and remove cell free mRNAs from droplet based scRNA-seq data

Can we compute soup components and adjustCounts separately? #31

Closed by JPingLin 4 years ago

JPingLin commented 4 years ago

Thank you for the great tool! I have several questions:

  1. If I decide to set the contamination fraction manually with setContaminationFraction, do I still need to load the raw_feature_bc_matrix?

  2. If I only want to run plotMarkerDistribution for each sample without correcting the data, does it matter whether I use the filtered_feature_bc_matrix directly from 10X, or will loading a processed matrix change the result? I have filtered the data, merged several channels, removed doublets, and attached sample names to the barcodes. Does SoupX need the two matrices to have matching barcodes in order to link them?

Thanks!

constantAmateur commented 4 years ago
  1. Yes. The raw_feature_bc_matrix is used to calculate the contamination profile, i.e., given 1 molecule of contamination, the probability that it is geneA, geneB, etc. The contamination fraction specifies how much contamination there is: given 100 observed counts, how many you expect to be contamination.

  2. It shouldn't matter if you use a processed matrix instead of the filtered_feature_bc_matrix, as long as the processing hasn't normalised the data. That is, the values should still be counts.
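
To make the distinction between the two quantities concrete, here is a minimal toy sketch in base R. It does not call SoupX and is not how the package is implemented. The matrix, the empty-droplet threshold, the value of `rho`, and the helper `is_counts` are all made up for illustration. It shows why the raw matrix is still needed when the fraction is set by hand (the profile comes from the empty droplets), and one rough way to check that a processed matrix still holds counts.

```r
# Toy illustration only: base R, no SoupX calls, all numbers are invented.
genes <- c("geneA", "geneB", "geneC")

# Hypothetical raw matrix (genes x droplets). matrix() fills column by column,
# so each row of numbers below is one droplet. bc1-bc4 stand in for empty
# droplets, bc5-bc6 for cells.
raw <- matrix(
  c( 3,  1,  0,
     2,  2,  0,
     4,  0,  1,
     1,  1,  0,
    50, 30, 20,
    10, 80, 10),
  nrow = 3,
  dimnames = list(genes, paste0("bc", 1:6))
)

# Treat droplets with very few counts as empty (threshold is an assumption).
empty <- colSums(raw) < 10

# Contamination profile: given one contaminating molecule, the probability
# that it comes from each gene. Estimated from the empty droplets only.
soup_profile <- rowSums(raw[, empty]) / sum(raw[, empty])

# Contamination fraction: the share of a cell's counts expected to be soup.
# Here it is simply set by hand.
rho <- 0.1

# Expected contaminating counts per gene for one cell: total counts in the
# cell, times the fraction, spread across genes according to the profile.
cell <- raw[, "bc5"]
expected_soup <- sum(cell) * rho * soup_profile

# Rough check that a matrix still holds counts (non-negative whole numbers),
# i.e. that it has not been normalised or log-transformed.
is_counts <- function(m) all(m >= 0 & m == round(m))

print(soup_profile)
print(expected_soup)
print(is_counts(raw))
print(is_counts(log1p(raw)))
```

Setting `rho` by hand fixes only the total amount removed per cell. How that amount is split across genes still depends on `soup_profile`, which is why the empty droplets are needed.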