Suggestion: explore scaling gene counts using the number of molecule duplicates per gene to further reduce PCR amplification biases #44
I am currently looking into how much variation to expect between profiles of the same cell type produced at different labs, and I believe PCR amplification bias plays a role here, even though it is heavily reduced by the way UMIs are currently used. As far as I can tell, duplicate reads of the same UMI are discarded and not used further. My reasoning is that if a gene has fewer read copies per molecule, it was probably amplified less, so each of its molecules is less likely to be detected, and its count should be scaled up to compensate, at least when the number of reads is below UMI saturation. Has this been tried? If not, would you be willing to explore it? I could contribute some calculations if needed.
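
A minimal sketch of one way such a scaling could be calculated. It rests on assumptions that are not part of the current pipeline: reads per molecule are treated as Poisson-distributed with a gene-specific mean `lam` (real PCR duplication is likely more dispersed than that), and a molecule is counted only if it receives at least one read. Under those assumptions, the mean number of reads per observed UMI is `lam / (1 - exp(-lam))`, which can be inverted to estimate `lam` and from it the per-molecule detection probability. The function names `estimate_lambda` and `corrected_count` are hypothetical.

```python
import math


def estimate_lambda(mean_reads_per_umi, tol=1e-12):
    """Invert m = lam / (1 - exp(-lam)) for lam by bisection.

    m is the mean number of reads per *observed* UMI for one gene
    (a zero-truncated Poisson mean, under the assumption stated above).
    """
    m = mean_reads_per_umi
    if m <= 1.0:
        # With no duplicate reads at all, lam cannot be estimated this way.
        raise ValueError("mean reads per UMI must be > 1")

    def truncated_mean(lam):
        # -expm1(-lam) == 1 - exp(-lam), computed accurately for small lam
        return lam / -math.expm1(-lam)

    # truncated_mean is increasing, tends to 1 as lam -> 0, and exceeds lam,
    # so the solution lies strictly between 0 and m.
    lo, hi = 1e-12, m
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if truncated_mean(mid) < m:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def corrected_count(n_umis, n_reads):
    """Scale an observed UMI count by the estimated detection probability."""
    lam = estimate_lambda(n_reads / n_umis)
    p_detect = -math.expm1(-lam)  # P(molecule gets at least one read)
    return n_umis / p_detect


# Hypothetical genes: same UMI count, different duplication levels.
low_dup = corrected_count(n_umis=100, n_reads=120)    # few duplicates
high_dup = corrected_count(n_umis=100, n_reads=1000)  # many duplicates
print(low_dup, high_dup)
```

With this model, a gene with few duplicates per molecule is scaled up noticeably, while a gene sequenced close to saturation is left almost unchanged, which matches the intuition described above. Whether a Poisson model is adequate for real amplification data is an open question.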