Hi Jaeyong,
Thank you for your interest in our method. Yes, you are right: we don't use gene length to normalize expression values in our single-cell pipeline, because the data are so sparse. There is often minimal signal per gene in each cell, usually just one read, so the likelihood of multiple random hexamer primers producing reads from the same RNA molecule is low. Please let me know if you have any further questions.
Best wishes, Andras
Hello, I am Jaeyong, and thank you for providing the great library and analysis method.
I was wondering whether there is any normalization method for expression counts derived from random primers. In theory, multiple reads with different UMIs may originate from a single RNA if several random primers bind to it. If this is the case, a conventional bulk RNA-seq normalization method that uses gene length (such as FPKM or TPM) could be applied.
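For reference, the TPM-style length normalization mentioned above can be sketched as follows. This is only an illustration: the function name `tpm` and the toy counts and gene lengths are assumptions made up for the example, not part of the library or its pipeline.

```python
def tpm(counts, lengths_bp):
    """Transcripts per million: divide counts by gene length, then scale to 1e6.

    Hypothetical helper for illustration; not taken from the library.
    """
    # Reads (or UMIs) per kilobase of gene length.
    rpk = [c / (length / 1000) for c, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    if total == 0:
        # No signal at all: return zeros rather than dividing by zero.
        return [0.0 for _ in rpk]
    # Rescale so that the values sum to one million.
    return [r / total * 1e6 for r in rpk]


# Toy data (made up): two genes with equal counts but different lengths.
counts = [10, 10]
lengths_bp = [1000, 4000]
values = tpm(counts, lengths_bp)
print(values)
```

With equal counts, the longer gene receives the lower value. That correction is only meaningful if a longer RNA really does yield more reads.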
I have read the gene-level and post-processing analysis scripts but could not find any normalization step that uses gene length. Could it simply be due to the low overall content of RNA reads deriving from random primers that normalization by length is unnecessary?
Any reply would be helpful. Cheers,
Jaeyong