Closed Ataliai closed 1 month ago
Why would you want to run fastuniq? What biological experiment are you analyzing, and what would you gain from deduplication?
Hi, I have the same question. We thought duplicates should be removed, based on this slide in presentation 8.
OK, I understand. The confusion is understandable and was probably caused by me; I'll make sure to remove it in future courses.
What I tried to say is that deduplication can be used in RNA-Seq if there is sufficient QC information that encourages us to do it. In RNA-Seq we ultimately aim to quantify the expression profiles of transcripts, so we expect some level of duplication in the data. If our QC returns huge duplication values, then we should consider deduplicating. For the data you downloaded, there is no need to do that.
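To make the "check QC first" point concrete, here is a minimal sketch of how one could estimate a duplication level from a FASTQ file before deciding anything. The function name `duplication_rate` and the `max_reads` cap are made up for illustration, and the sketch assumes plain 4-line FASTQ records. It counts exact sequence duplicates only, so its number will not necessarily match what a dedicated QC tool reports.

```python
from collections import Counter
from itertools import islice


def duplication_rate(fastq_lines, max_reads=200_000):
    """Fraction of sampled reads whose sequence repeats an earlier read.

    Hypothetical helper: assumes each FASTQ record is exactly 4 lines
    and only inspects the first `max_reads` records to bound memory.
    """
    # The sequence is the 2nd line of every 4-line record.
    seqs = islice(fastq_lines, 1, None, 4)
    counts = Counter(s.strip() for s in islice(seqs, max_reads))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Unique sequences / total reads is the non-duplicated share.
    return 1.0 - len(counts) / total


# Tiny in-memory example: 4 reads, one of which repeats an earlier read.
example = [
    "@r1", "ACGT", "+", "IIII",
    "@r2", "ACGT", "+", "IIII",
    "@r3", "TTGA", "+", "IIII",
    "@r4", "GGCC", "+", "IIII",
]
print(duplication_rate(example))  # 0.25
```

For a real file, an open file handle can be passed in place of the list. What counts as a "huge" value is a judgment call that depends on the experiment, as discussed above.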
I understand. We will repeat the analysis, because we have already removed the duplicates. Thanks for the quick reply!
Is it possible to continue the analysis with the data after using fastuniq?
It is possible, but you need to explain the rationale behind doing so. What in the data, or in your understanding of the biological context of the experiment, led you to this rationale?
OK. Thanks for your answer
According to the files I received for the project (GSE229677), I should run fastuniq for PE (paired-end) reads. Each file is huge (11 GB), and it seems that my computer does not have enough RAM for this (the log reports an out-of-memory error). Is there an alternative?
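If deduplication turns out to be justified at all, one memory-lean approach is to stream both files and remember only a short digest of each read pair instead of the reads themselves. The sketch below is an assumption-laden illustration, not how fastuniq works internally: the helper names `read_fastq` and `dedup_pairs` are hypothetical, it assumes 4-line FASTQ records with R1 and R2 in the same order, it keeps the first occurrence of each pair, and a 16-byte digest carries a very small chance of a collision. Memory still grows with the number of unique pairs, so it may or may not fit on a given machine.

```python
import hashlib


def read_fastq(lines):
    """Group an iterable of lines into 4-line FASTQ records (hypothetical helper)."""
    it = (line.rstrip("\n") for line in lines)
    # Pulling from the same iterator four times yields consecutive 4-tuples.
    return zip(it, it, it, it)


def dedup_pairs(records_r1, records_r2):
    """Yield (rec1, rec2) for the first occurrence of each distinct sequence pair.

    Only a 16-byte digest per unique pair is kept in memory, not the reads.
    """
    seen = set()
    for rec1, rec2 in zip(records_r1, records_r2):
        # rec[1] is the sequence line; the pair of sequences defines a duplicate.
        pair = (rec1[1] + "\t" + rec2[1]).encode()
        key = hashlib.blake2b(pair, digest_size=16).digest()
        if key not in seen:
            seen.add(key)
            yield rec1, rec2


# Tiny in-memory example: the second pair duplicates the first.
r1 = ["@a/1", "ACGT", "+", "IIII", "@b/1", "ACGT", "+", "IIII", "@c/1", "ACGT", "+", "IIII"]
r2 = ["@a/2", "TTTT", "+", "IIII", "@b/2", "TTTT", "+", "IIII", "@c/2", "GGGG", "+", "IIII"]
kept = list(dedup_pairs(read_fastq(r1), read_fastq(r2)))
print([rec1[0] for rec1, _ in kept])  # ['@a/1', '@c/1']
```

For real data, the two lists would be replaced by open file handles (for example via `gzip.open(path, "rt")` for compressed files), and the kept records written out as they are yielded.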