kvittingseerup / IsoformSwitchAnalyzeR

An R package to Identify, Annotate and Visualize Isoform Switches with Functional Consequences (from RNA-seq data)

Warning -- remove duplicates #226

Closed Annie133 closed 3 months ago

Annie133 commented 5 months ago

Hi, I am enclosing the warning below. I use Salmon for quantification, but the new Salmon version doesn't have a --remove duplicate function. Could you please comment on why this warning appeared? That would help me understand the whole process.

Warning message:
In importRdata(isoformCountMatrix = salmonQuant.new$counts, isoformRepExpression = salmonQuant.new$abundance, :
  The annotation and quantification (count/abundance matrix and isoform annotation) Seem to be slightly different.
  Specifically:
  373 isoforms were only found in the annotation

  Please make sure this is on purpouse since differences will cause inaccurate quantification and thereby skew all analysis.
  If you have quantified with Salmon this could be normal since it as default only keep one copy of identical sequnces (can be prevented using the --keepDuplicates option)
  We strongly encurage you to go back and figure out why this is the case.

Many many thanks!

chunxubioinfor commented 4 months ago

Hi Shuang! This warning indicates that the isoform IDs in the annotation and the quantification differ slightly, although they still overlap heavily (jaccardSimilarity >= 0.925). Salmon's removal of duplicate sequences may reduce this similarity slightly, which could explain the message. Here I found a webpage which explains more about duplicates. Hope this helps. Just to make sure: did you use the same GTF file that was used for quantification?
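
To make the Jaccard figure concrete, here is a minimal Python sketch of how the overlap between two sets of isoform IDs can be measured. It is not IsoformSwitchAnalyzeR's own code: the function names and the toy ID lists are hypothetical, and how the package itself applies the 0.925 value is an assumption based on the comment above.

```python
def jaccard_similarity(ids_a, ids_b):
    """Jaccard similarity: |A intersect B| / |A union B| for two ID collections."""
    a, b = set(ids_a), set(ids_b)
    union = a | b
    if not union:
        # Two empty sets are treated as identical here (a convention, not a rule).
        return 1.0
    return len(a & b) / len(union)


def compare_ids(annotation_ids, quantified_ids):
    """Report the similarity and the IDs unique to each side."""
    ann, quant = set(annotation_ids), set(quantified_ids)
    return {
        "jaccard": jaccard_similarity(ann, quant),
        "only_in_annotation": sorted(ann - quant),
        "only_in_quantification": sorted(quant - ann),
    }


# Toy data (hypothetical IDs): one annotated isoform is missing from the quantification.
annotation_ids = ["tx1", "tx2", "tx3", "tx4"]
quantified_ids = ["tx1", "tx2", "tx3"]

report = compare_ids(annotation_ids, quantified_ids)
print(report["jaccard"])             # 0.75
print(report["only_in_annotation"])  # ['tx4']
```

With real data, the two inputs would be the isoform IDs from the GTF and the row names of the Salmon count matrix; listing the IDs found only in the annotation is usually the quickest way to see what the 373 isoforms have in common.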

Annie133 commented 3 months ago

Hi, thanks for your reply and the info webpage! I used the same GTF version as for the quantification.
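
If the GTF matches, one way to check whether duplicate collapsing explains the annotation-only isoforms is to look for transcripts with identical sequences in the transcript FASTA that was indexed. The following is a minimal Python sketch under stated assumptions: the input is a plain FASTA, the first whitespace-delimited token of each header is the transcript ID, and sequences are compared case-insensitively (whether Salmon does the same is not confirmed here). The function names and the toy records are hypothetical.

```python
from collections import defaultdict
import io


def read_fasta(handle):
    """Yield (transcript_id, sequence) pairs from a FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def duplicate_groups(records):
    """Group transcript IDs that share exactly the same sequence."""
    by_seq = defaultdict(list)
    for name, seq in records:
        by_seq[seq.upper()].append(name)
    # Keep only sequences that occur under more than one ID.
    return [names for names in by_seq.values() if len(names) > 1]


# Toy FASTA: tx1 and tx3 carry the same sequence, tx2 is unique.
toy = io.StringIO(">tx1 gene=A\nACGT\nACGT\n>tx2\nGGGG\n>tx3\nacgtACGT\n")
groups = duplicate_groups(read_fasta(toy))
print(groups)  # [['tx1', 'tx3']]
```

If most of the IDs reported as only found in the annotation show up in such groups, duplicate collapsing is a likely explanation; if they do not, the mismatch probably has another cause and is worth tracing.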