williamritchie / IRFinder

Detecting intron retention from RNA-Seq experiments

excessive Mitochondrial component and seemingly not mRNA-seq exp #96

Closed ahorvath closed 4 years ago

ahorvath commented 4 years ago

Dear Authors,

First of all, many thanks for developing this great tool. I work with fission yeast and I have KO samples in which extensive intron retention is expected. I used IRFinder on polyA mRNA-seq samples to detect this and the results look very promising. However, I got the following warnings:

WARN: This sample has an excessive Mitochondrial component. This may indicate a specialised experiment. It may not be suitable to be used in comparisons with more regular experiments.

WARN: Very low portion of reads have a splice junction. This may indicate the experiment is not an mRNA-Seq experiment.

Can you advise why this is happening?

I have some MT and MTR exons but they represent only a minority.

Many thanks, Attila

dg520 commented 4 years ago

Hi @ahorvath ,

I'm not familiar with the yeast genome. For the first warning, IRFinder checks the output file IRFinder-ChrCoverage.txt. It assumes the mitochondrial chromosome is named "M" or "MT", while non-mitochondrial chromosomes are named with digits. This assumption holds for the human and mouse genomes; is it also true for yeasts? IRFinder sums the numbers on the mitochondrial chromosome as value A and sums the numbers on the non-mitochondrial chromosomes as value B. If A/(A+B+1) > 0.4, the warning is raised.
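The rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as stated in this comment, not IRFinder's actual code: the function name `mito_fraction`, the dict input, and all counts are hypothetical, and the real layout of IRFinder-ChrCoverage.txt is not assumed here. The second input uses non-digit chromosome names to show what the rule does when the naming assumption fails.

```python
def mito_fraction(coverage):
    """Return A / (A + B + 1) for a dict of chromosome name -> count.

    A: counts on chromosomes named "M" or "MT".
    B: counts on chromosomes whose name consists only of digits.
    Any other name is ignored, as the described rule implies.
    """
    a = sum(n for name, n in coverage.items() if name in ("M", "MT"))
    b = sum(n for name, n in coverage.items() if name.isdigit())
    return a / (a + b + 1)


# Hypothetical counts, for illustration only.
digit_named = {"1": 900, "2": 800, "MT": 100}
non_digit_named = {"I": 900, "II": 800, "III": 500, "MT": 100}

for label, cov in (("digit names", digit_named), ("non-digit names", non_digit_named)):
    frac = mito_fraction(cov)
    print(f"{label}: fraction = {frac:.3f}, warning = {frac > 0.4}")
```

With non-digit names, B is 0, so the fraction collapses to A/(A+1) and exceeds 0.4 for almost any non-trivial mitochondrial count, regardless of the true mitochondrial share.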

For the second warning, IRFinder checks the output file irfinder.stdout. It explicitly extracts the values for "Total nucleotides", "Total pairs processed", "Total singles processed" and "Error reads" as A, B, C and D. If A/(B+C+D) > 30, the warning is raised.
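A minimal sketch of this second check, again following only the rule as stated here rather than IRFinder's source. The function name `low_splice_warning` and the numbers are hypothetical; parsing of irfinder.stdout is left out because its exact layout is not shown in this thread.

```python
def low_splice_warning(total_nucleotides, total_pairs, total_singles,
                       error_reads, cutoff=30):
    """Return True when A / (B + C + D) exceeds the cutoff.

    A, B, C, D correspond to "Total nucleotides", "Total pairs processed",
    "Total singles processed" and "Error reads" as described above.
    """
    denominator = total_pairs + total_singles + error_reads
    if denominator == 0:
        # Assumption: with nothing to divide by, treat the sample as flagged.
        return True
    return total_nucleotides / denominator > cutoff


# Hypothetical values, for illustration only.
print(low_splice_warning(3100, 50, 40, 10))  # ratio 31.0
print(low_splice_warning(3000, 50, 40, 10))  # ratio 30.0
```

Because the comparison is strict, a ratio of exactly 30 does not trigger the warning in this sketch. The `cutoff` parameter is exposed so that a different threshold can be tried for a genome with a different splice junction density.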

Best, Dadi

UPDATE: For the 1st warning, I just checked, and the main chromosomes in yeasts are named with Roman numerals instead of digits. So IRFinder won't get value B correctly for yeasts. You can safely ignore it. For the 2nd warning, values B and C indicate detected splice junctions. The cutoff of 30 is set according to human and mouse. If splice junctions in the yeast genome are far sparser than in the human or mouse genome, this warning wouldn't be too surprising to me. You might want to do a simple calculation to convince yourself.

ahorvath commented 4 years ago

Hi Dadi,

Many thanks for your answer. I just checked the pombe genome and you are correct. I will check the second one, too. Can I infer which detected splice sites accompany the IR events?

Many thanks, Attila

dg520 commented 4 years ago

Hi @ahorvath ,

A detected splice site does not necessarily mean an IR event. Actually, most of the detected splice sites should correspond to annotated splicing, where there is no IR at all.

Best, Dadi

ahorvath commented 4 years ago

I agree with you. However, I think there could be accompanying splice sites (other than the annotated ones) coming from alternative exons, for example.

Bests, Attila

dg520 commented 4 years ago

That's totally possible.

Best, Dadi

ahorvath commented 4 years ago

Thanks for your help and the good discussion!

Bests, Attila