Closed andreismol closed 4 years ago
Hi @andreismol,
To your Q1:
LowCover: correct splicing at column 19 + intron average depth at column 9 < 10, meaning the overall sequencing depth for this event is low.
LowSplicing: correct splicing at column 19 < 4, meaning not enough reads supporting the correct splicing.
MinorIsoform: correct splicing at column 19 * 1.33333333 < max(spliceLeft at column 17, spliceRight at column 18), meaning the event is not the main/most common splicing outcome among all transcripts of this gene.
NonUniformIntronCover: this one is a bit more complicated. It is triggered when:
max(SPleft, SPright) > intronTrimmedMean + 2 && max(SPleft, SPright) > intronTrimmedMean * 1.5
where SPleft, SPright, intronTrimmedMean are column 13, 14 and 19, respectively. This tag checks whether read coverage is evenly distributed across the exonic and intronic regions of an event.
All of the above is defined in ReadBlockProcessor_CoverageBlocks.cpp under IRFinder/src/irfinder.
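As a rough illustration, the four conditions above could be written in Python along the following lines. This is only a sketch built from the thresholds quoted in this reply, not the code in ReadBlockProcessor_CoverageBlocks.cpp. The function name and parameter names are hypothetical, the values are passed in explicitly so that mapping them to output columns is left to the caller, and returning every matching tag (rather than a single one) is an assumption.

```python
def warning_tags(splice_exact, intron_depth, splice_left, splice_right,
                 sp_left, sp_right, intron_trimmed_mean):
    """Return the warning tags whose condition (as quoted above) holds.

    Hypothetical helper: names are illustrative, and returning all matching
    tags instead of one per event is an assumption of this sketch.
    """
    tags = []

    # LowCover: correct splicing + intron average depth < 10
    if splice_exact + intron_depth < 10:
        tags.append("LowCover")

    # LowSplicing: fewer than 4 reads support the correct splicing
    if splice_exact < 4:
        tags.append("LowSplicing")

    # MinorIsoform: correct splicing * 1.33333333 < max(spliceLeft, spliceRight)
    if splice_exact * 1.33333333 < max(splice_left, splice_right):
        tags.append("MinorIsoform")

    # NonUniformIntronCover: both inequalities must hold
    sp_max = max(sp_left, sp_right)
    if sp_max > intron_trimmed_mean + 2 and sp_max > intron_trimmed_mean * 1.5:
        tags.append("NonUniformIntronCover")

    return tags


if __name__ == "__main__":
    # Toy values, not real IRFinder output
    print(warning_tags(splice_exact=2, intron_depth=1,
                       splice_left=2, splice_right=2,
                       sp_left=0, sp_right=0, intron_trimmed_mean=1))
```

Feeding in the values from a row of your own output lets you check by hand why a given intron received its warning.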
Q2: We filter for sufficient correct splicing for two reasons:
NonUniformIntronCover. Specifically for NonUniformIntronCover, the cutoff can sometimes be too stringent because of its complicated criteria.
That being said, the cutoffs you listed in your Q2 are general guidance. I recommend investigating your own data to figure out which combination suits you best.
Q3: Your understanding of the static warning is right. We do exclude those contaminated regions, but not the entire events, from the calculation. Thus, users are free to choose whether or not to exclude the whole event if it is not clean.
Best, Dadi
Thank you, Dadi! Much appreciated.
Hi,
Thanks for the tool. It seems to be working excellently. I've got a few questions that I'd like to clarify.
Could you provide the definitions of the four Warnings in the last column of the IRFinder-IR-nondir output file (LowCover, LowSplicing, MinorIsoform, NonUniformIntronCover)? Specifically, at what cutoff do LowCover and LowSplicing get triggered?
I was wondering whether you might be able to provide some advice on which columns to filter to get a set of high quality introns in each sample. In a response you gave to a previous issue, you mentioned that you filter on:
-
or NonUniformIntronCover
What's the logic behind these particular columns? Obviously you would want to exclude introns which are poorly covered, but why exclude on the basis of low numbers of "correct" splicing? Why include introns with NonUniformIntronCover? And are there any other columns which one should filter on? (At the moment I'm filtering on IntronDepth>3 and IRratio>0.1.)
Thanks again for all the effort involved in producing and documenting IRFinder!
-Andrei