flemingtonlab / SpliceTools

GNU General Public License v3.0

Splice site score filtering? #7

Open robertfisher002 opened 1 year ago

robertfisher002 commented 1 year ago

Hi,

The splice site scoring is a nice feature. I ran the Perl script for SE splice site scoring and noticed that the number of sequences I get for both donor and acceptor scores is quite low. I pre-filter the input to keep only events with |deltaPSI| > 0.1 and FDR < 0.05, so the number of posInc and negInc sequences in the output should match the number of inputs (since the only filter SpliceTools applies is FDR < 0.05). However, with around 2000 negInc inputs, I get only ~100 donor sequences in the output, for instance. Is there a reason so few splice site sequences are included?

python gtf2bed.py gencode.v42.annotation.gtf > BED12_hg38_v42.bed

perl SESpliceSiteScoring.pl -s ../../SE.MATS.JCEC.txt -g ../../GRCh38.p13.genome.fa -f 0.05 -a ../../BED12_hg38_v42.bed

Thanks! Robert
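
For anyone trying to reproduce this, a quick sanity check is to count how many events in the rMATS table actually pass the pre-filter described above, so the input count can be compared with the number of scored sequences. The sketch below is not part of SpliceTools. It assumes the table is tab-delimited with a header row containing `FDR` and `IncLevelDifference` columns (the usual rMATS layout), and the function name `count_significant_events` is made up for illustration. It reports positive and negative IncLevelDifference separately without assuming how SpliceTools maps the sign to posInc or negInc.

```python
import csv
import io


def count_significant_events(handle, fdr_max=0.05, min_abs_dpsi=0.1):
    """Count rows passing FDR < fdr_max and |IncLevelDifference| > min_abs_dpsi.

    `handle` is any iterable of text lines (an open file or io.StringIO).
    Column names are an assumption based on the usual rMATS header.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    counts = {"positive": 0, "negative": 0, "skipped": 0}
    for row in reader:
        try:
            fdr = float(row["FDR"])
            dpsi = float(row["IncLevelDifference"])
        except (KeyError, ValueError, TypeError):
            # Missing column, non-numeric value such as "NA", or a short row.
            counts["skipped"] += 1
            continue
        if fdr < fdr_max and abs(dpsi) > min_abs_dpsi:
            counts["positive" if dpsi > 0 else "negative"] += 1
    return counts


# Tiny made-up table, only to show the call; these are not real results.
demo = io.StringIO(
    "ID\tFDR\tIncLevelDifference\n"
    "1\t0.01\t0.25\n"
    "2\t0.01\t-0.30\n"
    "3\t0.20\t0.40\n"
    "4\t0.01\t0.05\n"
    "5\tNA\t0.50\n"
)
demo_counts = count_significant_events(demo)
print(demo_counts)
```

To run it on a real table, pass an open file instead of the demo data, for example `with open("SE.MATS.JCEC.txt") as fh: print(count_significant_events(fh))`. If the counts match the expected ~2000 events but far fewer sequences are scored, the difference is arising somewhere after the FDR filter.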

nungerleider commented 1 year ago

Hi Robert, we tried to reproduce this issue but couldn't - would you mind sending your input and output files?

Thanks,

Nate