BrooksLabUCSC / flair

Full-Length Alternative Isoform analysis of RNA

error correct #32

Closed YannAudic closed 5 years ago

YannAudic commented 5 years ago

Hi,

I am trying to run flair on our Nanopore data. I aligned the reads with minimap, converted the BAM to BED12, then ran the flair correct script with

python flair.py correct -c /mnt/BIG/MINION/sizes.genome -g /mnt/BIG/MINION/GRCh38.fa -q /mnt/BIG/MINION/aln_test_52_sorted.bed12 -f annotation_trimmed_GRCh38.gtf --print_check -t 4 -o /mnt/BIG/MINION/test_52

The script stays at 60% in the fifth step for a while but is apparently still running. I got the following error messages in the err_tmp file (only the part where the errors start to appear). Any hints?

Thank you, Yann

Correcting /root/flair/tmp_a08014da-94b6-4222-9c78-002cd0d37027/chr2_temp_reads.bed with a wiggle of 15 against /root/flair/tmp_a08014da-94b6-4222-9c78-002cd0d37027/chr2_known_juncs.bed. Checking splice sites with genome /mnt/BIG/MINION/GRCh38.fa.
Initializing int tree for chromosome chr2
Checking SS motifs for chromosome chr2
Checked 19801 splice sites for chromosome chr5...
Adding to int tree
** Unsuccessful correction for chromosome chr5
Traceback (most recent call last):
  File "/root/flair/bin/ssPrep.py", line 479, in main()
  File "/root/flair/bin/ssPrep.py", line 470, in main
    intTree, ssData = buildIntervalTree(knownJuncs, wiggle, fa)
  File "/root/flair/bin/ssPrep.py", line 398, in buildIntervalTree
    if dinucDict[c1][0] != strand or dinucDict[c2][0] != strand:
KeyError: 110739040

Correcting /root/flair/tmp_a08014da-94b6-4222-9c78-002cd0d37027/chr1_temp_reads.bed with a wiggle of 15 against /root/flair/tmp_a08014da-94b6-4222-9c78-002cd0d37027/chr1_known_juncs.bed. Checking splice sites with genome /mnt/BIG/MINION/GRCh38.fa.
Initializing int tree for chromosome chr1
Checking SS motifs for chromosome chr1
Checked 15299 splice sites for chromosome chr4...
Adding to int tree
** Unsuccessful correction for chromosome chr4
Traceback (most recent call last):
  File "/root/flair/bin/ssPrep.py", line 479, in main()
  File "/root/flair/bin/ssPrep.py", line 470, in main
    intTree, ssData = buildIntervalTree(knownJuncs, wiggle, fa)
  File "/root/flair/bin/ssPrep.py", line 398, in buildIntervalTree
    if dinucDict[c1][0] != strand or dinucDict[c2][0] != strand:
KeyError: 75913968

Correcting /root/flair/tmp_a08014da-94b6-4222-9c78-002cd0d37027/chr9_temp_reads.bed with a wiggle of 15 against /root/flair/tmp_a08014da-94b6-4222-9c78-002cd0d37027/chr9_known_juncs.bed. Checking splice sites with genome /mnt/BIG/MINION/GRCh38.fa.
Initializing int tree for chromosome chr9
Checking SS motifs for chromosome chr9
Checked 9625 splice sites for chromosome chr9...
Adding to int tree
** Unsuccessful correction for chromosome chr9
Traceback (most recent call last):
  File "/root/flair/bin/ssPrep.py", line 479, in main()
  File "/root/flair/bin/ssPrep.py", line 470, in main
    intTree, ssData = buildIntervalTree(knownJuncs, wiggle, fa)
  File "/root/flair/bin/ssPrep.py", line 398, in buildIntervalTree
    if dinucDict[c1][0] != strand or dinucDict[c2][0] != strand:
KeyError: 64440180

YannAudic commented 5 years ago

Note that it works when using data for a single chromosome.

csoulette commented 5 years ago

Hi YannAudic,

We've fixed the issue causing the correct step to hang.

Regarding the key error issue, it looks like a reference database isn't getting built correctly during correction. You mention that running a single chromosome works fine, but is this also true for the chromosomes that seem to be failing according to your error output?

Thanks~

-CMS
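
One way to check the question above is to cut the BED12 input down to the chromosomes named in the error output (chr5, chr4, chr9) and run the correct step on each file separately. The snippet below is only a sketch of that idea: the function name `split_bed_by_chrom` and the output file naming are made up for illustration and are not part of flair.

```python
import os


def split_bed_by_chrom(bed_path, out_dir, chroms=None):
    """Write one BED file per chromosome and return {chrom: output path}.

    Hypothetical helper, not part of flair. If `chroms` is given, only
    those chromosomes are written.
    """
    os.makedirs(out_dir, exist_ok=True)
    wanted = set(chroms) if chroms else None
    handles = {}
    paths = {}
    try:
        with open(bed_path) as bed:
            for line in bed:
                # Skip blank lines and UCSC-style header lines.
                if not line.strip() or line.startswith(("#", "track", "browser")):
                    continue
                chrom = line.split(None, 1)[0]
                if wanted is not None and chrom not in wanted:
                    continue
                if chrom not in handles:
                    paths[chrom] = os.path.join(out_dir, chrom + ".bed12")
                    handles[chrom] = open(paths[chrom], "w")
                handles[chrom].write(line)
    finally:
        for handle in handles.values():
            handle.close()
    return paths
```

For example, `split_bed_by_chrom("aln_test_52_sorted.bed12", "per_chrom", chroms=["chr4", "chr5", "chr9"])` would leave three files in `per_chrom/`, each of which can be passed to the same correct command via `-q`.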

csoulette commented 5 years ago

Hi YannAudic,

Just wanted to mention that we've changed the splice site motif checking step in ssCorrect so that it no longer uses dictionaries to store splice site information, and therefore you should no longer run into any key error issues. You can check out the latest commit -> 9ef7890. Alternatively, you can pull an earlier version that does not have splice site motif checking ( 71c775791d1a8122310a64a1608558dc0e051a04 ).

Thanks~

-CMS
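
For readers who hit the same traceback on an older checkout: the failing line indexes a dictionary by junction coordinate, so any coordinate that was never stored raises `KeyError`. The sketch below reproduces that failure mode in isolation and shows a guarded lookup. It is an illustration only, not the change made in flair (which, per the comment above, dropped the dictionaries entirely); `dinuc_dict`, `strands_match`, and the `(strand, motif)` value layout are assumptions made for the example.

```python
def strands_match(dinuc_dict, c1, c2, strand):
    """Return True if both junction ends were recorded on `strand`.

    Hypothetical helper: a coordinate missing from `dinuc_dict` counts as
    a mismatch instead of raising KeyError.
    """
    left = dinuc_dict.get(c1)
    right = dinuc_dict.get(c2)
    if left is None or right is None:
        return False
    return left[0] == strand and right[0] == strand


# Assumed layout: coordinate -> (strand, dinucleotide motif).
dinuc_dict = {100: ("+", "GT"), 250: ("+", "AG")}

# Direct indexing fails as in the traceback when a coordinate is absent.
try:
    dinuc_dict[999][0]
    missing_key = None
except KeyError as err:
    missing_key = err.args[0]
```

With this toy data, `strands_match(dinuc_dict, 100, 250, "+")` is true, while `strands_match(dinuc_dict, 100, 999, "+")` is false rather than an exception.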