Closed by jaudall 2 years ago
Are these 5 crashes happening for the same genome, or are multiple genomes involved?
I assume the crash happens while the snps.txt file is being generated. If so, you can check its last line to see which chromosomes are potentially involved, and then try analysing them separately.
As it stands, I do not know what is causing this, so I would need a minimal example that reproduces it before I can investigate.
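A minimal sketch of the "check the last line" suggestion, assuming the partially written SNP file is plain text. The column layout of snps.txt is not described in this thread, so the hypothetical helper `last_nonempty_line` only returns the raw line and leaves reading the chromosome names to you. The exact file name (for example, whether the `--prefix` value is prepended) is also an assumption.

```python
def last_nonempty_line(path):
    """Return the last non-empty line of a text file, or None if the file has none."""
    last = None
    # errors="replace" keeps the scan going even if the file was cut off mid-character
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if line.strip():
                last = line.rstrip("\n")
    return last
```

Calling `last_nonempty_line("snps.txt")` on the output of a failed run should show the last record syri managed to write before the crash.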
Most of them involve the same genome. However, Syri worked ok with that 'buggy genome' when it was used as the query instead of the target.
How do I limit syri to a chromosome? I don't see that as an option.
I turned off snps (--nosnps) and that seemed to 'fix' it for my immediate needs.
How do I limit syri to a chromosome? I don't see that as an option.
You would need to extract those chromosomes into separate files, then realign and analyse them. Since it is happening in only one genome, I would suggest checking the genome FASTA itself. There could be something unusual in the FASTA that gets carried over into the alignments, specifically into the CIGAR string, resulting in this crash.
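Extracting chromosomes into a separate file can be done with any FASTA tool. As one possible sketch using only the Python standard library, the hypothetical helper `extract_records` below copies the records whose ID (the first word after `>`) is in a given set. The file names and chromosome IDs in the usage note are placeholders, not names from this thread.

```python
def extract_records(fasta_in, fasta_out, wanted_ids):
    """Copy FASTA records whose ID is in wanted_ids; return the set of IDs found."""
    wanted = set(wanted_ids)
    found = set()
    keep = False
    with open(fasta_in) as src, open(fasta_out, "w") as dst:
        for line in src:
            if line.startswith(">"):
                # The record ID is the first whitespace-delimited word of the header
                fields = line[1:].split()
                seq_id = fields[0] if fields else ""
                keep = seq_id in wanted
                if keep:
                    found.add(seq_id)
            if keep:
                # Header and sequence lines are copied unchanged
                dst.write(line)
    return found
```

For example, `extract_records("ref.genome.fasta", "ref.chr1.fasta", {"Chr01"})` would write only that record. Comparing the returned set with the requested IDs shows whether any name was misspelled or missing. The extracted files then need to be realigned before running syri on them.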
Hmm, any suggestions on what character to look for? The sequences were all passed through BioPerl SeqIO and should be OK. It seems that if a character were causing a problem when the sequence is used as the target, the same character would also cause a problem when it is used as the query ... but it doesn't. I'm happy to share files ... :-)
any suggestions on what character to look for?
Not really.
Are you using syri v1.5.5? This version also accepts PAF files as input. Could you please try that and see whether you get the same error? If the issue persists, could you please share the genome FASTA files (the problematic genome and one non-problematic genome)?
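Since no specific character is known to be the culprit, one generic option is to report every character in the sequence lines that falls outside the IUPAC nucleotide alphabet. The sketch below is only a sanity check under that assumption. Nothing in this thread confirms that an unexpected character is the cause of the crash, and `scan_fasta` is a hypothetical helper name.

```python
from collections import Counter

# IUPAC nucleotide codes; lower-case (soft-masked) bases are accepted via upper()
ALLOWED = set("ACGTUNRYKMSWBDHV")

def scan_fasta(path):
    """Return {record_id: Counter of unexpected characters} for records that have any."""
    report = {}
    current = None
    # Read bytes and decode as latin-1 so that any stray byte is still reported
    with open(path, "rb") as handle:
        for raw in handle:
            line = raw.decode("latin-1").rstrip("\n")
            if line.startswith(">"):
                fields = line[1:].split()
                current = fields[0] if fields else ""
                report.setdefault(current, Counter())
                continue
            if current is None:
                # Sequence data before any header line
                current = "<no header>"
                report[current] = Counter()
            # Carriage returns, gaps, spaces, '*' and similar all end up in the count
            report[current].update(ch for ch in line if ch.upper() not in ALLOWED)
    return {rid: cnt for rid, cnt in report.items() if cnt}
```

Running this on both the problematic genome and a non-problematic one, and comparing the two reports, may narrow things down. An empty result means only that the alphabet looks clean, not that the FASTA is otherwise fine.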
I've run 90 or so syri jobs to create output for plotsr using 10 different genomes. Of those jobs, 5 consistently fail. I re-ran the alignments, thinking the BAM may have been corrupted, but I still get the same error with syri.
I'm using this command:
syri -c ${i}${j}.aln.bam --dir ${i}${j} -r ${i}.genome.fasta -q ${j}.genome.fasta -F B --prefix ${i}_${j} --all --log DEBUG
and the output isn't much help:
Begin Time: Thu Apr 28 10:19:31 CDT 2022
Traceback (most recent call last):
  File "/project/cotton_genomics/miniconda3_syri/envs/syriX/bin/syri", line 4, in <module>
    __import__('pkg_resources').run_script('syri==1.5.5', 'syri')
  File "/project/cotton_genomics/miniconda3_syri/envs/syriX/lib/python3.9/site-packages/pkg_resources/__init__.py", line 672, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/project/cotton_genomics/miniconda3_syri/envs/syriX/lib/python3.9/site-packages/pkg_resources/__init__.py", line 1472, in run_script
    exec(code, namespace, namespace)
  File "/project/cotton_genomics/miniconda3_syri/envs/syriX/lib/python3.9/site-packages/syri-1.5.5-py3.9-linux-x86_64.egg/EGG-INFO/scripts/syri", line 6, in <module>
    main(sys.argv[1:])
  File "/project/cotton_genomics/miniconda3_syri/envs/syriX/lib/python3.9/site-packages/syri-1.5.5-py3.9-linux-x86_64.egg/syri/scripts/syri.py", line 319, in main
    syri(args)
  File "/project/cotton_genomics/miniconda3_syri/envs/syriX/lib/python3.9/site-packages/syri-1.5.5-py3.9-linux-x86_64.egg/syri/scripts/syri.py", line 246, in syri
    getshv(args, coords, chrlink)
  File "syri/pyxFiles/findshv.pyx", line 257, in syri.findshv.getshv
  File "syri/pyxFiles/findshv.pyx", line 296, in syri.findshv.getshv
IndexError: string index out of range
End Time: Thu Apr 28 10:22:34 CDT 2022