schneebergerlab / syri

Synteny and Rearrangement Identifier
https://schneebergerlab.github.io/syri/
MIT License

IndexError: list index out of range (but no error when swapping the reference and query) #176

Closed cistarsa closed 1 year ago

cistarsa commented 1 year ago

Hello Manish, and thank you for this useful program. I've noticed others have opened issues with this error, but I'm wondering if there's something unique to my case, as the issue goes away when I generate the SAM with the query and reference swapped in minimap2. I'm hoping to track the inheritance of a haplotype by using the F1 as the reference and mapping each parental strain to the progeny. However, I consistently have issues when I map one parental scaffold to the progeny, but not when I map the progeny scaffold to the parent. I've tried reverse complementing, etc., to no avail. Here is the log file, thank you!

# sam generation via minimap2:

minimap2 -ax asm5 --eqx BAR3230_D13.fasta PSC355_D13_reversecomplement.fa >> rcDD13_PSC355_BAR3230.sam
[DD13_PSC355Bar3230_testsyri.log](https://github.com/schneebergerlab/syri/files/10341290/DD13_PSC355Bar3230_testsyri.log)

# syri run:
syri -c rcDD13_PSC355_BAR3230.sam -q PSC355_D13_reversecomplement.fa -r BAR3230_D13.fasta -F B --prefix DD13_PSC355Bar3230_test2

stderr:
Reading Coords - WARNING - Reference chromosome D13 has high fraction of inverted alignments with its homologous chromosome in the query genome (D13). Ensure that same chromosome-strands are being compared in the two genomes, as different strand can result in unexpected errors.
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "syri/pyxFiles/synsearchFunctions.pyx", line 803, in syri.synsearchFunctions.syri
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/syri", line 4, in <module>
    __import__('pkg_resources').run_script('syri==1.6.3', 'syri')
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 656, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 1453, in run_script
    exec(code, namespace, namespace)
  File "/usr/local/lib/python3.10/dist-packages/syri-1.6.3-py3.10-linux-x86_64.egg/EGG-INFO/scripts/syri", line 6, in <module>
    main(sys.argv[1:])
  File "/usr/local/lib/python3.10/dist-packages/syri-1.6.3-py3.10-linux-x86_64.egg/syri/scripts/syri.py", line 326, in main
    syri(args)
  File "/usr/local/lib/python3.10/dist-packages/syri-1.6.3-py3.10-linux-x86_64.egg/syri/scripts/syri.py", line 214, in syri
    startSyri(args, coords[["aStart", "aEnd", "bStart", "bEnd", "aLen", "bLen", "iden", "aDir", "bDir", "aChr", "bChr"]])
  File "syri/pyxFiles/synsearchFunctions.pyx", line 505, in syri.synsearchFunctions.startSyri
  File "syri/pyxFiles/synsearchFunctions.pyx", line 506, in syri.synsearchFunctions.startSyri
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
IndexError: list index out of range

cistarsa commented 1 year ago

DD13_PSC355Bar3230_test2syri.log

mnshgl0110 commented 1 year ago

Hi, yes, this indeed sounds weird. Could you please retry running syri with the -f parameter? If that does not work, could you please share the two chromosomes? Best, Manish

cistarsa commented 1 year ago

Appreciate the quick reply, Manish! I tried the -f flag, but it did not resolve the error, so here are the FASTA files. Ideally, I'd map PSC355 (parental) to BAR3230 (progeny). Thank you!

BAR3230_D13.fasta.zip
PSC355_D13.fasta.zip

mnshgl0110 commented 1 year ago

The chromosomes are from different strands. Reverse complementing one chromosome fixed the issue. Check https://github.com/schneebergerlab/fixchr
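
A quick way to see whether two chromosomes are on opposite strands is to check how much of the alignment is on the reverse strand in the SAM file itself. The sketch below is only an illustration, not syri's own check: it sums the reference bases covered by each alignment (from the CIGAR string) and reports the share that carries the reverse-strand flag (0x10). The function names are made up for this example, and no particular cutoff is implied.

```python
import re
from collections import defaultdict

# CIGAR operations as defined in the SAM specification.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_ref_length(cigar):
    """Reference bases covered by an alignment (M, =, X, D, N operations)."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "M=XDN")


def inverted_fraction(sam_lines):
    """Per reference sequence: fraction of aligned bases on the reverse strand."""
    total = defaultdict(int)
    reverse = defaultdict(int)
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        flag, ref, cigar = int(fields[1]), fields[2], fields[5]
        if flag & 4 or cigar == "*":      # unmapped or no CIGAR
            continue
        n = aligned_ref_length(cigar)
        total[ref] += n
        if flag & 16:                     # 0x10: aligned to the reverse strand
            reverse[ref] += n
    return {ref: reverse[ref] / total[ref] for ref in total if total[ref]}


# Usage (file name taken from the thread):
# with open("rcDD13_PSC355_BAR3230.sam") as fh:
#     print(inverted_fraction(fh))
```

A value close to 1 for a chromosome suggests that the query sequence is the reverse complement of the reference, which matches the warning printed by syri above.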

cistarsa commented 1 year ago

Awesome, thank you! I had tried reverse complementing PSC355 via bash:

cat PSC355_D13_singleline.fa | while read L; do  echo $L; read L; echo "$L" | rev | tr "ATGC" "TACG" ; done >> PSC355_D13_reversecomplement.fa

Should I do this another way? Does fixchr produce reverse-complemented chromosomes? Thanks again!

mnshgl0110 commented 1 year ago

Yes, fixchr should reverse complement the chromosomes.
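
For a manual check alongside fixchr, here is a minimal Python sketch of reverse complementing a FASTA file. It is not how fixchr works internally; the function names are made up for this example. Two things about the bash one-liner above are worth noting: `tr "ATGC" "TACG"` only complements uppercase bases, so lowercase (soft-masked) bases would be reversed but not complemented, and `>>` appends, so rerunning the command would add a second copy of the sequence to the output file. Whether either of these played a role here is not clear from the thread.

```python
# Complement table covering upper- and lower-case bases plus IUPAC ambiguity
# codes. Characters not listed (for example N, S, W) are left unchanged.
_COMPLEMENT = str.maketrans(
    "ACGTRYKMBDHVacgtrykmbdhv",
    "TGCAYRMKVHDBtgcayrmkvhdb",
)


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(lines):
    """Yield (header, sequence) pairs; sequences may span several lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def reverse_complement_fasta(lines, width=60):
    """Return FASTA text with every record reverse complemented."""
    out = []
    for header, seq in read_fasta(lines):
        rc = reverse_complement(seq)
        out.append(">" + header)
        out.extend(rc[i:i + width] for i in range(0, len(rc), width))
    return "\n".join(out) + "\n"


# Usage (file names taken from the thread; "w" overwrites instead of appending):
# with open("PSC355_D13.fasta") as src, open("PSC355_D13_reversecomplement.fa", "w") as dst:
#     dst.write(reverse_complement_fasta(src))
```

Because the reader joins wrapped sequence lines before reversing, the input does not need to be converted to a single-line FASTA first.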

cistarsa commented 1 year ago

Thank you! When I attempted WGS x WGS I ran into an issue, but I can run each chromosome independently. Closing for now.