schneebergerlab / syri

Synteny and Rearrangement Identifier
https://schneebergerlab.github.io/syri/
MIT License

generating table output error #178

Closed RCLynch414 closed 1 year ago

RCLynch414 commented 1 year ago

Hi, I'm looking for help with a minimap2 > syri run. I tested the example data set in pipeline.sh and it works correctly on my machine. When I try my target dataset though, syri fails at the generating Table output step.

I've already checked the common issues: the ref/qry chromosome names match, and the chromosome orientations are correct, so there isn't an excess of inversions.
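For anyone repeating the chromosome-name check, here is a minimal sketch of one way to do it. It assumes plain FASTA files and compares only the first whitespace-delimited token of each header line. The helper names `fasta_ids` and `ids_match` are hypothetical and are not part of syri or minimap2.

```shell
# Print the sequence IDs of a FASTA file, one per line, sorted.
# Only the first token of each ">" header line is kept.
fasta_ids() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" | sort
}

# Succeed (exit status 0) only if both FASTA files carry the same set of IDs.
ids_match() {
    [ "$(fasta_ids "$1")" = "$(fasta_ids "$2")" ]
}

# Example with the files from this post:
# ids_match EH23a.unmasked_chr4SM.fasta KCDv1a.unmasked_chr4SM.fasta && echo "IDs match"
```

This only confirms that the names agree; it says nothing about orientation, which still has to be checked separately.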

Commands:

minimap2 -ax asm5 --eqx EH23a.unmasked_chr4SM.fasta KCDv1a.unmasked_chr4SM.fasta > out.sam
syri -c out.sam -r EH23a.unmasked_chr4SM.fasta -q KCDv1a.unmasked_chr4SM.fasta -k -F S

terminal output:

ctxOut.txt is empty. Skipping analysing it.
ctxOut.txt is empty. Skipping analysing it.
ctxOut.txt is empty. Skipping analysing it.
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/syri_env/bin/syri", line 6, in <module>
    main(sys.argv[1:])
  File "/home/ubuntu/miniconda3/envs/syri_env/lib/python3.8/site-packages/syri/scripts/syri.py", line 326, in main
    syri(args)
  File "/home/ubuntu/miniconda3/envs/syri_env/lib/python3.8/site-packages/syri/scripts/syri.py", line 266, in syri
    getTSV(args.dir, args.prefix, args.ref.name, args.hdrseq, args.maxs)
  File "syri/pyxFiles/writeout.pyx", line 564, in syri.writeout.getTSV
  File "syri/pyxFiles/writeout.pyx", line 568, in syri.writeout.getTSV
KeyError: 'chr4'

syri.log:

2023-02-15 17:19:02,909 - Reading Coords - INFO - syri:135 - Reading input from SAM file
2023-02-15 17:19:03,076 - syri - INFO - syri:214 - starting
2023-02-15 17:19:03,076 - syri - INFO - syri:214 - Analysing chromosomes: ['chr4']
2023-02-15 17:19:03,091 - syri.chr4 - INFO - mapstar:48 - chr4 (117, 11)
2023-02-15 17:19:03,092 - syri.chr4 - INFO - mapstar:48 - Identifying Synteny for chromosome chr4
2023-02-15 17:19:03,154 - syri.chr4 - INFO - mapstar:48 - Identifying Inversions for chromosome chr4
2023-02-15 17:19:03,243 - syri.chr4 - INFO - mapstar:48 - Identifying translocation and duplication for chromosome chr4
2023-02-15 17:19:03,672 - Brute-force TD identification - INFO - mapstar:48 - Cluster is too big for Brute Force, using rand
Time taken for last iteration 0.00020694732666015625. iterations remaining 37
2023-02-15 17:19:03,862 - getCTX - INFO - syri:214 - Identifying cross-chromosomal translocation and duplication for chromos
2023-02-15 17:19:03,975 - local_variation - INFO - syri:225 - Finding SVs in synOut.txt, invOut.txt, TLOut.txt, invTLOut.txt
2023-02-15 17:19:04,273 - local_variation - INFO - syri:245 - Finding SNPs and small indels
2023-02-15 17:19:04,552 - local_variation - INFO - syri:257 - Combining outputs
2023-02-15 17:19:04,553 - local_variation - INFO - syri:262 - Generating table output

mnshgl0110 commented 1 year ago

Hi @RCLynch414, could you please rerun syri with the --log DEBUG parameter and then share the new log file? Also, please delete any intermediate syri output files from the working directory. Ideally, run in a new folder containing only the genome FASTA files and the alignment file.

RCLynch414 commented 1 year ago

Ah, thank you. Moving the inputs into a clean directory and executing there solved the issue.
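For later readers, the fix that worked here (running syri in a directory holding only the inputs) might be scripted roughly as follows. This is a sketch under assumptions, not a recommended procedure from the syri authors: the helper `stage_clean_run` and the directory name `syri_clean` are hypothetical, and the syri flags are the ones already used in this thread.

```shell
# Copy only the listed input files into a brand-new directory, so that
# no intermediate output from an earlier syri run can be picked up.
stage_clean_run() {
    run_dir="$1"
    shift
    # Refuse to reuse an existing path: the point is a clean start.
    if [ -e "$run_dir" ]; then
        echo "refusing to reuse existing path: $run_dir" >&2
        return 1
    fi
    mkdir -p "$run_dir" || return 1
    for f in "$@"; do
        cp -- "$f" "$run_dir/" || return 1
    done
}

# Example with the files from this thread (directory name is hypothetical):
# stage_clean_run syri_clean out.sam EH23a.unmasked_chr4SM.fasta KCDv1a.unmasked_chr4SM.fasta
# cd syri_clean
# syri -c out.sam -r EH23a.unmasked_chr4SM.fasta -q KCDv1a.unmasked_chr4SM.fasta -k -F S --log DEBUG
```

Copying rather than moving keeps the original directory untouched, which helps if the old intermediate files need to be inspected afterwards.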