mikolmogorov / Ragout

Chromosome-level scaffolding using multiple references

SibeliaZ .maf input leads to unidentified error in Ragout #68

Closed rstewa03 closed 4 years ago

rstewa03 commented 4 years ago

I am encountering an error while running Ragout with a .maf file that I created with SibeliaZ. Before running SibeliaZ, I modified the headers of both my target FASTA and reference FASTA to follow the format genome.scaffold, e.g.:

PmacPolish.Sc0000000_pilon
PmacPolish.Sc0000001_pilon
PmacPolish.Sc0000002_pilon
PmacPolish.Sc0000003_pilon
PmacPolish.Sc0000004_pilon
PmacPolish.Sc0000005_pilon
PmacPolish.Sc0000006_pilon
PmacPolish.Sc0000007_pilon
PmacPolish.Sc0000008_pilon
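For anyone reproducing this setup, the header renaming described above can be sketched roughly as follows. This is only an illustration, not part of Ragout or SibeliaZ: the function name `prefix_fasta_headers` is hypothetical, and it assumes that only the first whitespace-delimited token of each header should be kept as the scaffold ID.

```python
def prefix_fasta_headers(lines, genome):
    """Yield FASTA lines with each header rewritten as '>genome.scaffold'.

    Hypothetical helper: assumes the scaffold ID is the first
    whitespace-delimited token after '>'.
    """
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if not tokens:
                raise ValueError("FASTA header with no sequence ID")
            yield f">{genome}.{tokens[0]}\n"
        else:
            # Sequence lines pass through unchanged.
            yield line


example = [
    ">Sc0000000_pilon length=100\n",
    "ACGT\n",
    ">Sc0000001_pilon\n",
    "GGCC\n",
]
renamed = list(prefix_fasta_headers(example, "PmacPolish"))
print("".join(renamed), end="")
```

The same generator can be fed an open file handle instead of a list, writing the result to a new FASTA file rather than modifying the original in place.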

The error occurs during the 'Detecting chimeric adjacencies' stage, specifically when checking block size 100.

[15:42:49] INFO: Detecting chimeric adjacencies
Traceback (most recent call last):
  File "/data/programs/Ragout_v2.3/bin/ragout", line 32, in <module>
    sys.exit(main())
  File "/data/programs/Ragout_v2.3/ragout/main.py", line 295, in main
    _run_ragout(args)
  File "/data/programs/Ragout_v2.3/ragout/main.py", line 200, in _run_ragout
    chim_detect = ChimeraDetector(raw_bp_graphs, run_stages, target_sequences)
  File "/data/programs/Ragout_v2.3/ragout/breakpoint_graph/chimera_detector.py", line 28, in __init__
    self._make_hierarchical_breaks()
  File "/data/programs/Ragout_v2.3/ragout/breakpoint_graph/chimera_detector.py", line 64, in _make_hierarchical_breaks
    break_pos = self._optimal_break(seq_name, *adjusted_break)
  File "/data/programs/Ragout_v2.3/ragout/breakpoint_graph/chimera_detector.py", line 72, in _optimal_break
    seq = self.target_seqs[seq_name]
KeyError: 'Sc0000008_pilon'

What is causing the error and how might it be resolved?

mikolmogorov commented 4 years ago

Hi,

Could you please post ragout.log and the recipe files?

Mikhail

rstewa03 commented 4 years ago

Thanks so much for your quick response. The files are attached: PmacRagout.zip

mikolmogorov commented 4 years ago

I am assuming the issue has been resolved by removing the dot symbol from the reference genome name (based on your response in #67)? Please continue the discussion in #67 if you have any follow-ups.
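As a rough illustration of why a dot inside a genome name can cause trouble: MAF `s` lines name each sequence in their second field, conventionally as `genome.sequence`, so a parser that splits on the first dot will misread a genome name that itself contains a dot. The sketch below is not Ragout's code. The function names and the example genome name `Pmac.v2` are hypothetical, and splitting on the first dot is an assumption made for illustration.

```python
def maf_sources(lines):
    """Return the source names found on the 's' lines of a MAF file."""
    sources = []
    for line in lines:
        fields = line.split()
        # MAF sequence lines: s src start size strand srcSize text
        if len(fields) >= 7 and fields[0] == "s":
            sources.append(fields[1])
    return sources


def split_source(src):
    """Split 'genome.sequence' on the first dot (assumed convention)."""
    genome, _, seq = src.partition(".")
    return genome, seq


maf = [
    "a score=0\n",
    "s PmacPolish.Sc0000008_pilon 0 4 + 100 ACGT\n",
    "s Pmac.v2.chr1 10 4 + 500 ACGT\n",  # hypothetical genome name with a dot
]

for src in maf_sources(maf):
    print(src, "->", split_source(src))
# PmacPolish.Sc0000008_pilon -> ('PmacPolish', 'Sc0000008_pilon')
# Pmac.v2.chr1 -> ('Pmac', 'v2.chr1')
```

In the second case the genome would be read as `Pmac` rather than `Pmac.v2`, which is the kind of mismatch that a dot-free genome name avoids.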