INFO: Reading contigs file : KeyError: 'chr1'

sklages commented 8 years ago

Hi,

Running ragout --outdir ragout_1.MAF.out --synteny maf --no-refine $(pwd)/athCun_vs_galGal.maf.rcp failed with an error very late in the run:

[14:34:14] INFO: Starting Ragout v1.2
[14:34:14] WARNING: Maf support is deprecated and will be removed in future releases. Use hal istead.
[14:34:14] INFO: Converting MAF to synteny
[14:34:14] INFO: Running maf2synteny module
[14:34:30] INFO: Phylogeny is taken from the recipe
[14:34:30] INFO: 'athCun' is chosen as a naming reference
[14:34:30] INFO: Processing permutation files
[14:38:10] INFO: Stage "10000"
[14:38:11] INFO: Inferring missing adjacencies
[14:38:12] INFO: Building scaffolds
[14:38:12] INFO: Stage "500"
[14:41:19] INFO: Inferring missing adjacencies
[14:41:26] INFO: Building scaffolds
[14:44:37] INFO: Merging two iterations
[14:45:23] INFO: Stage "100"
[16:36:06] INFO: Inferring missing adjacencies
[16:36:32] INFO: Building scaffolds
[18:28:02] INFO: Merging two iterations
[18:45:05] INFO: Stage "refine"
[20:38:10] INFO: Inferring missing adjacencies
[20:38:34] INFO: Building scaffolds
[22:30:11] INFO: Merging two iterations
[22:31:08] INFO: Reading contigs file
Traceback (most recent call last):
  File "/package/sequencer/ragout/current/ragout.py", line 32, in <module>
    sys.exit(main())
  File "/package/sequencer/ragout/Ragout-1.2/ragout/main.py", line 268, in main
    return run_ragout(args)
  File "/package/sequencer/ragout/Ragout-1.2/ragout/main.py", line 95, in run_ragout
    run_unsafe(args)
  File "/package/sequencer/ragout/Ragout-1.2/ragout/main.py", line 232, in run_unsafe
    out_gen.make_output(args.out_dir, recipe["target"])
  File "/package/sequencer/ragout/Ragout-1.2/ragout/scaffolder/output_generator.py", line 39, in make_output
    self._fix_gaps()
  File "/package/sequencer/ragout/Ragout-1.2/ragout/scaffolder/output_generator.py", line 82, in _fix_gaps
    left_ns, right_ns = count_ns(cnt_1, cnt_2)
  File "/package/sequencer/ragout/Ragout-1.2/ragout/scaffolder/output_generator.py", line 63, in count_ns
    seq_1, seq_2 = get_seq(cnt_1), get_seq(cnt_2)
  File "/package/sequencer/ragout/Ragout-1.2/ragout/scaffolder/output_generator.py", line 57, in get_seq
    cont_seq = self.fragments_fasta[seq_name][seg_start:seg_end]
KeyError: 'chr1'

Any idea what is going wrong here?

best, Sven

mikolmogorov commented 8 years ago

Hi,

It is likely that the chromosome naming in the MAF file is inconsistent with the names in the FASTA files. For example, if "genome.fasta" contains sequences "seq1", "seq2", "seq3", they should correspond to "genome_name.seq1", "genome_name.seq2", "genome_name.seq3" in the MAF file, where "genome_name" is the name you used in the recipe file to reference the corresponding genome.
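A quick way to check this before starting a long run is to compare the names directly. The following is only a minimal sketch, not part of Ragout: the function names are hypothetical, and it assumes standard MAF "s" lines (whose second field is the source name) and FASTA IDs taken as the first word after ">".

```python
import io


def fasta_names(handle):
    """Return the set of sequence IDs (first word after '>') in a FASTA stream."""
    return {
        line[1:].split()[0]
        for line in handle
        if line.startswith(">") and line[1:].strip()
    }


def maf_sources(handle):
    """Return the set of source names found on the 's' lines of a MAF stream."""
    sources = set()
    for line in handle:
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "s":
            sources.add(fields[1])
    return sources


def find_mismatches(sources, genome_name, names):
    """List MAF sources of the form 'genome_name.X' where X is not a FASTA ID."""
    prefix = genome_name + "."
    return [
        src
        for src in sorted(sources)
        if src.startswith(prefix) and src[len(prefix):] not in names
    ]


# Toy data for illustration only; real input would come from open(...) handles.
fasta = io.StringIO(">seq1 some description\nACGT\n>seq2\nGGCC\n")
maf = io.StringIO(
    "##maf version=1\n"
    "a score=10\n"
    "s genome_name.seq1 0 4 + 4 ACGT\n"
    "s genome_name.chr1 0 4 + 4 GGCC\n"
    "s other.chrA 0 4 + 4 ACGT\n"
)

names = fasta_names(fasta)
sources = maf_sources(maf)
print(find_mismatches(sources, "genome_name", names))
```

Any name this reports is a candidate for the kind of KeyError shown in the traceback above. Sources that lack the "genome_name." prefix entirely are not listed by this sketch and would need a separate check.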

This should fix your problem. However, MAF input is currently deprecated; we are about to switch to the HAL format completely. If you used progressiveCactus for the alignment, then you should already have a HAL file (it is also easier to use).

Importantly, other alignment tools will likely not work well for synteny block recovery (at the very least, the results would be unreliable), since the alignment produced by progressiveCactus has some special properties that others do not have (the alignment is non-overlapping).
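If you want a rough idea of whether a MAF file from another aligner has this property, you can look for overlapping blocks. The sketch below is not part of Ragout and rests on an assumption: it reads "non-overlapping" as "no position of a given genome is covered by more than one alignment block". The function names are hypothetical. It relies on the standard MAF convention that minus-strand starts are given relative to the reverse-complemented source.

```python
def forward_interval(start, size, strand, src_size):
    """Convert a MAF (start, size, strand) triple to a forward-strand [start, end)."""
    if strand == "-":
        # Minus-strand starts are counted from the end of the source sequence.
        start = src_size - start - size
    return start, start + size


def overlapping_pairs(maf_lines, genome_name):
    """Return (source, interval, interval) for neighbouring blocks that overlap."""
    prefix = genome_name + "."
    intervals = {}
    for line in maf_lines:
        f = line.split()
        if len(f) >= 6 and f[0] == "s" and f[1].startswith(prefix):
            iv = forward_interval(int(f[2]), int(f[3]), f[4], int(f[5]))
            intervals.setdefault(f[1], []).append(iv)

    clashes = []
    for src, ivs in intervals.items():
        ivs.sort()
        # After sorting by start, any overlap shows up between neighbours.
        for (s1, e1), (s2, e2) in zip(ivs, ivs[1:]):
            if s2 < e1:
                clashes.append((src, (s1, e1), (s2, e2)))
    return clashes


# Toy lines for illustration only.
demo = [
    "s g.chr1 0 10 + 100 ACGTACGTAC",
    "s g.chr1 5 10 + 100 ACGTACGTAC",
    "s g.chr1 50 10 - 100 ACGTACGTAC",
    "s g.chr2 0 10 + 100 ACGTACGTAC",
]
print(overlapping_pairs(demo, "g"))
```

An empty result only means no overlaps were found for that genome under this reading; it does not show that the alignment is suitable for synteny block recovery.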

sklages commented 8 years ago

Yeah. That's it. Didn't read the naming requirements correctly ... As for progressiveCactus .. I read about the very long runtimes ..!?

kspham commented 8 years ago

It, of course, depends on the number and size of the genomes. But it's good, and it finishes in a reasonable time unless you want to align dozens or hundreds of mammalian genomes. Son.

sklages commented 8 years ago

I'll try. thanks.

mikolmogorov / Ragout

INFO: Reading contigs file : KeyError: 'chr1' #10