brinkmanlab / IslandCompare

Pipeline for detecting and annotating genomic islands and relationships between the respective genomes

Mauve Contig Mover infinite loop #198

Closed innovate-invent closed 3 years ago

innovate-invent commented 4 years ago

When running this dataset:

DEFINITION  Enterococcus faecium strain strain.
ACCESSION   VRE-0027
VERSION     VRE-0027

on reference NZ_LR135364.1, Mauve Contig Mover runs indefinitely.

Changing to a different reference strain (NZ_LR135254_1) succeeded normally.

@klgray25 where did you source this data?

klgray25 commented 4 years ago

The draft was assembled with Unicycler and annotated with Prokka.

innovate-invent commented 4 years ago

Where was the draft sourced? I need to document how to reproduce this issue.

innovate-invent commented 3 years ago

This issue was also encountered with ERR388703.gbk and the NZ_LN999987.1 reference. Looking at the NZ_LN999987.1 FASTA, it appears that plasmids are included at the end. Removing the plasmids resolves the lockup.

grep '^>' Bacteria/Enterococcus_faecium/GCF_900044005.1_ASM90004400v1/GCF_900044005.1_ASM90004400v1_genomic.fna
>NZ_LN999987.1 Enterococcus faecium isolate EFE11651 chromosome I, complete sequence
>NZ_LN999988.1 Enterococcus faecium isolate EFE11651 plasmid II, complete sequence
>NZ_LN999989.1 Enterococcus faecium isolate EFE11651 plasmid III, complete sequence
>NZ_LN999990.1 Enterococcus faecium isolate EFE11651 chromosome IV, complete sequence
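
The workaround of removing plasmid records can be sketched in Python as below. This is only a sketch, not the fix that was eventually applied: the function name `drop_plasmid_records` is hypothetical, the sequence lines are placeholders rather than real data, and it assumes a plasmid record can be recognised by the word "plasmid" in its header, as in the listing above.

```python
from typing import Iterable, Iterator


def drop_plasmid_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTA lines, skipping every record whose header mentions 'plasmid'."""
    keep = True
    for line in lines:
        if line.startswith(">"):
            # A header line starts a new record; decide whether to keep it.
            keep = "plasmid" not in line.lower()
        if keep:
            yield line


# Headers copied from the grep output above; sequences are placeholders.
example = [
    ">NZ_LN999987.1 Enterococcus faecium isolate EFE11651 chromosome I, complete sequence\n",
    "ACGT\n",
    ">NZ_LN999988.1 Enterococcus faecium isolate EFE11651 plasmid II, complete sequence\n",
    "TTGA\n",
    ">NZ_LN999990.1 Enterococcus faecium isolate EFE11651 chromosome IV, complete sequence\n",
    "GGCC\n",
]
filtered = list(drop_plasmid_records(example))
```

To apply it to a real file, pass an open file handle as `lines` and write the yielded lines to a new FASTA.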

innovate-invent commented 3 years ago

NZ_LR135364.1 appears to contain plasmids in its reference FASTA as well. I wonder if MCM fails when a contig maps into a plasmid?

grep '^>' Bacteria/Enterococcus_faecium/GCF_900639565.1_E8195_hybrid_assembly/GCF_900639565.1_E8195_hybrid_assembly_genomic.fna
>NZ_LR135364.1 Enterococcus faecium isolate E8195 chromosome 1
>NZ_LR135365.1 Enterococcus faecium isolate E8195 plasmid 2
>NZ_LR135366.1 Enterococcus faecium isolate E8195 plasmid 3
>NZ_LR135367.1 Enterococcus faecium isolate E8195 plasmid 4
>NZ_LR135368.1 Enterococcus faecium isolate E8195 plasmid 5
>NZ_LR135369.1 Enterococcus faecium isolate E8195 plasmid 6
>NZ_LR135370.1 Enterococcus faecium isolate E8195 plasmid 7
>NZ_LR135371.1 Enterococcus faecium isolate E8195 plasmid 8

innovate-invent commented 3 years ago

The wrong file was being referenced from MicrobeDB. https://github.com/brinkmanlab/galaxy-tools/commit/1971552f47a8fa16db60c2fb064b65673495e955 fixes this.

innovate-invent commented 3 years ago

Commit 53cb5ef034c39fe9d6701448163524e8d6a97752 adds logic to kill MCM if it runs too long.
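
For illustration, one general way to enforce such a time limit is sketched below in Python. This is not the logic in that commit: the helper name `run_with_timeout` is hypothetical, and the command to run and the timeout value are left to the caller.

```python
import subprocess
import sys
from typing import Optional, Sequence


def run_with_timeout(cmd: Sequence[str], timeout_s: float) -> Optional[int]:
    """Run cmd and return its exit code, or None if it exceeded timeout_s.

    subprocess.run kills the child process and waits for it before
    raising TimeoutExpired, so no stray process is left behind.
    """
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None


# Stand-in commands using the current Python interpreter, not MCM itself.
quick = run_with_timeout([sys.executable, "-c", "pass"], 30)
stuck = run_with_timeout([sys.executable, "-c", "import time; time.sleep(60)"], 1)
```

Note that only the direct child is killed; if the tool spawns its own subprocesses, those would need separate handling (for example, a process group).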