snayfach / MIDAS

An integrated pipeline for estimating strain-level genomic variation from metagenomic data
http://dx.doi.org/10.1101/gr.201863.115
GNU General Public License v3.0

Corrupted genome.features files for certain species #70

Open snayfach opened 7 years ago

snayfach commented 7 years ago

Some genomes downloaded from PATRIC had genes with incorrect genome coordinates. One example is gene id 1313.4609.peg.100 in genome id 1313.4609, which belongs to species id Streptococcus_pneumoniae_58285.

This genome was identified and removed on the PATRIC site (it now exists as genome id 1313.10646).

The MIDAS database should be updated to get the fixed genome files. Additionally, all CDS coordinates in genome.features files should be validated before inclusion in the MIDAS db.
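A minimal sketch of what such a validation step could look like, not the actual MIDAS build code. It assumes genome.features is a tab-delimited file with a header row containing `gene_id`, `scaffold_id`, `start`, `end`, and `gene_type` columns, and that scaffold lengths are available from the genome FASTA. The function name `find_bad_cds` and the specific checks are hypothetical; this thread does not say in what way the coordinates were wrong, so the checks below only cover basic sanity (1-based, start not after end, end within the scaffold).

```python
import csv
import io


def find_bad_cds(features_tsv, scaffold_lengths):
    """Return (gene_id, reason) pairs for CDS rows with implausible coordinates.

    features_tsv: open text handle on a tab-delimited features file
                  (column names are an assumption, see above).
    scaffold_lengths: dict mapping scaffold_id -> sequence length in bp.
    """
    problems = []
    for row in csv.DictReader(features_tsv, delimiter="\t"):
        if row["gene_type"] != "CDS":
            continue  # only CDS coordinates are checked here
        gene_id, scaffold = row["gene_id"], row["scaffold_id"]
        try:
            start, end = int(row["start"]), int(row["end"])
        except ValueError:
            problems.append((gene_id, "non-integer coordinate"))
            continue
        if scaffold not in scaffold_lengths:
            problems.append((gene_id, "unknown scaffold"))
        elif start < 1 or end < start:
            problems.append((gene_id, "start/end out of order or below 1"))
        elif end > scaffold_lengths[scaffold]:
            problems.append((gene_id, "end beyond scaffold length"))
    return problems


# Made-up records for illustration only (not taken from PATRIC).
example = io.StringIO(
    "gene_id\tscaffold_id\tstart\tend\tstrand\tgene_type\n"
    "g1\ts1\t1\t300\t+\tCDS\n"
    "g2\ts1\t900\t1200\t-\tCDS\n"
    "g3\ts1\t500\t400\t+\tCDS\n"
)
print(find_bad_cds(example, {"s1": 1000}))
```

A genome for which this returns a non-empty list could be held back from the database build until its files are re-downloaded or corrected.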

snayfach commented 7 years ago

Here is the full list of species where this problem occurs: Bacillus_cereus_57918, Streptococcus_pneumoniae_57684, Streptococcus_pneumoniae_58285, Streptococcus_pneumoniae_60081, Streptococcus_pneumoniae_60082, Streptococcus_pneumoniae_60083, Streptomyces_scabiei_62107