mibig-secmet / mibig-json

Repository to track changes in MIBiG curation data stored in JSON format

Error with BGC0002687 #269

Open Sam-Will opened 1 year ago

Sam-Will commented 1 year ago

BGC0002687

This entry is missing a central gene that is identified in the paper but absent from the MIBiG entry. The gene also doesn't appear to be recognised when the assembled genome is run through antiSMASH.

Could we have this gene added in?

Reference: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6604713/

[Attached: Screenshot 2023-01-23 at 20 50 04, Screenshot 2023-01-23 at 20 52 22]

kblin commented 1 year ago

When the gene cluster was submitted, the wrong coordinates were provided. Additionally, the genome annotation lists RS02190 and RS02195 as pseudogenes, which antiSMASH ignores. This looks like a bug in the version of PGAP that was used to annotate the record: PGAP 2.10 frequently got confused by NRPS and PKS genes. The current annotation in RefSeq seems to have the genes.

That said, because NCBI keeps re-annotating RefSeq, which breaks locus tags and protein IDs, we generally try to avoid using RefSeq entries. I'll add the missing genes back into the MIBiG entry.
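
For anyone who wants to check whether genes in their own record are flagged as pseudogenes, here is a minimal sketch using only the Python standard library. It is not how antiSMASH handles this internally; it only scans the feature table of GenBank-formatted text for the `/pseudo` and `/pseudogene=` qualifiers. The function name `pseudo_locus_tags`, the sample record, and the `DEMO_RS...` locus tags are made up for illustration and do not come from the actual record.

```python
import re

# A feature line has the key starting in column 6; qualifier lines are indented further.
FEATURE_START = re.compile(r"^ {5}(\S+)\s+\S")
LOCUS_TAG = re.compile(r'^\s+/locus_tag="([^"]+)"')
PSEUDO = re.compile(r"^\s+/(pseudo\s*$|pseudogene=)")


def pseudo_locus_tags(genbank_text):
    """Return locus tags of features carrying /pseudo or /pseudogene (hypothetical helper)."""
    features = []  # one dict per feature: {"tag": str or None, "pseudo": bool}
    in_features = False
    for line in genbank_text.splitlines():
        if line.startswith("FEATURES"):
            in_features = True
            continue
        if line.startswith("ORIGIN") or line.startswith("//"):
            in_features = False
            continue
        if not in_features:
            continue
        if FEATURE_START.match(line):
            features.append({"tag": None, "pseudo": False})
            continue
        if not features:
            continue
        tag_match = LOCUS_TAG.match(line)
        if tag_match:
            features[-1]["tag"] = tag_match.group(1)
        elif PSEUDO.match(line):
            features[-1]["pseudo"] = True

    # Keep first-seen order; gene and CDS features usually share a locus tag.
    tags = []
    for feature in features:
        if feature["pseudo"] and feature["tag"] and feature["tag"] not in tags:
            tags.append(feature["tag"])
    return tags


# Invented sample record, for illustration only.
SAMPLE = """\
LOCUS       DEMO_RECORD             3000 bp    DNA     linear   BCT 01-JAN-2000
FEATURES             Location/Qualifiers
     gene            1..900
                     /locus_tag="DEMO_RS00010"
     CDS             1..900
                     /locus_tag="DEMO_RS00010"
                     /product="hypothetical protein"
     gene            1000..2500
                     /locus_tag="DEMO_RS00015"
                     /pseudo
     CDS             1000..2500
                     /locus_tag="DEMO_RS00015"
                     /pseudo
                     /note="frameshifted"
     gene            complement(2600..2990)
                     /locus_tag="DEMO_RS00020"
                     /pseudogene="unknown"
ORIGIN
//
"""

if __name__ == "__main__":
    print(pseudo_locus_tags(SAMPLE))
```

If the locus tags of the genes you expect show up in this list, the annotation marks them as pseudogenes, which is worth ruling out before assuming the cluster coordinates are the only problem.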