mibig-secmet / mibig-json

Repository to track changes in MIBiG curation data stored in JSON format

Error with BGC0002701 #263

Open drboothtj opened 1 year ago

drboothtj commented 1 year ago

Affected BGC: BGC0002701

Describe the error: Contains a duplicate CDS with protein_id = WP_187144229.1

zreitz commented 9 months ago

Those CDSs come from RefSeq. Probably a true duplicated gene?

kblin commented 9 months ago

Protein IDs are not guaranteed to be unique. That guarantee went away many years ago, when NCBI introduced the non-redundant protein set: https://www.ncbi.nlm.nih.gov/refseq/about/nonredundantproteins/

Do we have any evidence that one of these CDSes isn't real?
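
One way to look for such evidence is to compare where the repeated CDSs sit. If two CDSs share a protein ID but have different coordinates, they are more likely two real copies of the gene. If they share the exact same location, the entry itself probably contains a duplicated record. The sketch below is only an illustration: the field names (`protein_id`, `start`, `end`, `strand`), the function name, the placeholder IDs, and the coordinates are all assumptions made for this example, not the MIBiG JSON schema and not values taken from BGC0002701.

```python
from collections import Counter, defaultdict


def classify_repeated_protein_ids(cds_features):
    """Classify every protein_id that occurs more than once.

    Returns a dict mapping protein_id to:
      "distinct_loci"     - every occurrence has a different location
      "repeated_location" - at least two occurrences share one location
    The input format (dicts with protein_id/start/end/strand) is hypothetical.
    """
    occurrences = Counter()
    locations = defaultdict(set)
    for cds in cds_features:
        pid = cds["protein_id"]
        occurrences[pid] += 1
        locations[pid].add((cds["start"], cds["end"], cds["strand"]))

    result = {}
    for pid, count in occurrences.items():
        if count < 2:
            continue  # unique protein_id, nothing to classify
        if len(locations[pid]) == count:
            result[pid] = "distinct_loci"
        else:
            result[pid] = "repeated_location"
    return result


# Made-up records for illustration only.
example = [
    {"protein_id": "PROT_A", "start": 100, "end": 400, "strand": 1},
    {"protein_id": "PROT_A", "start": 900, "end": 1200, "strand": 1},
    {"protein_id": "PROT_B", "start": 1500, "end": 1800, "strand": -1},
    {"protein_id": "PROT_B", "start": 1500, "end": 1800, "strand": -1},
    {"protein_id": "PROT_C", "start": 2000, "end": 2300, "strand": 1},
]

report = classify_repeated_protein_ids(example)
print(report)
```

A "distinct_loci" result would be consistent with a true gene duplication, so a uniqueness check keyed on protein ID alone would flag it incorrectly. A "repeated_location" result points at a likely curation error.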