bioperl / bioperl-live

Core BioPerl 1.x code
http://bioperl.org

Genetic code 15 needs to be supported #389

igortru closed 4 months ago

igortru commented 4 months ago

BioPerl 1.7.8

I found the file that defines the supported genetic codes:

/usr/local/prokka/1.14.5/lib/perl5/lib/perl5/Bio/Tools/CodonTable.pm

It looks like GC15 is simply empty in the current release.

https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi#SG15

As a result, prokka fails to correctly process genomes from https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=3068822

 '', '',
 'Echinoderm and Flatworm Mitochondrial', # 9
 'Euplotid Nuclear', # 10
 'Bacterial, Archaeal and Plant Plastid', # 11
 'Alternative Yeast Nuclear', # 12
 'Ascidian Mitochondrial', # 13
 'Alternative Flatworm Mitochondrial', # 14
 '',
 'Chlorophycean Mitochondrial', # 16
 '', '', '', '',
 'Trematode Mitochondrial', # 21
 'Scenedesmus obliquus Mitochondrial', # 22

   FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSGGVVVVAAAADDEEGGGG
   FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNNKSSSSVVVVAAAADDEEGGGG
   ''
   FFLLSSSSYY*LCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
   '' '' '' ''
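
For readers unfamiliar with the layout of these strings: each populated entry is 64 characters long, one amino acid per codon, with the codons ordered T, C, A, G at each of the three positions (TTT, TTC, TTA, TTG, TCT, ...). The following is only a minimal Perl sketch of that lookup, not the code in Bio::Tools::CodonTable. The helper names `codon_index` and `translate_codon` are hypothetical, and `%tables` holds just the three strings quoted above plus the empty slot for 15.

```perl
use strict;
use warnings;

# Table strings copied from the excerpt above; 15 is the empty slot.
my %tables = (
    13 => 'FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSGGVVVVAAAADDEEGGGG',
    14 => 'FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNNKSSSSVVVVAAAADDEEGGGG',
    15 => '',
    16 => 'FFLLSSSSYY*LCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG',
);

# Codons are ordered with T, C, A, G at each position.
my %base_rank = (T => 0, C => 1, A => 2, G => 3);

# Hypothetical helper: position of a codon inside a 64-character table string.
# Returns undef for anything that is not a plain T/C/A/G triplet.
sub codon_index {
    my ($codon) = @_;
    my @b = split //, uc $codon;
    return if @b != 3 || grep { !exists $base_rank{$_} } @b;
    return 16 * $base_rank{ $b[0] } + 4 * $base_rank{ $b[1] } + $base_rank{ $b[2] };
}

# Hypothetical helper: amino acid for a codon, or undef when the table is
# empty or unknown, or when the codon cannot be indexed.
sub translate_codon {
    my ($table_id, $codon) = @_;
    my $aa = $tables{$table_id};
    return unless defined $aa && length($aa) == 64;
    my $i = codon_index($codon);
    return defined $i ? substr($aa, $i, 1) : undef;
}

for my $id (sort { $a <=> $b } keys %tables) {
    my $aa = translate_codon($id, 'TAG');
    printf "table %d: TAG -> %s\n", $id, defined $aa ? $aa : '(no table)';
}
```

With the strings above, TAG comes out as a stop (`*`) in tables 13 and 14 and as `L` in table 16, while table 15 reports `(no table)`. The empty slot is the reason nothing can be translated with GC15.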
igortru commented 4 months ago

I have created a new pull request, #390.

carandraug commented 4 months ago

I'm looking into this and found that codon table 15 used to exist but was removed as part of #256. I'm guessing it has since been added again.

carandraug commented 4 months ago

I looked at the ASN.1 version at ftp://ftp.ncbi.nih.gov/entrez/misc/data/gc.prt, which includes a version history. It mentions code 15 being added in 1995 and has no reference to it ever having been removed. Wikipedia notes that translation table 15 is "As of Nov. 18, 2016: absent from the NCBI update", so I'm guessing that for some time it was removed from some places. Anyway, I'll accept the patch that adds it back.

igortru commented 4 months ago

It is a long story… and this is a new incarnation of GC15. It is now a genetic code used by prokaryotic viruses, derived from genetic code 11. But for some reason NCBI decided to keep the old name.

See, for example, https://www.nature.com/articles/s41467-022-32979-6
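
To make the relationship to genetic code 11 concrete, here is a small Perl sketch that lists the codons at which two 64-character table strings disagree. The strings for tables 11 and 15 are a transcription from the NCBI genetic-codes page linked earlier in this thread, not values taken from BioPerl, so treat them as an assumption and check them against the current NCBI data. `diff_tables` is a hypothetical helper, and start codons are not compared here.

```perl
use strict;
use warnings;

# Assumed values, transcribed from the NCBI genetic-codes page; verify before use.
my $gc11 = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my $gc15 = 'FFLLSSSSYY*QCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';

# Codons are ordered with T, C, A, G at each of the three positions.
my @bases = qw(T C A G);

# Hypothetical helper: returns one [codon, aa_in_a, aa_in_b] entry
# for every codon whose amino acid differs between the two tables.
sub diff_tables {
    my ($ta, $tb) = @_;
    die "both tables must be 64 characters long\n"
        unless length($ta) == 64 && length($tb) == 64;
    my @diffs;
    for my $i (0 .. 63) {
        my ($x, $y) = (substr($ta, $i, 1), substr($tb, $i, 1));
        next if $x eq $y;
        # Rebuild the codon from its position in the string.
        my $codon = $bases[ int($i / 16) ] . $bases[ int($i / 4) % 4 ] . $bases[ $i % 4 ];
        push @diffs, [ $codon, $x, $y ];
    }
    return @diffs;
}

printf "%s: %s in table 11, %s in table 15\n", @$_ for diff_tables($gc11, $gc15);
```

Under those assumed strings the only difference is TAG, which is a stop in table 11 and glutamine (`Q`) in table 15.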

carandraug commented 4 months ago

Very interesting, thank you for the extra details. I also noticed that, besides codon table 15, the latest NCBI version has a new codon table and some other differences. I will also push a commit that updates the codon tables to the current NCBI version.

But the issue here is fixed now, so I'm closing it.