bebop / poly

A Go package for engineering organisms.
https://pkg.go.dev/github.com/bebop/poly
MIT License

Fix #303 and add regression test #304

Koeng101 closed 1 year ago

Koeng101 commented 1 year ago

Fixes #303

Basically, there were two features in the JCVI-Syn3a genome that the GenBank parser choked on:

     CDS             3207..4007
                     /gene="ksgA"
                     /locus_tag="JCVISYN3A_0004"
                     /inference="EXISTENCE: similar to AA
                     sequence:RefSeq:WP_011166215.1"
                     /codon_start=1
                     /transl_table=4
                     /product="16S rRNA
                     (adenine(1518)-N(6)/adenine(1519)-N(6))-
                     dimethyltransferase"
                     /protein_id="AVX54572.1"
                     /translation="MKAKKYYGQNFISDLNLINKIVDVLDQNKDQLIIEIGPGKGALT
                     KELVKRFDKVVVIEIDKDMVEILKTKFNHSNLEIIQADVLEIDLKQLISKYDYKNISI
                     ISNTPYYITSEILFKTLQISDLLTKAVFMLQKEVALRICSNKNENNYNNLSIACQFYS
                     QRNFEFVVNKKMFYPIPKVDSAIISLTFNDIYKKQVNNDKKFIDFVRLLFNNKRKTIL
                     NNLNNIIQNKNKALEYLNTLNISSNLRPEQLDIDQYIKLFNLIYNSNF"

The product continuation line (adenine(1518)-N(6)/adenine(1519)-N(6))- broke parsing because it contains an internal /, which we were not checking for. Additionally, there are CDS features like:

     CDS             complement(44397..45011)
                     /locus_tag="JCVISYN3A_0051"
                     /inference="EXISTENCE: similar to AA
                     sequence:RefSeq:WP_017698191.1"
                     /pseudo
                     /codon_start=1
                     /transl_table=4

that broke parsing because we assumed every qualifier would have an = associated with it. Turns out that assumption is false: /pseudo is a flag qualifier with no value.
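To make the two cases concrete, here is a minimal sketch of qualifier parsing that handles both. It is not the code from this PR or poly's actual API: the Qualifier type and parseQualifiers function are hypothetical names made up for illustration. It assumes a line opens a new qualifier only when it starts with / and the previous value's quote is already closed, and it treats a missing = as a flag qualifier with an empty value. Joining wrapped lines with a single space (no space for /translation) is also a simplification.

```go
package main

import (
	"fmt"
	"strings"
)

// Qualifier is a hypothetical type for this sketch, not poly's own type.
type Qualifier struct {
	Key   string
	Value string // empty for flag qualifiers such as /pseudo
}

// parseQualifiers turns the qualifier lines of one feature into key/value
// pairs. A line opens a new qualifier only if it starts with '/' while no
// quoted value is still open, so a '/' in the middle of a wrapped value
// is kept as plain text.
func parseQualifiers(lines []string) []Qualifier {
	var quals []Qualifier
	inQuote := false
	for _, raw := range lines {
		line := strings.TrimSpace(raw)
		if line == "" {
			continue
		}
		if strings.HasPrefix(line, "/") && !inQuote {
			// strings.Cut leaves value empty when there is no '=',
			// which is exactly the /pseudo case.
			key, value, _ := strings.Cut(line[1:], "=")
			quals = append(quals, Qualifier{Key: key, Value: value})
			inQuote = strings.Count(value, `"`)%2 == 1
			continue
		}
		if len(quals) == 0 {
			continue // continuation line with nothing to attach to
		}
		// Continuation of a wrapped value. Assumption: join with a space,
		// except for translations, which are joined without one.
		last := &quals[len(quals)-1]
		if last.Key != "translation" {
			last.Value += " "
		}
		last.Value += line
		inQuote = strings.Count(last.Value, `"`)%2 == 1
	}
	// Drop the surrounding quotes once each value is fully assembled.
	for i := range quals {
		quals[i].Value = strings.Trim(quals[i].Value, `"`)
	}
	return quals
}

func main() {
	lines := []string{
		`/gene="ksgA"`,
		`/product="16S rRNA`,
		`(adenine(1518)-N(6)/adenine(1519)-N(6))-`,
		`dimethyltransferase"`,
		`/pseudo`,
		`/codon_start=1`,
	}
	for _, q := range parseQualifiers(lines) {
		fmt.Printf("%s = %q\n", q.Key, q.Value)
	}
}
```

Tracking the open quote, rather than searching each line for a /, is what keeps the wrapped /product value in one piece here.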

TimothyStiles commented 1 year ago

@Koeng101 can you make the test file smaller?

Koeng101 commented 1 year ago

@TimothyStiles fixed.

Koeng101 commented 1 year ago

Turns out this is also needed to read https://www.ncbi.nlm.nih.gov/nuccore/NC_001133.9