bebop / poly

A Go package for engineering organisms.
https://pkg.go.dev/github.com/bebop/poly
MIT License

JCVI-Syn3a fails to read #303

Closed: Koeng101 closed 1 year ago

Koeng101 commented 1 year ago

Download the GenBank (.gb) file from https://www.ncbi.nlm.nih.gov/nuccore/CP016816.2, save it as jcvi_syn3a.gb, and run:

package main

import (
    "fmt"

    "github.com/TimothyStiles/poly/io/genbank"
)

func main() {
    sequence, _ := genbank.Read("jcvi_syn3a.gb")
    fmt.Println(sequence)

}
$ go run main.go
panic: runtime error: index out of range [1] with length 1

goroutine 1 [running]:
github.com/TimothyStiles/poly/io/genbank.ParseMultiNth({0x4f5798?, 0xc00011c048?}, 0x0?)
        /home/koeng/go/pkg/mod/github.com/!timothy!stiles/poly@v0.24.0/io/genbank/genbank.go:594 +0x233e
github.com/TimothyStiles/poly/io/genbank.ReadMultiNth({0x4d4020?, 0x0?}, 0xc00008d9b8?)
        /home/koeng/go/pkg/mod/github.com/!timothy!stiles/poly@v0.24.0/io/genbank/genbank.go:184 +0x4b
github.com/TimothyStiles/poly/io/genbank.Read({_, _})
        /home/koeng/go/pkg/mod/github.com/!timothy!stiles/poly@v0.24.0/io/genbank/genbank.go:164 +0x65
main.main()
        xxx/scripts/jcvi_syn3a/main.go:10 +0x65
exit status 2
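
Because the parser panics instead of returning an error, the `_` in the reproduction never gets a chance to report anything. Until the parser is fixed, a caller could convert the panic into an ordinary error with `recover`. The following is a minimal sketch: `recoverToError` is a hypothetical helper, not part of poly, and the one-element slice only stands in for whatever the parser indexes at genbank.go:594.

```go
package main

import (
	"fmt"
)

// recoverToError runs fn and converts any panic into an ordinary error.
// Hypothetical helper for illustration; it is not part of poly.
func recoverToError(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	return fn()
}

func main() {
	err := recoverToError(func() error {
		// Stand-in for the failing parser call: indexing element 1
		// of a one-element slice panics at run time.
		parts := []string{"only one element"}
		i := 1
		_ = parts[i]
		return nil
	})
	fmt.Println(err)
}
```

In a real program the closure would wrap the call to `genbank.Read` and capture its return values. This only contains the crash; the file still cannot be parsed.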
Koeng101 commented 1 year ago

It is due to this ridiculous feature, whose /product value wraps across three lines and has a "/" in the middle of a continuation line:

     CDS             3207..4007
                     /gene="ksgA"
                     /locus_tag="JCVISYN3A_0004"
                     /inference="EXISTENCE: similar to AA
                     sequence:RefSeq:WP_011166215.1"
                     /codon_start=1
                     /transl_table=4
                     /product="16S rRNA
                     (adenine(1518)-N(6)/adenine(1519)-N(6))-
                     dimethyltransferase"
                     /protein_id="AVX54572.1"
                     /translation="MKAKKYYGQNFISDLNLINKIVDVLDQNKDQLIIEIGPGKGALT
                     KELVKRFDKVVVIEIDKDMVEILKTKFNHSNLEIIQADVLEIDLKQLISKYDYKNISI
                     ISNTPYYITSEILFKTLQISDLLTKAVFMLQKEVALRICSNKNENNYNNLSIACQFYS
                     QRNFEFVVNKKMFYPIPKVDSAIISLTFNDIYKKQVNNDKKFIDFVRLLFNNKRKTIL
                     NNLNNIIQNKNKALEYLNTLNISSNLRPEQLDIDQYIKLFNLIYNSNF"
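
One plausible reading of the panic, an assumption that the thread does not confirm, is that a line containing "/" but no "=" gets split on "=" and then indexed at [1]. The sketch below shows one way to group qualifier lines so that wrapped values like the /product above are handled: a line opens a new qualifier only when its trimmed text starts with "/", and `strings.Cut` is used so a missing "=" cannot cause an out-of-range index. `Qualifier` and `parseQualifiers` are hypothetical names for this illustration, not poly's API or its eventual fix.

```go
package main

import (
	"fmt"
	"strings"
)

// Qualifier is a hypothetical key/value pair used only in this sketch.
type Qualifier struct {
	Key   string
	Value string
}

// parseQualifiers groups the qualifier lines of a single feature.
// A line starts a new qualifier only if its trimmed text begins with "/";
// any other non-empty line is treated as a continuation of the previous value.
func parseQualifiers(lines []string) []Qualifier {
	var quals []Qualifier
	for _, line := range lines {
		text := strings.TrimSpace(line)
		if text == "" {
			continue
		}
		if strings.HasPrefix(text, "/") {
			// strings.Cut never panics: value is "" when there is no "=".
			key, value, _ := strings.Cut(text[1:], "=")
			quals = append(quals, Qualifier{Key: key, Value: value})
			continue
		}
		if len(quals) == 0 {
			continue // continuation with nothing to attach to
		}
		last := &quals[len(quals)-1]
		// Simplification: wrapped text is joined with a space, except for
		// protein translations and values broken right after a hyphen.
		sep := " "
		if last.Key == "translation" || strings.HasSuffix(last.Value, "-") {
			sep = ""
		}
		last.Value += sep + text
	}
	// Strip the surrounding double quotes once each value is complete.
	for i := range quals {
		quals[i].Value = strings.Trim(quals[i].Value, `"`)
	}
	return quals
}

func main() {
	feature := []string{
		`                     /gene="ksgA"`,
		`                     /product="16S rRNA`,
		`                     (adenine(1518)-N(6)/adenine(1519)-N(6))-`,
		`                     dimethyltransferase"`,
		`                     /codon_start=1`,
	}
	for _, q := range parseQualifiers(feature) {
		fmt.Printf("%s = %s\n", q.Key, q.Value)
	}
}
```

This sketch would still misread a wrapped value whose continuation line happens to begin with "/". A more robust parser would track whether a quoted value is still open before treating a line as a new qualifier.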