BioJulia / GenomicAnnotations.jl

Trouble loading `.gb` file downloaded via benchling #13

Closed by adityanprasad 1 year ago

adityanprasad commented 1 year ago

I'm trying to parse a `.gb` file that I exported from Benchling (the source shouldn't make a difference), but `readgbk` never returns; it just keeps running. BioPython parses the file, so I assume the file is encoded correctly. I'm attaching the file below. Thanks in advance! I really appreciate the package.

LOCUS       pAC03_(pQJ6_w/_mOrange 10073 bp ds-DNA     circular     17-SEP-2023
DEFINITION  .
FEATURES             Location/Qualifiers
     primer          596..615
                     /label="oLAC01"
                     /note="sequence: GATTCATTAATGCAGCTGGC"
                     /ApEinfo_revcolor="#d59687"
                     /ApEinfo_fwdcolor="#d59687"
     primer          621..642
                     /label="oQJ70"
                     /note="sequence: CGCGTAGCCTGTCAGAAATTGA"
                     /ApEinfo_revcolor="#faac61"
                     /ApEinfo_fwdcolor="#faac61"
     primer          621..642
                     /label="oQJ70"
                     /note="sequence: CGCGTAGCCTGTCAGAAATTGA"
                     /ApEinfo_revcolor="#f7977a"
                     /ApEinfo_fwdcolor="#f7977a"
     misc_feature    621..1698
                     /label="lacZ upstream"
                     /ApEinfo_revcolor="#ffef86"
                     /ApEinfo_fwdcolor="#ffef86"
     3'UTR           672..690
                     /label="KS1469"
                     /ApEinfo_revcolor="#b1ff67"
                     /ApEinfo_fwdcolor="#b1ff67"
     primer          801..819
                     /label="oLAC22"
                     /note="sequence: AAGTAGCAGTGAAAGCCAA"
                     /ApEinfo_revcolor="#c7b0e3"
                     /ApEinfo_fwdcolor="#c7b0e3"
     3'UTR           824..849
                     /label="KS584"
                     /ApEinfo_revcolor="#b1ff67"
                     /ApEinfo_fwdcolor="#b1ff67"
     misc_feature    824..849
                     /label="KS584\"
                     /ApEinfo_revcolor="#faac61"
                     /ApEinfo_fwdcolor="#faac61"
     misc_feature    923..942
                     /label="KS281\"
                     /ApEinfo_revcolor="#85dae9"
                     /ApEinfo_fwdcolor="#85dae9"
     primer          951..971
                     /label="oLAC19"
                     /note="sequence: AACACATAACCCTGCAGTAAG"
                     /ApEinfo_revcolor="#faac61"
                     /ApEinfo_fwdcolor="#faac61"
     primer          1033..1053
                     /label="oLAC09"
                     /note="sequence: AGTAGCTTCAAGCCATGAATC"
                     /ApEinfo_revcolor="#d59687"
                     /ApEinfo_fwdcolor="#d59687"
     primer          1196..1213
                     /label="oLAC23"
                     /note="sequence: CAACTTGGCTAGAACCGG"
                     /ApEinfo_revcolor="#b4abac"
                     /ApEinfo_fwdcolor="#b4abac"
     primer          1479..1498
                     /label="oQY48"
                     /note="sequence: CACCATTATGATGGCAATCG"
                     /ApEinfo_revcolor="#85dae9"
                     /ApEinfo_fwdcolor="#85dae9"
     misc_feature    1573..1595
                     /label="p1"
                     /ApEinfo_revcolor="#b1ff67"
                     /ApEinfo_fwdcolor="#b1ff67"
     misc_feature    complement(1678..1698)
                     /label="pr"
                     /ApEinfo_revcolor="#84b0dc"
                     /ApEinfo_fwdcolor="#84b0dc"
     misc_feature    1699..1733
                     /label="BBa_J23119"
                     /ApEinfo_revcolor="#75c6a9"
                     /ApEinfo_fwdcolor="#75c6a9"
     misc_feature    1734..1767
                     /label="FRT"
                     /ApEinfo_revcolor="#c7b0e3"
                     /ApEinfo_fwdcolor="#c7b0e3"
     misc_feature    1768..1787
                     /label="pf"
                     /ApEinfo_revcolor="#f8d3a9"
                     /ApEinfo_fwdcolor="#f8d3a9"
     misc_feature    1785..1819
                     /label="kanR promoter"
                     /ApEinfo_revcolor="#b7e6d7"
                     /ApEinfo_fwdcolor="#b7e6d7"
     primer          1815..1840
                     /label="oLAC30"
                     /note="sequence: TTTAAATACTGTAGAAAAGAGGAAGG"
                     /ApEinfo_revcolor="#f58a5e"
                     /ApEinfo_fwdcolor="#f58a5e"
     misc_feature    1834..1839
                     /label="RBS"
                     /ApEinfo_revcolor="#c6c9d1"
                     /ApEinfo_fwdcolor="#c6c9d1"
     CDS             1850..2644
                     /label="KanR"
                     /ApEinfo_revcolor="#faac61"
                     /ApEinfo_fwdcolor="#faac61"
     misc_feature    2626..2648
                     /label="KS445\"
                     /ApEinfo_revcolor="#b4abac"
                     /ApEinfo_fwdcolor="#b4abac"
     misc_feature    complement(2627..2644)
                     /label="ACM205\"
                     /ApEinfo_revcolor="#ff9ccd"
                     /ApEinfo_fwdcolor="#ff9ccd"
     misc_feature    2657..2678
                     /label="KS518\"
                     /ApEinfo_revcolor="#b1ff67"
                     /ApEinfo_fwdcolor="#b1ff67"
     misc_feature    complement(2657..2678)
                     /label="KS464\"
                     /ApEinfo_revcolor="#85dae9"
                     /ApEinfo_fwdcolor="#85dae9"
     3'UTR           2666..2717
                     /label="terminator"
                     /ApEinfo_revcolor="#ff9ccd"
                     /ApEinfo_fwdcolor="#ff9ccd"
     misc_feature    2676..2695
                     /label="ACM300\"
                     /ApEinfo_revcolor="#c6c9d1"
                     /ApEinfo_fwdcolor="#c6c9d1"
     misc_feature    complement(2720..2739)
                     /label="KS325\"
                     /ApEinfo_revcolor="#85dae9"
                     /ApEinfo_fwdcolor="#85dae9"
     misc_feature    2720..2759
                     /label="KS635\"
                     /ApEinfo_revcolor="#d6b295"
                     /ApEinfo_fwdcolor="#d6b295"
     misc_feature    complement(2720..2759)
                     /label="KS661\"
                     /ApEinfo_revcolor="#b7e6d7"
                     /ApEinfo_fwdcolor="#b7e6d7"
     primer          2729..2747
                     /label="oLAC32"
                     /note="sequence: CTCTCCTGAGTCCCACAAT"
                     /ApEinfo_revcolor="#85dae9"
                     /ApEinfo_fwdcolor="#85dae9"
     primer          complement(2729..2747)
                     /label="oLAC31"
                     /note="sequence: ATTGTGGGACTCAGGAGAG"
                     /ApEinfo_revcolor="#c7b0e3"
                     /ApEinfo_fwdcolor="#c7b0e3"
     misc_feature    2740..2759
                     /label="KS644\"
                     /ApEinfo_revcolor="#75c6a9"
                     /ApEinfo_fwdcolor="#75c6a9"
     misc_feature    2750..2760
                     /label="ACM325\"
                     /ApEinfo_revcolor="#ff9ccd"
                     /ApEinfo_fwdcolor="#ff9ccd"
     misc_feature    complement(2750..2760)
                     /label="ACM324\"
                     /ApEinfo_revcolor="#faac61"
                     /ApEinfo_fwdcolor="#faac61"
     primer          2929..2948
                     /label="oLAC04"
                     /note="sequence: CACTCCCGTTCTGGATAATG"
                     /ApEinfo_revcolor="#c7b0e3"
                     /ApEinfo_fwdcolor="#c7b0e3"
     primer          complement(2929..2948)
                     /label="oLAC33"
                     /note="sequence: CATTATCCAGAACGGGAGTG"
                     /ApEinfo_revcolor="#b7e6d7"
                     /ApEinfo_fwdcolor="#b7e6d7"
     misc_feature    3000..3005
                     /label="-35 box"
                     /ApEinfo_revcolor="#75c6a9"
                     /ApEinfo_fwdcolor="#75c6a9"
     promoter        3000..3028
                     /label="tac promoter"
                     /ApEinfo_revcolor="#75c6a9"
                     /ApEinfo_fwdcolor="#75c6a9"
     misc_feature    3006..3021
                     /label="Ptac"
                     /ApEinfo_revcolor="#b4abac"
                     /ApEinfo_fwdcolor="#b4abac"
     misc_feature    3022..3028
                     /label="-10 box"
                     /ApEinfo_revcolor="#c6c9d1"
                     /ApEinfo_fwdcolor="#c6c9d1"
     misc_binding    3032..3054
                     /label="LacO"
                     /ApEinfo_revcolor="#84b0dc"
                     /ApEinfo_fwdcolor="#84b0dc"
     misc_feature    3094..3127
                     /label="FRT"
                     /ApEinfo_revcolor="#c7b0e3"
                     /ApEinfo_fwdcolor="#c7b0e3"
     misc_feature    3170..3175
                     /label="RBS?"
                     /ApEinfo_revcolor="#f8d3a9"
                     /ApEinfo_fwdcolor="#f8d3a9"
     misc_feature    3181..3891
                     /label="mTFP"
                     /ApEinfo_revcolor="#85dae9"
                     /ApEinfo_fwdcolor="#85dae9"
     primer          3805..3824
                     /label="oLAC05"
                     /note="sequence: CACGACAAGGACTACAACAA"
                     /ApEinfo_revcolor="#c6c9d1"
                     /ApEinfo_fwdcolor="#c6c9d1"
     misc_feature    3911..3927
                     /label="Hairpin - terminator"
                     /ApEinfo_revcolor="#f8d3a9"
                     /ApEinfo_fwdcolor="#f8d3a9"
     misc_feature    3963..3981
                     /label="Hairpin (less likely) - terminator"
                     /ApEinfo_revcolor="#d6b295"
                     /ApEinfo_fwdcolor="#d6b295"
     misc_feature    4068..4196
                     /label="BBa_B0015"
                     /ApEinfo_revcolor="#b7e6d7"
                     /ApEinfo_fwdcolor="#b7e6d7"
     misc_feature    4224..4257
                     /label="FRT"
                     /ApEinfo_revcolor="#c7b0e3"
                     /ApEinfo_fwdcolor="#c7b0e3"
     misc_feature    complement(4263..4280)
                     /label="pr"
                     /ApEinfo_revcolor="#b1ff67"
                     /ApEinfo_fwdcolor="#b1ff67"
     primer          4277..4298
                     /label="oLAC18"
                     /note="sequence: AGGAcaagagacaggatactag"
                     /ApEinfo_revcolor="#faac61"
                     /ApEinfo_fwdcolor="#faac61"
     misc_feature    4281..4304
                     /label="pf"
                     /ApEinfo_revcolor="#faac61"
                     /ApEinfo_fwdcolor="#faac61"
     misc_feature    4302..4307
                     /label="RBS"
                     /ApEinfo_revcolor="#ff9ccd"
                     /ApEinfo_fwdcolor="#ff9ccd"
     CDS             4313..5023
                     /label="mOrange2"
                     /ApEinfo_revcolor="#c7b0e3"
                     /ApEinfo_fwdcolor="#c7b0e3"
                     /note="product: photostable monomeric orange derivative of DsRed fluorescent protein (Shaner et al., 2008) note: mammalian codon-optimized translation: MVSKGEENNMAIIKEFMRFKVRMEGSVNGHEFEIEGEGEGRPYEGFQTAKLKVTKGGPLPFAWDILSPHFTYGSKAYVKHPADIPDYFKLSFPEGFKWERVMNYEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGKIKMRLKLKDGGHYTSEVKTTYKAKKPVQLPGAYIVDIKLDITSHNEDYTIVEQYERAEGRHSTGGMDELYK"
     primer          4505..4523
                     /label="oLAC13"
                     /note="sequence: atcctgtcccctcatttca"
                     /ApEinfo_revcolor="#84b0dc"
                     /ApEinfo_fwdcolor="#84b0dc"
     primer          4731..4750
                     /label="oLAC11"
                     /note="sequence: tgatgcagaagaagaccatg"
                     /ApEinfo_revcolor="#84b0dc"
                     /ApEinfo_fwdcolor="#84b0dc"
     misc_feature    complement(5011..5037)
                     /label="pr"
                     /ApEinfo_revcolor="#d59687"
                     /ApEinfo_fwdcolor="#d59687"
     primer          5024..5044
                     /label="oLAC26"
                     /note="sequence: taattagctgagctCTTCACG"
                     /ApEinfo_revcolor="#85dae9"
                     /ApEinfo_fwdcolor="#85dae9"
     misc_feature    5058..5085
                     /label="pf"
                     /ApEinfo_revcolor="#75c6a9"
                     /ApEinfo_fwdcolor="#75c6a9"
     misc_feature    5065..5070
                     /label="RBS"
                     /ApEinfo_revcolor="#c6c9d1"
                     /ApEinfo_fwdcolor="#c6c9d1"
     misc_feature    5070..5075
                     /label="RBS"
                     /ApEinfo_revcolor="#f8d3a9"
                     /ApEinfo_fwdcolor="#f8d3a9"
     CDS             5081..5875
                     /label="KanR"
                     /ApEinfo_revcolor="#faac61"
                     /ApEinfo_fwdcolor="#faac61"
     misc_feature    5857..5879
                     /label="KS445\"
                     /ApEinfo_revcolor="#b4abac"
                     /ApEinfo_fwdcolor="#b4abac"
     misc_feature    complement(5858..5875)
                     /label="ACM205\"
                     /ApEinfo_revcolor="#ff9ccd"
                     /ApEinfo_fwdcolor="#ff9ccd"
     misc_feature    complement(5860..5882)
                     /label="pr"
                     /ApEinfo_revcolor="#b4abac"
                     /ApEinfo_fwdcolor="#b4abac"
     primer          5883..5901
                     /label="oLAC24"
                     /note="sequence: CAAACTGGGGCACAGATAG"
                     /ApEinfo_revcolor="#b4abac"
                     /ApEinfo_fwdcolor="#b4abac"
     misc_feature    5883..5903
                     /label="pf"
                     /ApEinfo_revcolor="#b7e6d7"
                     /ApEinfo_fwdcolor="#b7e6d7"
     primer          complement(5895..5913)
                     /label="oLAC28"
                     /note="sequence: GGGATGGGTACCCTATCTG"
                     /ApEinfo_revcolor="#faac61"
                     /ApEinfo_fwdcolor="#faac61"
     misc_feature    5932..6154
                     /label="From pJA02"
                     /ApEinfo_revcolor="#ffef86"
                     /ApEinfo_fwdcolor="#ffef86"
     misc_feature    5993..5993
                     /label="C->T"
                     /ApEinfo_revcolor="#ff9ccd"
                     /ApEinfo_fwdcolor="#ff9ccd"
     misc_feature    5999..6127
                     /label="BBa_B0015"
                     /ApEinfo_revcolor="#b7e6d7"
                     /ApEinfo_fwdcolor="#b7e6d7"
     misc_feature    6155..7290
                     /label="lacZ downstream"
                     /ApEinfo_revcolor="#ff9ccd"
                     /ApEinfo_fwdcolor="#ff9ccd"
     misc_feature    complement(6165..6185)
                     /label="p2"
                     /ApEinfo_revcolor="#85dae9"
                     /ApEinfo_fwdcolor="#85dae9"
     primer          complement(6265..6282)
                     /label="oLAC27"
                     /note="sequence: AGTTGGCTCTTGCTTTGG"
                     /ApEinfo_revcolor="#d59687"
                     /ApEinfo_fwdcolor="#d59687"
     primer          6293..6312
                     /label="oLAC25"
                     /note="sequence: TTCTGCCATCAGTTGGATTG"
                     /ApEinfo_revcolor="#faac61"
                     /ApEinfo_fwdcolor="#faac61"
     primer          6492..6509
                     /label="oQJ39"
                     /note="sequence: AGGTTAGTCATCGCCTGT"
                     /ApEinfo_revcolor="#85dae9"
                     /ApEinfo_fwdcolor="#85dae9"
     primer          6969..6987
                     /label="oLAC16"
                     /note="sequence: GGATCAGAAACATCGCTGA"
                     /ApEinfo_revcolor="#f58a5e"
                     /ApEinfo_fwdcolor="#f58a5e"
     primer          complement(7270..7290)
                     /label="oQJ71"
                     /note="sequence: TGTTGAACAAGCGTCATGGCT"
                     /ApEinfo_revcolor="#f7977a"
                     /ApEinfo_fwdcolor="#f7977a"
     primer          complement(7270..7290)
                     /label="oQJ71"
                     /note="sequence: TGTTGAACAAGCGTCATGGCT"
                     /ApEinfo_revcolor="#c7b0e3"
                     /ApEinfo_fwdcolor="#c7b0e3"
     misc_feature    7431..8291
                     /label="AmpR"
                     /ApEinfo_revcolor="#b7e6d7"
                     /ApEinfo_fwdcolor="#b7e6d7"
ORIGIN
        1 GTTATTTCTT GATGTCTCTG ACCAGACACC CATCAACAGT ATTATTTTCT CCCATGAAGA
       61 CGGTACGCGA CTGGGCGTGG AGCATCTGGT CGCATTGGGT CACCAGCAAA TCGCGCTGTT
      121 AGCGGGCCCA TTAAGTTCTG TCTCGGCGCG TCTGCGTCTG GCTGGCTGGC ATAAATATCT
      181 CACTCGCAAT CAAATTCAGC CGATAGCGGA ACGGGAAGGC GACTGGAGTG CCATGTCCGG
      241 TTTTCAACAA ACCATGCAAA TGCTGAATGA GGGCATCGTT CCCACTGCGA TGCTGGTTGC
      301 CAACGATCAG ATGGCGCTGG GCGCAATGCG CGCCATTACC GAGTCCGGGC TGCGCGTTGG
      361 TGCGGATATC TCGGTAGTGG GATACGACGA TACCGAAGAC AGCTCATGTT ATATCCCGCC
      421 GTCAACCACC ATCAAACAGG ATTTTCGCCT GCTGGGGCAA ACCAGCGTGG ACCGCTTGCT
      481 GCAACTCTCT CAGGGCCAGG CGGTGAAGGG CAATCAGCTG TTGCCCGTCT CACTGGTGAA
      541 AAGAAAAACC ACCCTGGCGC CCAATACGCA AACCGCCTCT CCCCGCGCGT TGGCCGATTC
      601 ATTAATGCAG CTGGCACGAC CGCGTAGCCT GTCAGAAATT GATCGGTCGA TAGGCTGTCA
      661 CCTAAGGCAT TTTGCAGCAA GGGAAGAACC ACATGACCGC CACCAAACAC TAAGCTTCCT
      721 GCTTGGAAGA AATGGCCAAA CAACTCAACC AGCGGTGAGC TTGCAGCCAA GAGCGGCAGC
      781 CCGAGTAATA AACTTGCAAA AAGTAGCAGT GAAAGCCAAG AGGGGCTGAA CGTGGTTGTC
      841 GAAAATGACT GTTGTGGTGC CAAGCGGGCT TGGCCAACGA AAGCGGCGAT CAGAAGGACC
      901 GCAAACTGAG TGATGAGACC GGGAGCTAAC GTAATCGCAA CCGCAGTCAG AACACATAAC
      961 CCTGCAGTAA GGCGTTGCTG ACAAAAATTG CGATACATGG TTAAACAAGC ATCGGCCACC
     1021 ACTATGATCG CGAGTAGCTT CAAGCCATGA ATCATTTGTT CAAACAGTGG GGTATCAAGG
     1081 AGATGGCTGC TTAGCCCCGC GAGCAGCAGC ATGATGAGCA CGGAGGGAAG GGTAAAACCG
     1141 AGAAACGCTG CCCAAGCGCC AGCCAGACCA CCACGATGAT AACCAATCGC AAAACCAACT
     1201 TGGCTAGAAC CGGGGCCAGG AAGGAACTGG CTCAGTGCCA CAAATTGTGC ATATTCTTGC
     1261 TCGCTAACCC AGCGTAATTT CTCAACAAAG GTGTGGCGAA AGTAACCTAT ATGTGCGGCT
     1321 GGCCCACCAA AACTCACCCA TCCGAGAGCA AAAAAAGTTC TAAAAATCGT TAGCATAATG
     1381 ATCTGAAGTC ATCCGTAATC AATGGAAGGT CAACATCCGT AGGAGCATAG GTTATGGAGA
     1441 GTCAAAGCGC AGAACAACTC CGAATGTGTA AAAAATTACA CCATTATGAT GGCAATCGTA
     1501 TGAATCGATT CAGAAATAGA AAAATTGGGT CAATATCGAC CTCTATTTAA ATTGTGGAAA
     1561 CGTTTACACA ATTGGTGAGT GGTTCACAGA ATCGGTGTTT GAAAGTTTGT TAGACTTTTT
     1621 TGCATCTGCA GCATGTCATC ATTCCTATTC AAAGCTGCGA ATCTTATTGA ATGACTTCTT
     1681 TACTCCTCGG CTTGAGGGtt gacagctagc tcagtcctag gtataatgct agcgaagttc
     1741 ctattctcta gaaagtatag gaacttcCGA AGCTGGGGAT CCGTTTGATT TTTAATGGAT
     1801 AATGTGATAT AATCTTTAAA TACTGTAGAA AAGAGGAAGG AAATAATAAA TGGCTAAAAT
     1861 GAGAATATCA CCGGAATTGA AAAAACTGAT CGAAAAATAC CGCTGCGTAA AAGATACGGA
     1921 AGGAATGTCT CCTGCTAAGG TATATAAGCT GGTGGGAGAA AATGAAAACC TATATTTAAA
     1981 AATGACGGAC AGCCGGTATA AAGGGACCAC CTATGATGTG GAACGGGAAA AGGACATGAT
     2041 GCTATGGCTG GAAGGAAAGC TGCCTGTTCC AAAGGTCCTG CACTTTGAAC GGCATGATGG
     2101 CTGGAGCAAT CTGCTCATGA GTGAGGCCGA TGGCGTCCTT TGCTCGGAAG AGTATGAAGA
     2161 TGAACAAAGC CCTGAAAAGA TTATCGAGCT GTATGCGGAG TGCATCAGGC TCTTTCACTC
     2221 CATCGACATA TCGGATTGTC CCTATACGAA TAGCTTAGAC AGCCGCTTAG CCGAATTGGA
     2281 TTACTTACTG AATAACGATC TGGCCGATGT GGATTGCGAA AACTGGGAAG AAGACACTCC
     2341 ATTTAAAGAT CCGCGCGAGC TGTATGATTT TTTAAAGACG GAAAAGCCCG AAGAGGAACT
     2401 TGTCTTTTCC CACGGCGACC TGGGAGACAG CAACATCTTT GTGAAAGATG GCAAAGTAAG
     2461 TGGCTTTATT GATCTTGGGA GAAGCGGCAG GGCGGACAAG TGGTATGACA TTGCCTTCTG
     2521 CGTCCGGTCG ATCAGGGAGG ATATCGGGGA AGAACAGTAT GTCGAGCTAT TTTTTGACTT
     2581 ACTGGGGATC AAGCCTGATT GGGAGAAAAT AAAATATTAT ATTTTACTGG ATGAATTGTT
     2641 TTAGGACGTC GCCGGCGGCA TCAAATAAAA CGAAAGGCTC AGTCGAAAGA CTGGGCCTTT
     2701 CGTTTTATCT GTTGTTTGTC GGTGAACGCT CTCCTGAGTC CCACAATAAG CCAGAGAGCC
     2761 GGTGTCAACG TAAATGCATG CCGCTTCGCC TTCGCGCGCG AATTGCAAGC TGATCCGGGC
     2821 TTATCGACTG CACGGTGCAC CAATGCTTCT GGCGTCAGGC AGCCATCGGA AGCTGTGGTA
     2881 TGGCTGTGCA GGTCGTAAAT CACTGCATAA TTCGTGTCGC TCAAGGCGCA CTCCCGTTCT
     2941 GGATAATGTT TTTTGCGCCG ACATCATAAC GGTTCTGGCA AATATTCTGA AATGAGCTGT
     3001 TGACAATTAA TCATCGGCTC GTATAATGTG TGGAATTGTG AGCGGATAAC AATTTCACAC
     3061 AGGAAACAGC CTCGACAGGC CTAGGAATTC AGGgaagttc ctattctcta gaaagtatag
     3121 gaacttcAGT GTGGGGTCTG CCTCGACAGG CCTAGGAATT CAGGAGCTAA GGAAGCTAAA
     3181 ATGGTGAGCA AGGGCGAGGA GACCACAATG GGCGTAATCA AGCCCGACAT GAAGATCAAG
     3241 CTGAAGATGG AGGGCAACGT GAATGGCCAC GCCTTCGTGA TCGAGGGCGA GGGCGAGGGC
     3301 AAGCCCTACG ACGGCACCAA CACCATCAAC CTGGAGGTGA AGGAGGGAGC CCCCCTGCCC
     3361 TTCTCCTACG ACATTCTGAC CACCGCGTTC GCCTACGGCA ACAGGGCCTT CACCAAGTAC
     3421 CCCGACGACA TCCCCAACTA CTTCAAGCAG TCCTTCCCCG AGGGCTACTC TTGGGAGCGC
     3481 ACCATGACCT TCGAGGACAA GGGCATCGTG AAGGTGAAGT CCGACATCTC CATGGAGGAG
     3541 GACTCCTTCA TCTACGAGAT ACACCTCAAG GGCGAGAACT TCCCCCCCAA CGGCCCCGTG
     3601 ATGCAGAAGA AGACCACCGG CTGGGACGCC TCCACCGAGA GGATGTACGT GCGCGACGGC
     3661 GTGCTGAAGG GCGACGTCAA GCACAAGCTG CTGCTGGAGG GCGGCGGCCA CCACCGCGTT
     3721 GACTTCAAGA CCATCTACAG GGCCAAGAAG GCGGTGAAGC TGCCCGACTA TCACTTTGTG
     3781 GACCACCGCA TCGAGATCCT GAACCACGAC AAGGACTACA ACAAGGTGAC CGTTTACGAG
     3841 AGCGCCGTGG CCCGCAACTC CACCGACGGC ATGGACGAGC TGTACAAGTA AGGATCCGGT
     3901 GATTGATTGA GCAAGCTTTA TGCTTGTAAA CCGTTTTGTG AAAAAATTTT TAAAATAAAA
     3961 AAGGGGACCT CTAGGGTCCC CAATTACTAG CCAGCGGCCG GCTGTTTTGG CGGATGAGAG
     4021 AAGATTTTCA GCCTGATACA GATTAAATCA GAACGCAGAA GCGGTCTcca ggcatcaaat
     4081 aaaacgaaag gctcagtcga aagactgggc ctttcgtttt atctgttgtt tgtcggtgaa
     4141 cgctctctac tagagtcaca ctggctcacc ttcgggtggg cctttctgcg tttataGATA
     4201 AAACAGAATT TGCCTGGCGG CAGgaagttc ctattctcta gaaagtatag gaacttcAAA
     4261 CAGCCTCGAC AGGCCTAGGA caagagacag gatactagtg gaggaagaaa aaatggtgag
     4321 caagggcgag gagaataaca tggccatcat caaggagttc atgcgcttca aggtgcgcat
     4381 ggagggctcc gtgaacggcc acgagttcga gatcgagggc gagggcgagg gccgccccta
     4441 cgagggcttt cagaccgcta agctgaaggt gaccaagggt ggccccctgc ccttcgcctg
     4501 ggacatcctg tcccctcatt tcacctacgg ctccaaggcc tacgtgaagc accccgccga
     4561 catccccgac tacttcaagc tgtccttccc cgagggcttc aagtgggagc gcgtgatgaa
     4621 ctacgaggac ggcggcgtgg tgaccgtgac ccaggactcc tccctgcagg acggcgagtt
     4681 catctacaag gtgaagctgc gcggcaccaa cttcccctcc gacggccccg tgatgcagaa
     4741 gaagaccatg ggctgggagg cctcctccga gcggatgtac cccgaggacg gtgccctgaa
     4801 gggcaagatc aagatgaggc tgaagctgaa ggacggcggc cactacacct ccgaggtcaa
     4861 gaccacctac aaggccaaga agcccgtgca gctgcccggc gcctacatcg tcgacatcaa
     4921 gttggacatc acctcccaca acgaggacta caccatcgtg gaacagtacg aacgcgccga
     4981 gggccgccac tccaccggcg gcatggacga gctgtacaag taataattag ctgagctCTT
     5041 CACGTGAGAC GTCAACCAGA AAAGAGGAAG GAAATAATAA ATGGCTAAAA TGAGAATATC
     5101 ACCGGAATTG AAAAAACTGA TCGAAAAATA CCGCTGCGTA AAAGATACGG AAGGAATGTC
     5161 TCCTGCTAAG GTATATAAGC TGGTGGGAGA AAATGAAAAC CTATATTTAA AAATGACGGA
     5221 CAGCCGGTAT AAAGGGACCA CCTATGATGT GGAACGGGAA AAGGACATGA TGCTATGGCT
     5281 GGAAGGAAAG CTGCCTGTTC CAAAGGTCCT GCACTTTGAA CGGCATGATG GCTGGAGCAA
     5341 TCTGCTCATG AGTGAGGCCG ATGGCGTCCT TTGCTCGGAA GAGTATGAAG ATGAACAAAG
     5401 CCCTGAAAAG ATTATCGAGC TGTATGCGGA GTGCATCAGG CTCTTTCACT CCATCGACAT
     5461 ATCGGATTGT CCCTATACGA ATAGCTTAGA CAGCCGCTTA GCCGAATTGG ATTACTTACT
     5521 GAATAACGAT CTGGCCGATG TGGATTGCGA AAACTGGGAA GAAGACACTC CATTTAAAGA
     5581 TCCGCGCGAG CTGTATGATT TTTTAAAGAC GGAAAAGCCC GAAGAGGAAC TTGTCTTTTC
     5641 CCACGGCGAC CTGGGAGACA GCAACATCTT TGTGAAAGAT GGCAAAGTAA GTGGCTTTAT
     5701 TGATCTTGGG AGAAGCGGCA GGGCGGACAA GTGGTATGAC ATTGCCTTCT GCGTCCGGTC
     5761 GATCAGGGAG GATATCGGGG AAGAACAGTA TGTCGAGCTA TTTTTTGACT TACTGGGGAT
     5821 CAAGCCTGAT TGGGAGAAAA TAAAATATTA TATTTTACTG GATGAATTGT TTTAGGACGT
     5881 CGCAAACTGG GGCACAGATA GGGTACCCAT CCCTCAAGCC GAGCCCCATG CGCTGTTTTG
     5941 GCGGATGAGA GAAGATTTTC AGCCTGATAC AGATTAAATC AGAACGCAGA AGCGGTCTcc
     6001 aggcatcaaa taaaacgaaa ggctcagtcg aaagactggg cctttcgttt tatctgttgt
     6061 ttgtcggtga acgctctcta ctagagtcac actggctcac cttcgggtgg gcctttctgc
     6121 gtttataGAT AAAACAGAAT TTGCCTGGCG GCAGCCCACA ATAAGCCAGA GAGCCTTAAG
     6181 GCTCTCTTTT TTGTGCCACG CTGTCGCGGC GAACGATGGT GGGCGAAAAG ATCATCGGAT
     6241 CATTTTCACG CGGTGCATCG GTTGCCAAAG CAAGAGCCAA CTGAGTGGCT TTTTCTGCCA
     6301 TCAGTTGGAT TGGGTAACGC ACGGTCGTCA GGCGTGGGCG TAAGTAGCGC GAAATCAGGG
     6361 CATCGTCAAA ACCAATCACT GAGACATGTT CAGGAACCGA ATGCCCATTT TCCTCAAGTA
     6421 CCAACAGGGC TCCAGCGGCC ATATTGTCGT TGTAAGCGAC CACTGCGGTA AAAGGCAGTG
     6481 ATTTCACCAG TAGGTTAGTC ATCGCCTGTT CGCCGCCTTC ACTGTCTGGG CTGGCTTTTT
     6541 CAATATAGCT GCTGCTTAGG GTGATGCCAT GTTCGTTCAA GGCTTGTTGA TAGCCCGCAA
     6601 TACGCTGATC GGCATCCTCA ATTTGATGAG AGGAGCTGAT GCAAGCAATG TTTCTATGTC
     6661 CTTGACGGAT GAGAAAATCG GTTGCTAAAT AAGCACCTTT CTGGTTGTCG AGTGAAATAC
     6721 AGCGATCGGC GAGCTGAGGA ATATGGCGAT TGATCAACAC CAGCGTTTTT ACCTCGTTGG
     6781 CGTACTCAAT CAGCTCTTCA TTGGGAAGCG CTTTGGAGTG GATGACCAGT GCATCACAGC
     6841 GGCTGTTGAT GAGCAGTTCC AGCGCGCGGC GCTCTTCTTC CGCTCGGTGG TAGCCGTTGC
     6901 CGATCAGCAA ATGTTTTCCT TCGCGGTGGG CGACAGTGTC GACGGCTTTG ACTAAAGTTC
     6961 CAAAGAAGGG ATCAGAAACA TCGCTGACCA AGACCCCTAT GGTATTGGTG CTTTGATTGA
     7021 CTAAAGCGCG AGCCGCGGCG TTCGGACGAT AGCCGAGTTT GTGCATCGCA CTGGTCACCG
     7081 TATCAATCGA GGCTTGGCTG GCTTTCGGGG ATTTGTTGAT GACGCGTGAA ACCGTTGCCA
     7141 CAGATACACC GGCTTCACGT GCTACGTCTT TTATGGTTGC CATAAGATTC CTTCTCTATC
     7201 ACAGGCGCAA TAGTAGCGCT CCCTGTGAAA ACAGCGCAAT TGTAACTGAG TACATAAGAG
     7261 TGAATTGTGA GCCATGACGC TTGTTCAACA CAGAAGGCCA TCCTGACGGA TGGCCTTTTT
     7321 GCGTTTCTAC AAACTCTTTT TGTTTATTTT TCTAAATACA TTCAAATATG TATCCGCTCA
     7381 TGAGACAATA ACCCTGATAA ATGCTTCAAT AATATTGAAA AAGGAAGAGT ATGAGTATTC
     7441 AACATTTCCG TGTCGCCCTT ATTCCCTTTT TTGCGGCATT TTGCCTTCCT GTTTTTGCTC
     7501 ACCCAGAAAC GCTGGTGAAA GTAAAAGATG CTGAAGATCA GTTGGGTGCA CGAGTGGGTT
     7561 ACATCGAACT GGATCTCAAC AGCGGTAAGA TCCTTGAGAG TTTTCGCCCC GAAGAACGTT
     7621 TTCCAATGAT GAGCACTTTT AAAGTTCTGC TATGTGGCGC GGTATTATCC CGTGTTGACG
     7681 CCGGGCAAGA GCAACTCGGT CGCCGCATAC ACTATTCTCA GAATGACTTG GTTGAGTACT
     7741 CACCAGTCAC AGAAAAGCAT CTTACGGATG GCATGACAGT AAGAGAATTA TGCAGTGCTG
     7801 CCATAACCAT GAGTGATAAC ACTGCGGCCA ACTTACTTCT GACAACGATC GGAGGACCGA
     7861 AGGAGCTAAC CGCTTTTTTG CACAACATGG GGGATCATGT AACTCGCCTT GATCGTTGGG
     7921 AACCGGAGCT GAATGAAGCC ATACCAAACG ACGAGCGTGA CACCACGATG CCTACAGCAA
     7981 TGGCAACAAC GTTGCGCAAA CTATTAACTG GCGAACTACT TACTCTAGCT TCCCGGCAAC
     8041 AATTAATAGA CTGGATGGAG GCGGATAAAG TTGCAGGACC ACTTCTGCGC TCGGCCCTTC
     8101 CGGCTGGCTG GTTTATTGCT GATAAATCTG GAGCCGGTGA GCGTGGGTCT CGCGGTATCA
     8161 TTGCAGCACT GGGGCCAGAT GGTAAGCCCT CCCGTATCGT AGTTATCTAC ACGACGGGGA
     8221 GTCAGGCAAC TATGGATGAA CGAAATAGAC AGATCGCTGA GATAGGTGCC TCACTGATTA
     8281 AGCATTGGTA ACTGTCAGAC CAAGTTTACT CATATATACT TTAGATTGAT TTAAAACTTC
     8341 ATTTTTAATT TAAAAGGATC TAGGTGAAGA TCCTTTTTGA TAATCTCATG ACCAAAATCC
     8401 CTTAACGTGA GTTTTCGTTC CACTGAGCGT CAGACCCCGT AGAAAAGATC AAAGGATCTT
     8461 CTTGAGATCC TTTTTTTCTG CGCGTAATCT GCTGCTTGCA AACAAAAAAA CCACCGCTAC
     8521 CAGCGGTGGT TTGTTTGCCG GATCAAGAGC TACCAACTCT TTTTCCGAAG GTAACTGGCT
     8581 TCAGCAGAGC GCAGATACCA AATACTGTCC TTCTAGTGTA GCCGTAGTTA GGCCACCACT
     8641 TCAAGAACTC TGTAGCACCG CCTACATACC TCGCTCTGCT AATCCTGTTA CCAGTGGCTG
     8701 CTGCCAGTGG CGATAAGTCG TGTCTTACCG GGTTGGACTC AAGACGATAG TTACCGGATA
     8761 AGGCGCAGCG GTCGGGCTGA ACGGGGGGTT CGTGCACACA GCCCAGCTTG GAGCGAACGA
     8821 CCTACACCGA ACTGAGATAC CTACAGCGTG AGCTATGAGA AAGCGCCACG CTTCCCGAAG
     8881 GGAGAAAGGC GGACAGGTAT CCGGTAAGCG GCAGGGTCGG AACAGGAGAG CGCACGAGGG
     8941 AGCTTCCAGG GGGAAACGCC TGGTATCTTT ATAGTCCTGT CGGGTTTCGC CACCTCTGAC
     9001 TTGAGCGTCG ATTTTTGTGA TGCTCGTCAG GGGGGCGGAG CCTATGGAAA AACGCCAGCA
     9061 ACGCGGCCTT TTTACGGTTC CTGGCCTTTT GCTGGCCTTT TGCTCACATG TTCTTTCCTG
     9121 CGTTATCCCC TGATTCTGTG GATAACCGTA TTACCGCCTT TGAGTGAGCT GATACCGCTC
     9181 GCCGCAGCCG AACGACCGAG CGCAGCGAGT CAGTGAGCGA GGAAGCGGAA GAGCGCCTGA
     9241 TGCGGTATTT TCTCCTTACG CATCTGTGCG GTATTTCACA CCGCATATGG TGCACTCTCA
     9301 GTACAATCTG CTCTGATGCC GCATAGTTAA GCCAGTATAC ACTCCGCTAT CGCTACGTGA
     9361 CTGGGTCATG GCTGCGCCCC GACACCCGCC AACACCCGCT GACGCGCCCT GACGGGCTTG
     9421 TCTGCTCCCG GCATCCGCTT ACAGACAAGC TGTGACCGTC TCCGGGAGCT GCATGTGTCA
     9481 GAGGTTTTCA CCGTCATCAC CGAAACGCGC GAGGCAGCAG ATCAATTCGC GCGCGAAGGC
     9541 GAAGCGGCAT GCATTTACGT TGACACCATC GAATGGTGCA AAACCTTTCG CGGTATGGCA
     9601 TGATAGCGCC CGGAAGAGAG TCAATTCAGG GTGGTGAATG TGAAACCAGT AACGTTATAC
     9661 GATGTCGCAG AGTATGCCGG TGTCTCTTAT CAGACCGTTT CCCGCGTGGT GAACCAGGCC
     9721 AGCCACGTTT CTGCGAAAAC GCGGGAAAAA GTGGAAGCGG CGATGGCGGA GCTGAATTAC
     9781 ATTCCCAACC GCGTGGCACA ACAACTGGCG GGCAAACAGT CGTTGCTGAT TGGCGTTGCC
     9841 ACCTCCAGTC TGGCCCTGCA CGCGCCGTCG CAAATTGTCG CGGCGATTAA ATCTCGCGCC
     9901 GATCAACTGG GTGCCAGCGT GGTGGTGTCG ATGGTAGAAC GAAGCGGCGT CGAAGCCTGT
     9961 AAAGCGGCGG TGCACAATCT TCTCGCGCAA CGCGTCAGTG GGCTGATCAT TAACTATCCG
    10021 CTGGATGACC AGGATGCCAT TGCTGTGGAA GCTGCCTGCA CTAATGTTCC GGC
//

kdyrhage commented 1 year ago

It works fine for me on v0.3.2 and master, after copying the contents to a file. Could you upload the actual problematic file somewhere?

adityanprasad commented 1 year ago

Thanks for such a quick response. Sorry, I updated my version of GenomicAnnotations and it reads the file now. However, there's still a problem when trying to access genes. I think it's some issue with printing to stdout. For instance,

plasmid = readgbk("file.gb")[1]
plasmid.genes

will run indefinitely and not return anything, but

for gene in plasmid.genes
    @show "$(plasmid.name)_$(gene.locus_tag)"
end

will return almost immediately. Similarly,

@genes(plasmid, ismissing(:gene))

will get stuck but

@genes(plasmid, ismissing(:gene)) |> length

works. Hopefully, you're able to replicate this behavior.

EDIT:

for gene in plasmid.genes
    @show gene
end

gets stuck at

gene =      primer          4277..4298
                     /label="oLAC18"
                     /note="sequence: AGGAcaagagacaggatactag"
                     /ApEinfo_revcolor="#faac61"
                     /ApEinfo_fwdcolor="#faac61"

Is there something wrong with the format of the .gb file at or after this primer annotation?

kdyrhage commented 1 year ago
                     /note="product: photostable monomeric orange derivative of DsRed fluorescent protein (Shaner et al., 2008) note: mammalian codon-optimized translation: MVSKGEENNMAIIKEFMRFKVRMEGSVNGHEFEIEGEGEGRPYEGFQTAKLKVTKGGPLPFAWDILSPHFTYGSKAYVKHPADIPDYFKLSFPEGFKWERVMNYEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGKIKMRLKLKDGGHYTSEVKTTYKAKKPVQLPGAYIVDIKLDITSHNEDYTIVEQYERAEGRHSTGGMDELYK"

This long line was causing problems with the printing. I've included a fix in v0.3.3 and registered it; until it is merged, you can use the master branch. Thanks for the help!
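
For anyone hitting a similar stall, a quick way to spot over-long lines before loading a file is sketched below. This is only an illustration using Julia's standard library: `find_long_lines` and its `maxwidth` keyword are hypothetical names, not part of GenomicAnnotations.jl, and the default of 80 assumes the conventional 80-column width of GenBank flat files. It does not reproduce or explain the actual fix in v0.3.3.

```julia
# Hypothetical helper (not part of GenomicAnnotations.jl): list the lines of a
# GenBank-style text that are wider than `maxwidth` characters.
function find_long_lines(io::IO; maxwidth::Int = 80)
    hits = Tuple{Int,Int}[]                 # (line number, line length)
    for (n, line) in enumerate(eachline(io))
        len = length(line)                  # length in characters, newline excluded
        len > maxwidth && push!(hits, (n, len))
    end
    return hits
end

# Convenience method for a file path; the file is closed automatically.
find_long_lines(path::AbstractString; kwargs...) =
    open(io -> find_long_lines(io; kwargs...), path)

# Small in-memory sample: the third line mimics a very long /note qualifier.
sample = join([
    "     CDS             4313..5023",
    " "^21 * "/label=\"mOrange2\"",
    " "^21 * "/note=\"" * "M"^120 * "\"",
], "\n")

hits = find_long_lines(IOBuffer(sample))
for (n, len) in hits
    println("line $n is $len characters long")
end
```

Calling `find_long_lines("file.gb")` on the exported file should point at the `/note` qualifier of the mOrange2 CDS, which can then be shortened or wrapped by hand if upgrading the package is not an option.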

adityanprasad commented 1 year ago

Thanks for such quick responses. I really appreciate your work on this package.