heuermh / sars-cov-2

SARS-CoV-2 (Severe acute respiratory syndrome coronavirus 2) sequences and features
Apache License 2.0

CompoundNotFoundException: Cannot find compound for: L #1

Closed heuermh closed 4 years ago

heuermh commented 4 years ago
  org.biojava.nbio.core.exceptions.CompoundNotFoundException: Cannot find compound for: L
    at org.biojava.nbio.core.sequence.storage.ArrayListSequenceReader.setContents(ArrayListSequenceReader.java:207)
    at org.biojava.nbio.core.sequence.template.AbstractSequence.<init>(AbstractSequence.java:93)
    at org.biojava.nbio.core.sequence.DNASequence.<init>(DNASequence.java:84)
    at org.biojava.nbio.core.sequence.io.DNASequenceCreator.getSequence(DNASequenceCreator.java:64)
    at org.biojava.nbio.core.sequence.io.GenbankReader.process(GenbankReader.java:165)
    at org.biojava.nbio.core.sequence.io.GenbankReader.process(GenbankReader.java:122)
    at org.biojava.nbio.adam.BiojavaAdamContext.readGenbankDna(BiojavaAdamContext.scala:472)
    at org.biojava.nbio.adam.BiojavaAdamContext.$anonfun$loadBiojavaGenbankDna$3(BiojavaAdamContext.scala:255)
    at org.biojava.nbio.adam.TryWith$.$anonfun$apply$1(TryWith.scala:37)
    at scala.util.Success.flatMap(Try.scala:251)
    at org.biojava.nbio.adam.TryWith$.apply(TryWith.scala:35)
    at org.biojava.nbio.adam.BiojavaAdamContext.loadBiojavaGenbankDna(BiojavaAdamContext.scala:254)
    at org.biojava.nbio.adam.BiojavaAdamContext.loadGenbankDna(BiojavaAdamContext.scala:242)

heuermh commented 4 years ago

See upstream https://github.com/biojava/biojava-adam/issues/34

heuermh commented 4 years ago

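Caused by a broken efetch download that left an incomplete GenBank record. In the excerpt below, the ORF1ab /translation is cut off mid-value (no closing quote, no ORIGIN section, no // terminator) and the next record's LOCUS line follows immediately: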

     CDS             join(224..13426,13426..21513)
                     /gene="ORF1ab"
                     /ribosomal_slippage
                     /codon_start=1
                     /product="ORF1ab polyprotein"
                     /protein_id="QLF98153.1"
                     /translation="MESLVPGFNEKTHVQLSLPVLQVRDVLVRGFGDSVEEVLSEARQ
                     HLKDGTCGLVEVEKGVLPQLEQPYVFIKRSDARTAPHGHVMVELVAELEGIQYGRSGE
                     TLGVLVPHVGEIPVAYRKVLLRKNGNKGAGGHSYGADLKSFDLGDELGTDPYEDFQEN
                     WNTKHSSGVTRELMRELNGGAYTRYVDNNFCGPDGYPLECIKDLLARAGKASCTLSEQ
                     LDFIDTKRGVYCCREHEHEIAWYTERSEKSYELQTPFEIKLAKKFDTFNGECPNFVFP
                     LNSIIKTIQPRVEKKKLDGFMGRIRSVYPVASPNECNQMCLSTLMKCDHCGETSWQTG
                     DFVKATCEFCGTENLTKEGATTCGYLPQNAVVKIYCPACHNSEVGPEHSLAEYHNESG
                     LKTILRKGGRTIAFGGCVFSYVGCHNKCAYWVPRASANIGCNHTGVVGEGSEGLNDNL
                     LEILQKEKVNINIVGDFKLNEEIAIILASFSASTSAFVETVKGLDYKAFKQIVESCGN
                     FKVTKGKAKKGAWNIGEQKSILSPLYAFASEAARVVRSIFSRTLETAQNSVRVLQKAA
                     ITILDGISQYSLRLIDAMMFTSDLATNNLVVMAYITGGVVQLTSQWLTNIFGTVYEKL
                     KPVLDWLEEKFKEGVEFLRDGWEIVKFISTCACEIVGGQIVTCAKEIKESVPTFFKLV
                     NKFLALCADSIIIGGAKLKALNLGETFVTHSKGLYRKCVKSREETGLLMPLKAPKEII
                     FLEGETLPTEVLTEEVVLKTGDLQPLEQPTSEAVEAPLVGTPVCINGLMLLEIKDTEK
                     YCALAPNMMVTNNTFTLKGGAPTKVTFGDDTVIEVQGYKSVNITFELDERIDKVLNEK
                     CSAYTVELGTEVNEFACVVADAVIKTLQPVSELLTPLGIDLDEWSMATYYLFDESGEF
                     KLASHMYCSFYPPDEDEEEGDCEEEEFEPSTQYEYGTEDDYQGKPLEFGATSAALQPE
                     EEQEEDWLDDDSQQTVGQQDGSEDNQTTTIQTIVEVQPQLEMELTPVVQTIEVNSFSG
                     YLKLTDNVYIKNADIVEEAKKVKPTVVVNAANVYLKHGGGVAGALNKATNNAMQVESD
                     DYIATNGPLKVGGSCVLSGHNLAKHCLHVVGPNVNKGEDIQLLKSAYENFNQHEVLLA
                     PLLSAGIFGADPIHSLRVCVDTVRTNVYLAVFDKNLYDKLVSSFLEMKSEKQVEQKIA
                     EIPKEEVKPFITESKPSVEQRKQDDKKIKACVEEVTTTLEETKFLTENLLLYIDINGN
                     LHPDSATLVSDIDITFLKKDAPYIVGDVVQEGVLTAVVIPTKKAGGTTEMLAKALRKV
                     PTDNYITTYPGQGLNGYTVEEAKTVLKKCKSAFYILPSIISNEKQEILGTVSWNLREM
                     LAHAEETRKLMPVCVETKAIVSTIQRKYKGIKIQEGVVDYGARFYFYTSKTTVASLIN
                     TLNDLNETL
LOCUS       MT506216               29814 bp    RNA     linear   VRL 22-MAY-2020
DEFINITION  Severe acute respiratory syndrome coronavirus 2 isolate
            SARS-CoV-2/human/USA/MI-MDHHS-SC20376/2020 ORF1ab polyprotein
            (ORF1ab), ORF1a polyprotein (ORF1ab), surface glycoprotein (S),
            ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein
            (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b),
            ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10
            protein (ORF10) genes, complete cds.
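
A download like this can be checked for truncation before it reaches the parser. The following is only a minimal sketch, not part of biojava or biojava-adam: it relies on the GenBank flat file convention that every record starts with a LOCUS line and ends with a line containing only `//`, and it reports any record that is still open when the next LOCUS line (or the end of the input) is reached. The function name `find_incomplete_records` and the accessions in the usage example are hypothetical.

```python
def find_incomplete_records(lines):
    """Return the names of GenBank records that lack a '//' terminator.

    Assumption: each record begins with a LOCUS line and ends with a
    line consisting solely of '//'.
    """
    incomplete = []
    current = None  # name of the record currently open, if any
    for line in lines:
        if line.startswith("LOCUS"):
            # A new record started before the previous one was terminated.
            if current is not None:
                incomplete.append(current)
            parts = line.split()
            current = parts[1] if len(parts) > 1 else "<unknown>"
        elif line.strip() == "//":
            # Proper end of the current record.
            current = None
    # The last record in the input was never terminated.
    if current is not None:
        incomplete.append(current)
    return incomplete


if __name__ == "__main__":
    # Hypothetical input: the first record is cut off mid /translation.
    sample = [
        "LOCUS       AAA00001               29814 bp    RNA     linear   VRL",
        '                     /translation="MESLVPGFNEKTHVQLSLPVLQVRDVLVRGF',
        "LOCUS       BBB00002               29814 bp    RNA     linear   VRL",
        "ORIGIN",
        "        1 attaaaggtt tataccttcc",
        "//",
    ]
    print(find_incomplete_records(sample))
```

Running a check like this over the efetch output before calling loadGenbankDna would flag the affected record for re-download instead of failing later with a CompoundNotFoundException.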