VIOLINet / Vaxign-ML-docker

GNU General Public License v3.0

(<class 'KeyError'>, KeyError('NP_212213.1 hypothetical protein BB0079 [Borrelia burgdorferi B31]',), <traceback object at 0x7f7a6b2d68c8>) #2

Open abrozzi opened 2 years ago

abrozzi commented 2 years ago

My dear friend Edison,

When I run ./Train.sh I get this error:

VaxML % ./Train.sh /Users/ab/smallPOS.fasta /Users/ab/smallNEG.fasta /Users/ab/out g-  
Saving results to /20220607144338_psortb_gramneg.txt
Saving results to /20220607144440_psortb_gramneg.txt
(<class 'KeyError'>, KeyError('NP_212213.1 hypothetical protein BB0079 [Borrelia burgdorferi B31]',), <traceback object at 0x7f7a6b2d68c8>)

Do you know if I am missing something?

Here are the two files I used, pasted in full:

SMALL NEG FASTA

NP_212213.1 hypothetical protein BB0079 [Borrelia burgdorferi B31] MNIDYFIKNKILHNSGINYQVKTKNLNPSNKLIEKIKAVNPKIKAKTWNEYNKEFYKILKIERNTMLIIL ASIFIVIAVNTYYLQKRIIINKNKAILILLAMGLRIKKIKQIFFIHSIIICTVGGLLGLTLGISISLNIN EILKIIDNLVNTLINFLNQILALKIDGIKIQIVKNTITPKLFLSDLTFTFCFACFSTMYSSMKATKKIGS QKNIETINGQ

NP_212370.1 hypothetical protein BB_0236 [Borreliella burgdorferi B31] MLIFGFIGLFFLNIFSLHAQGIVTNKDAQEEFKWALNSYNNGIYDDALLSFKKILSFDPNNLDYHFWTGN VYYRLGYVEEALMEWRNLKDQGYKVPYLRHLISTIEQRRGIFSNYELNFKKLVKVASLDNSIYKRPHGYQ ITSLRADKYGGYYAANFVGNEILYFDVNNNVNALVKDGFSYLKSPYDVIEANNLLYVTLYSSDEIGVYDK VLGVKRKSIGNKGTKDGELLAPQYMAIDKRNYIYVSEWGNKRVSKFGLEGDFILHFGSRTSGYKGLLGPT GVTYLNENIYVADSLRNTIEVFDTSGNHLYSVFTSIEGIEGLSSDFVGNNVIVSSKDGVYKYSIAKKTIT KILKADKMNSKISSSILDANNQMIVSDFNNAKVSVYKSDASLYDSLNVDVRRIIRLGGPKIYVELNVSSK SGLPVVGLKSENFSISNENYYIVNPKVAYNVNASKDINIAVVFDKSSYMKKYDTDQIVGLNALMELSKNK NFSFINATSVPIIDNIESLTNSIRNTSSLGPYSTDAVKTDVSLKLAGSGLMSKSSRRAVVYFSGGILNRK AFEKYSLDTIVSYYKNNDIRFYLILFGNDPINSKLQYLVNETGGAVIPFSSYEGVSKVYDLILEQKTGTY LLEYYYPGPQEPNKYFNLSVEANINQQTGRGEFAYFIN

NP_212396.1 M23 peptidase domain-containing protein [Borreliella burgdorferi B31] MIIPKKKQRVEKRKKNFLFNSKKSVNFELKDFANISNIGKRRKKVFKIKNFFKKKINFFKKVSLFFYKLK IQNINHYEYKYYYKSLRDKVFDIFSVRFDYKLVFKLNAIIFIFILTFYINIFSYYGSYVFLNRLSLPKDY FIDTFLYYSDQDIAQISSYLPESNVSANVPGFKKNFVLKVFDHKIKPGETLSHVAARYQITSETLISFNE IKDVRNIKPNSVIKVPNMKGIVYIVKKNDSISSIASAYNVPKVDILDSNNLDNEVLFLGQKLFIPGGRLP KDFLKEVLGETFIYPVQGVITSGYGYRPDPFTGVISFHNGIDIANLANTPIKASREGVVVTAGFNAGGYG KYIVISHSNGFQTLYAHLNSFAVKVGKKVSRGAVIGYMGSTGYSTGNHLHFTIFKNGKTENPMKYLR

NP_212734.1 hypothetical protein BB_0600 [Borreliella burgdorferi B31] MAIFLKNKYFYLSLIFIIFLFLFVFSGFLFYSKPIIYDISPIPTSHKDIIVIKGNNLGYSTGEININNNY LVKSSIISWNNTEIVFKITDEVNSGLIFVKGERGTSNELFLVISRQVPVKLNRKNIPFIFSEDKIILNAN SSTLLQGMNLFSPFSTITIFLETKDKLYTILPQNILNVSENRVEFVSPKTLNSSGKLYVLLDNIQSNKVP FSVKNDFFKWTLSDFKEFVIIEEIYFSQDVSSNFDSNPQDINFNIFYLRPIENERQKITERNSEHLDFNI DNLFFENLKTNKFIFKTRVKTYKLNLEFLDAKYLESIEVNRDINNQEYKKYVQDKKKDYLSYSYVDLMSL DSLILSKTSGSNSVYKLAKAIIDVLTSNFKIVENNLSLKDSIEEKKISSGNLIVLTNLLFLKYDIPLRNI VGLYYDSNSLKLKEHFWFEFFLAGVGFVYFDIINAVLFKDSSKYFLNISDNYIQYGCKEDYDKNEFFDGY LDSGFLKYKSLTNGSYSLMHRFVLEDNF

NP_212767.1 exodeoxyribonuclease V subunit beta [Borreliella burgdorferi B31] MNKILEKIQNNTTILIEASAGTGKTHILENVVINLIKTKLYSINEILVLTFTKKATEEMHTRILKVIENA YSNSKTNEILKEAYEQSKKLFISTINKFALHALNNFQIETENYSKYKPKEKFSKEIDEIVYDFLRKSDSL IQALDIKDYELKVFKSDAKKTEEIVLKIKKAYERDTTQELGDWLKTQTAFENILLKKEELIKDYNKIIED LDKMTKDEILSFYNKHIQTGKLEIEYSKENDIFKIAETLLKNKFFSTLIEKETKKNSKLSPKELKIKNDL ICLGINIKHEKYKSEDNRNKNRNNLKQYVILKVEYKILKYIEKELKKTIKSTNTIDQNYIISNLKNYLKS EDKKLLNAIKNRYKIILIDEAQDLSLIQIEIFKILKTAGIKLIFIADPKQIIYSFRKADISFYNKEIKNK INTDARIVLKINHRSSKKLIGPLNKIFNNIYNNAIADEIEKIDFTNSLPNQKNDNNKIVINGQEIEGINI ITTNTESEEDIYQKTALTIKYLLAYGKIAENNKIRNIKMQDIKVLCRGKNEINLIDKALKKEQIQTNKTQ EKFLKTKEFSEIFYIIKCLDRKQSFKTLNYILSSKILNVPWNLQRILIKQDKICLIEEFIENIIVLLEKN EITLINAINKITFEKNLWIKIANITKDQKIIEWAKNKINYKGLLIKEGKLENLKTYETTLEIISKIYHKE QNIQSLISTLESLIINEEPEEIEEKINNINNDNESIELMTIHKSKGLGMNIVFLLNTTPIENSNFFSKKN QFYKFYQDGKIEYDFFKLEENKKYARLKILSEEKNIFYVGATRAKFALFIIKINSITSKLLEIAKIFTID DIKHDFNIHEFIGQKRFNKKKYNTNVNTKLIPPKPIIKNMFKKEYTSSFSSLTAQAHHKEFYENYDFKNI NYEKETELDYEPGLEETLPKGKDIGNILHAAMEEIIFSTAKDTFDNFKKNNIEIIEKQIQKINSNLNTIE IQNSLAKMIYNILTYNIRAINTRLCDIEELQKEMEFLIKINPEFQKQKYLFDKHFEDLHIKLSDGYLKGI VDLIFKANNKIYILDYKTNYLGKNKEDYNITNLENTIKKEYYDLQYKIYALGIKKILFKNKKEYNQKFGG IIYLFTRAFEDNIECLKSKFENGIYFNLPKFNDVDLDKIILELGIKRHL

NP_212856.1 hypothetical protein BB0722 [Borrelia burgdorferi B31] MKINKTFILLFLFTKFSFVQAQANQILTEISPLSILSKNGKGSVYLKVSKSSDYILTLDKSSNSDFVFKI YDISNKKYITDKVKRRDFKIRLDKNSLYAIIYVGTKNENIKFSLTDLDFSILSSDSLKAKTSKIEKEDLF FTLKDLPVLNLTAKLKKYVLRIYKSNIYIAYQLENSDDIKVAEFIEDVGWFNLDSSVNRNITNIVNFDFS INSKGNLYIAFVTKSGADFASELIVKKFNSRKWIDISPGHIENFGSLLNISIDLKDRLYLAYLREIRGEY KINLISNMGYGSIWTDVIHAYLSKGDSNVNSSNIGLISEPFLGIFYNYKSNNEIKSEFIVNNENAWVNAN IPSVYMANFIKGFFDSNFNQIIMSFVSENRPIVNICPLKSSRWINISPNVEMEGLSADIGLYKNNLFLAF EDNNNVRLIYFKNKNWYFLNKLENFKSNVKSPQIGIYGNQGLVISTLSSNSNELFFTLICQ

NP_245935.1 hypothetical protein PM0998 [Pasteurella multocida subsp. multocida str. Pm70] MYLISEFIFQFILYEVCMKNKSKLLACCLMALPISSFSIGNNNLIGVGVSAGNSIYQVKKKTAVEPFLML DLSFGNFYMRGAAGLSELGYQHVFTPSFSTSLFLSPFDGAPIKRKDLKPGYDSIQDRKTQVAVGLGLDYD LSDLFNLPNTNISLEMKKGRRGFNSDITLTRTFMLTDKLSISPSFGLSYYSAKYTNYYFGIKKAELNKTK LKSVYHPKKAYSGHIALNSHYAITDHIGMGLSFSWETYSKAIKKSPIVKRSGEISSALNFYYMF

NP_246737.1 hypothetical protein PM1798 [Pasteurella multocida subsp. multocida str. Pm70] MCKRLANIDFHSIINDQCTGLNKELIMLKSLSLISLITLLSACSMSSYVPFMNDKKAVIDLDKTQIDQKS YATAYEATLATYKGRVNQDYDVHSFSSGANDWYLNRILLPIDKIKENLYQGGHDSNIHAYYSGVVFASAL QSNFNKLNPNCWSYLDAPSVTQGIYDAMKDLQKGKQRAEDDPYIVQGSEQLLKRCAQ

NP_206845.1 GDP-D-mannose dehydratase [Helicobacter pylori 26695] MKEKIALITGVTGQDGSYLAEYLLNLGYEVHGLKRRSSSINTSRIDHLYEDLHSDHKRRFFLHYGDMTDS SNLIHLIATTKPTEIYNLAAQSHVKVSFETPEYTANADGIGTLRILEAMRILGLEKKTRFYQASTSELYG EVLETPQNENTPFNPRSPYAVAKMYAFYITKNYREAYNLFAVNGILFNHESRVRGETFVTRKITRAASAI AYNLTDCLYLGNLDAKRDWGHAKDYVKMMHLMLQAPIPQDYVIATGKTTSVRDFVKMSFEFIGINLEFQN TGIKEIGLIKSVDEKRANALKLNLSHLKKGQIVVRIDERYFRPTEVDLLLGDPTKAEKELDWVREYDLKE LVKDMLEYDLKECQKNLYLQDGGYILRNFYE

NP_206910.1 heat shock protein GrpE [Helicobacter pylori 26695] MKDEHNQEHDHLSQKEPEFCEKACKEQQYEEKQEAGEKEGEIKEDFELKYKEMHEKYLRVHADFENVKKR LERDKSMALEYAYEKIALDLLPVIDALLGAHKSAAEEDKESALTKGLELTMEKLHEVLARHGIEGIECLE EFDPHFHNAIMQVKSEEKENGKIVQVLQQGYKYKGRVLRPAMVSIAKND

NP_207004.1 hypothetical protein HP0205 [Helicobacter pylori 26695] MSELVTEYANATNNLLFKELIKHVSGNSEGIKNFCQCVKEIKKCNTPNKKYNSDEFFIMGKHKQNQLAKI YSYFKKLSEGEIKPQNEDILKKLKSLDEIFKTTDFTKFTPETEVKDIIKEIDEKYPINENFKQQFRTFRL NIGNLKKKIKNSLKYLEKTRKNFERKKESLIREIEYYCKNQKTLEFDYDVLLDNIQQICKKYIASHVVND ASKDIKSMMCQFYLEKIDLLFNSEIEQYRYSDFLESARKFLWEDIKTLDEKSGVHLFPKNIGEIKDKFET NKEKFKQSKNYSEFAEYCRECNPYTAFQNLRNKVQFPLSGGLSYKSYKLVPTMKEYKEPKITDNDLKTAL FTLFDYSSPSEFDQWDWFFRNSLFRKMDFNPYNIWKNLNLDDKKDFAEILEYNMQLKINSLITKEFNKLL AIAEDSSQDSYQLKIRVRHNNKFYDYSKKSTAYEIKLEIHDCRKSHDQNEPIILSQQSTGFQWAFNFMFG FLYNVGSDFSLNKNIIYVMDEPATHLSVPARKEFRRFLKEYAHKNHVTFVLATHDPFLVDTDHLDEIRIV EKETEGSVIKNHFNYPLNNAGKDSDALDKIKRSLGVGQHVFHNPKKHQIIFVEGITDYCYLSAFKLYFNE REFKENPIPFTFLPISGLKNNPNEMKETIQKLCELDNHPIVLTDDDRKDGSDPQRAKSEQFKNANEEMHD PIRILQLSDCDRHFKQIEDCFSANDRKKYAKNKQMELAMAFKTRLLYGEKDDVMSEETKKNFLKLFEWIK KECNNLTIKKEYIKFDYNTPQML

NP_207134.1 hypothetical protein HP0336 [Helicobacter pylori 26695] MVGGGTVKKDLKKAIQYYVKACELNEMFGCLSLVSNSQINKQKLFQYLSKACELNSGNGCRFLGDFYENG KYVKKDLRKAAQYYSKACGLNDQDGCLILGYKQYAGKGVVKNEKQAVKTFEKACRLGSEDACGILNNY

SMALL POS FASTA

CAA34403.1 outer membrane protein B [Rickettsia rickettsii] MAQKPNFLKKLISAGLVTASTATIVASFAGSAMGAAIQQNRTTNGAATTVDGAGFDQTAAPANVGVALNA VITANANNGINFNTPAGSFNGLLLNTANNLAVTVSEDTTLGFITNVVHNAHSFNLTLNAGKTLTITGQGV TNAQAAATKNAQNVVVQFNNGAAIDNNDLKGVGRIDFGAPASTLVFNLANPTTQKAPLILGDNAVIANGV NGTLNVTNGFIQVSNKSFATVKAINIADGQGIIFNTDANNANTLNLQAGGTTINFTGTDGTGRLVLLSKH AAATNFNITGSLGGNLKGVIEFNTVAVDGQLTANAGAANAVIGTNNGAGRAAGFVVSVDNGKVATIDGQV YAKDMVIQSANATGQVNFRHIVDVGADGTTAFKTAASKVTITQDSNFGNTDFGNLAAQIKVPNAITLTGN FTGDASNPGNTAGVITFDANGTLESASADANVAVTNNITAIEASGAGVVQLSGTHAAELRLGNAGSIFKL ADGTVINGKVNQTALVGGALAAGTITLDGSATITGDIGNAGGAAALQRITLANDAKKTLTLGGANIIGAG GGTIDLQANGGTIKLTSTQNNIVVDFDLAIATDQTGVVDASSLTNAQTLTINGKIGTIGANNKTLGQFNI GSSKTVLSNGNVAINELVIGNDGAVQFAHDTYLITRTTNAAGQGKIIFNPVVNNGTTLAAGTNLGSATNP LAEINFGSKGVNVDTVLNVGEGVNLYATNITTTDANVGSFVFNAGGTNIVSGTVGGQQGNKFNTVALENG TTVKFLGNATFNGNTTIAANSTLQIGGNYTADCVASADGTGIVEFVNTGPITVTLNKQAAPVNALKQITV SGPGNVVINEIGNAGNHHGAVTDTIAFENSSLGAVVFLPRGIPFNDAGNTMPLTIKSTVGNKTAKGFDVP SVVVLGVDSVIADGQVIGDQNNIVGLGLGSDNGIIVNATTLYAGISTLNNNQGTVTLSGGVPNTPGTVYG LGTGIGASKFKQVTFTTDYNNLGNIIATNATINDGVTVTTGGIAGIGFDGKITLGSVNGNGNVRFADGIL SNSTSMIGTTKANNGTVTYLGNAFVGNIGDSDTPVASVRFTGSDSGAGLQGNIYSQVIDFGTYNLGIVNS NIILGGGTTAINGKIDLVTNTLTFASGTSTWGNNTSIETTLTLANGNIGHIVILEGAQVNTTTTGTTTIK VQDNANANFSGTQTYTLIQGGARFNGTLGSPNFAVTGSNRFVNYSLIRAANQDYVITRTNNAENVVTNDI ANSPFGGAPGVDQNVTTFVNATNTAAYNNLLLAKNSANSANFVGAIVTDTSAAITNVQLDLAKDIQAQLG NRLGALRYLGTPETAEMAGPEAGAISAAVAAGDEAIDNVAYGIWAKPFYTDAHQSKKGGLAGYKAKTTGV VIGLDTLANDNLMIGAAIGITKTDIKHQDYKKGDKTDVNGFSFSLYGAQQLVKNFFAQGSAIFSLNQVKN KSQRYFFDANGNMSKQIAAGHYDNMTFGGNLTVGYDYNAMQGVLVTPMAGLSYLKSSDENYKETGTTVAN KQVNSKFSDRTDLIVGAKVAGSTMNITDLAVYPEVHAFVVHKVTGRLSKTQSVLDGQVTPCINQPDRTTK TSYNLGLSASIRSDAKMEYGIGYDAQISSKYTAHQGTLKVRVNF

sp|P27053.3|FLAA_CAMCO RecName: Full=Flagellin A MGFRINTNVAALNAKANSDLNSRALDQSLSRLSSGLRINSAADDASGMAIADSLRSQANTLGQAISNGND ALGILQTADKAMDEQLKILDTIKTKATQAAQDGQSLKTRTMLQADINRLMEELDNIANTTSFNGKQLLSG GFTNQEFQIGSSSNQTIKASIGATQSSKIGVTRFETGSQSFSSGTVGLTIKNYNGIEDFKFDSVVISTSV GTGLGALAEEINRNADKTGIRATFDVKSVGAYAIKAGNTSQDFAINGVVIGKVDYSDGDENGSLISAINA VKDTTGVQASKDENGKLVLTSADGRGIKITGSIGVGAGILHTENYGRLSLVKNDGRDINISGTGLSAIGM GATDMISQSSVSLRESKGQISAANADAMGFNAYNGGGAKQIIFASSIAGFMSQAGSGFSAGSGFSVGSGK NYSAILSASIQIVSSARSISSTYVVSTGSGFSAGSGNSQFAALRISTVSAHDETAGVTTLKGAMAVMDIA ETAITNLDQIRADIGSVQNQITSTINNITVTQVNVKSAESQIRDVDFASESANYSKANILAQSGSYAMAQ ANSSQQNVLRLLQ

AAB59097.1 exotoxin type A [Pseudomonas aeruginosa PA103] MHLIPHWIPLVASLGLLAGGSSASAAEEAFDLWNECAKACVLDLKDGVRSSRMSVDPAIADTNGQGVLHY SMVLEGGNDALKLAIDNALSITSDGLTIRLEGGVEPNKPVRYSYTRQARGSWSLNWLVPIGHEKPSNIKV FIHELNAGNQLSHMSPIYTIEMGDELLAKLARDATFFVRAHESNEMQPTLAISHAGVSVVMAQTQPRREK RWSEWASGKVLCLLDPLDGVYNYLAQQRCNLDDTWEGKIYRVLAGNPAKHDLDIKPTVISHRLHFPEGGS LAALTAHQACHLPLETFTRHRQPRGWEQLEQCGYPVQRLVALYLAARLSWNQVDQVIRNALASPGSGGDL GEAIREQPEQARLALTLAAAESERFVRQGTGNDEAGAANADVVSLTCPVAAGECAGPADSGDALLERNYP TGAEFLGDGGDVSFSTRGTQNWTVERLLQAHRQLEERGYVFVGYHGTFLEAAQSIVFGGVRARSQDLDAI WRGFYIAGDPALAYGYAQDQEPDARGRIRNGALLRVYVPRSSLPGFYRTSLTLAAPEAAGEVERLIGHPL PLRLDAITGPEEEGGRLETILGWPLAERTVVIPSAIPTDPRNVGGDLDPSSIPDKEQAISALPDYASQPG KPPREDLK

AAA26380.1 cell surface antigen [Rickettsia rickettsii] MANISPKLFKKAIQQGLKAALFTTSTAAIMLSSSGALGVATGVIATNNNAAFSNNVGNNNWNEITAAGVA NGTPAGGPQNNWAFTYGGDYTVTADAADRIIKAINVAGTTPVGLNITQNTVVGSIITKGNLLPVTLNAGK SLTLNGNNAVAANHGFDAPADNYTGLGNIALGGANAALIIQSAAPSKITLAGNIDGGGIITVKTDAAING TIGNTNALATVNVGAGTATLGGAVIKATTTKLTNAASVLTLTNANAVLTGAIDNTTGGDNVGVLNLNGAL SQVTGDIGNTNSLATISVGAGTATLGGAVIKATTTKLTDAASAVKFTNPVVVTGAIDNTGNANNGIVTFT GNSTVTGNVGNTNALATVNVGAGLLQVQGGVVKANTINLTDNASAVTFTNPVVVTGAIDNTGNANNGIVT FTGNSTVTGDIGNTNALATVNVGAGTATLGGAVIKATTTKLTNAASVLTLTNANAVLTGAIDNTTGGDNV GVLNLNGALSQVTGNIGNTNSLATISVGAGTATLGGAVIKATTTKLTDAASAVKFTNPVVVTGAIDNTGN ANNGIVTFTGNSTVTGDIGNTNSLATISVGAGTATLGGAVIKATTTKLTNAASVLTLTNANAVLTGAIDN TTGGDNVGVLNLNGALSQVTGDIGNTNSLATISVGAGTATLGGAVIKATTTKITNAVSAVKFTNPVVVTG AIDSTGNANNGIVTFTGNSTVTGDIGNTNALATVNVGAGTATLGGAVIKATTTKLTNAASVLTLTNANAV LTGAIDNTTGGDNVGVLNLNGALSQVTGDIGNTNSLATISVGAGTATLGGAVIKATTTKLTNAASVLTLT NANAVLTGAVDNTTGGDNVGVLNLNGALSQVTGDIGNTNSLATISVGAGTATLGGAVIKATTTKLTNAAS VLTLTNANAVLTGAIDNTTGGDNVGVLNLNGALSQVTGDIGNTNSLATISVGAGTATLGGAVIKATTTKL TDAASAVKFTNPVVVTGAIDNTGNANNGIVTFTGNSTVTGNVGNTNALATVNVGAGLLQVQGGVVKANTI NLTDNASAVTFTNPVVVTGAIDNTGNANNGIVTFTGNSTVTGNVGNTNALATVNVGAGLLQVQGGVVKAN TINLTDNASAVTFTNPVVVTGAIDNTGNANNGIVTFTGNSTVTGDIGNTNALATVNVGAGITLQAGGSLA ANNIDFGARSTLEFNGPLDGGGKAIPYYFKGAIANGNNAILNVNTKLLTASHLTIGTVAEINIGAGNLFT IDASVGDVTILNAQNINFRARDSVLVLSNLTGVGVNNILLAADLVAPGADEGTVVFNGGVNGLNVGSNVA GTARNIGDGGGNKFNTLLIYNAVTITDDVNLEGIQNVLINKNADFTSSTAFNAGAIQINDATYTIDANNG NLNIPAGNIQFAHADAQLVLQNSSGNDRTITLGANIDPDNDDEGIVILNSVTAGKKLTIAGGKTFGGAHK LQTILFKGAGDCSTAGTTFNTTNIVLDITGQLELGATTANVVLFNDAVQLTQTGNIGGFLDFNAKNGMVT LNNNVNVAGAVQNTGGTNNGTLIVLGASNLNRVNGIAMLKVGAGNVTIAKGGKVKIGEIQGTGTNTLTLP AHFNLTGSINKTGGQALKLNFMNGGSVSGVVGTAANSVGDITTAGATSFASSVNAKGTATLGGTTSFANT FTNTGAVTLAKGSITSFAKNVTATSFVANSATINFSNSLAFNSNITGGGTTLTLGANQVTYTGTGSFTDT LTLNTTFDGAAKSGGNILIKSGSTLDLSGVSTLALVVTATNFDMNNISPDTKYTVISAETAGGLKPTSKE NVKITINNDNRFVDFTFDASTLTLFAEDIAADVIDGDFAPGGPLANIPNAANIKKSLELMEDAPNGSDAR 
QAFNNFGLMTPLQEADATTHLIQDVVKPSDTIAAVNNQVVASNISSNITALNARMDKVQSGNKGPVSSGD EDMDAKFGAWISPFVGNATQKMCNSISGYKSDTTGGTIGFDGFVSDDLALGLAYTRADTDIKLKNNKTGD KNKVESNIYSLYGLYNVPYENLFVEAIASYSDNKIRSKSRRVIATTLETVGYQTANGKYKSESYTGQLMA GYTYMMPENINLTPLAGLRYSTIKDKGYKETGTTYQNLTVKGKNYNTFDGLLGAKVSSNINVNEIVLTPE LYAMVDYAFKNKVSAIDARLQGMTAPLPTNSFKQSKTSFDVGVGVTAKHKMMEYRINYDTNIGSKYFAQQ GSVKVRVNF

AAA26522.1 IpaB protein [Shigella flexneri] MHNVSTTTTGFPLAKILTSTELGDNTIQAANDAANKLFSLTIADLTANQNINTTNAHSTSNILIPELKAP KSLNASSQLTLLIGNLIQILGEKSLTALTNKITAWKSQQQARQQKNLEFSDKINTLLSETEGLTRDYEKQ INKLKNADSKIKDLENKINQIQTRLSNLDPESPEKKKLSREEIQLTIKKDAAVKDRTLIEQKTLSIHSKL TDKSMQLEKEIDSFSAFSNTASAEQLSTQQKSLTGLASVTQLMATFIQLVGKNNEESLKNDLALFQSLQE SRKTEMERKSDEYAAEVRKAEELNRVMGCVGKILGALLTIVSVVAAAFSGGASLALAAVGLALMVTDAIV QAATGNSFMEQALNPIMKAVIEPLIKLLSDAFTKMLEGLGVDSKKAKMIGSILGAIAGALVLVAAVVLVA TVGKQAAAKLAENIGKIIGKTLTDLIPKFLKNFSSQLDDLITNAVARLNKFLGAAGDEVISKQIISTHLN QAVLLGESVNSATQAGGSVASAVFQNSASTNLADLTLSKYQVEQLSKYISEAIEKFGQLQEVIADLLASM SNSQANRTDVAKAILQQTTA

AAA26524.1 IpaD protein [Shigella flexneri] MNITTLTNSISTSSFSPNNTNGSSTETVNSDIKTTTSSHPVSSLTMLNDTLHNIRTTNQALKKELSQKTL TKTSLEEIALHSSQISMDVNKSAQLLDILSRNEYPINKDARELLHSAPKEAELDGDQMISHRELWAKIAN SINDINEQYLKVYEHAVSSYTQMYQDFSAVLSSLAGWISPGGNDGNSVKLQVNSLKKALEELKEKYKDKP LYPANNTVSQEQANKWLTELGGTIGKVSQKNGGYVVSINMTPIDNMLKSLDNLGGNGEVVLDNAKYQAWN AGFSAEDETMKNNLQTLVQKYSNANSIFDNLVKVLSSTISSCTDTDKLFLHF

AAA67928.1 neutrophil activating protein [Helicobacter pylori] MKTFEILKHLQADAIVLFMKVHNFHWNVKGTDFFNVHKATEEIYEEFADMFDDLAERLVQLGHHPLVTLS EALKLTRVKDETKTSFHSKDIFKEILGDYKHLEKEFKELSNTAEKEGDKVTVTYADDQLAKLQKSIWMLE AHLA

AAC36000.1 outer membrane protein [Neisseria meningitidis] MKKALATLIALALPAAALAEGASGFYVQADAAHAKASSSLGSAKGFSPRISAGYRINDLRFAVDYTRYKN YKAPSTDFKLYSIGASAIYDFDTQSPVKPYLGARLSLNRASVDLGGSDSFSQTSIGLGVLTGVSYAVTPN VDLDAGYRYNYIGKVNTVKNVRSGELSVGVRVKF

AAB07068.1 major outer membrane protein [Chlamydia muridarum] MKKLLKSVLAFAVLGSASSLHALPVGNPAEPSLMIDGILWEGFGGDPCDPCTTWCDAISLRLGYYGDFVF DRVLKTDVNKQFEMGAAPTGDADLTTAPTPASRENPAYGKHMQDAEMFTNAAYMALNIWDRFDVFCTLGA TSGYLKGNSAAFNLVGLFGRDETAVAADDIPNVSLSQAVVELYTDTAFAWSVGARAALWECGCATLGASF QYAQSKPKVEELNVLCNAAEFTINKPKGYVGQEFPLNIKAGTVSATDTKDASIDYHEWQASLALSYRLNM FTPYIGVKWSRASFDADTIRIAQPKLETSILKMTTWNPTISGSGIDVDTKITDTLQIVSLQLNKMKSRKS CGLAIGTTIVDADKYAVTVETRLIDERAAHVNAQFRF

CAA71822.1 cjaA [Campylobacter jejuni] MKKMLLSIFTTFVAVFLAACGGNSDSGASNSLERIKQDGVVRIGVFGDKPPFGYVDEKGVNQGYDIVLAK RIAKELLGDENKVQFVLVEAANRVEFLKSNKVDIILANFTQTPERAEQVDFCLPYMKVALGVAVPQDSNI SSIEDLKDKTLLLNKGTTADAYFTKEYPDIKTLKYDQNTETFAALIDQRGDALSHDNTLLFAWVKEHPEF KMAIKELGNKDVIAPAVKKGDKELKEFIDNLITKLGEEQFFHKAYDETLKSHFGDDVKADDVVIEGGKI

AAC02243.1 outer membrane protein [Pasteurella multocida] MKKTIVALAVAAVAATSANAATVYNQDGTKVDVNGSLRLILKKEKNERGDLVDNGSRVSFKASHDLGEGL SALAYTELRFSKNVPVQVKDQQGEVVREYEVEKLGNNVHVKRLYAGFAYEGLGTLTFGNQLTIGDDVGLS DYTYFNSGINNLLSSGEKAINFKSAEFNGFTFGGAYVFSADADKQALRDGRGFVVAGLYNRKMGDVGFAF EAGYSQKYVKQEVEQNPPAAQKVFKDEKEKAFMVGAELSYAGLALGVDYAQSKVTNVDGKKRALEVGLNY DLNDRAKVYTDFIWEKEGPKGDVTRNRTVAVGFGYKLHKQVETFVEAAWGREKDSDGVTTKNNVVGTGLR VHF

AAC16068.1 catalase [Helicobacter pylori] MVNKDVKQTTAFGAPVWDDNNVITAGPRGPVLLQSTWFLEKLAAFDRERIPERVVHAKGSGAYGTFTVTK DITKYTKAKIFFKVGKKTECFFRFFTVAGERGSADAVRDPRGFAMKYYTEEGNWDLVGNNTPVFFIRDAI KFPDFIHTQKRDPQTNLPNHDMVWDFWSNVPESLYQVTWVMSDRGIPKSFRHMDGFGSHTFSLINAKGER FWVKFHFHTMQGVKHLTNEEAAEVRKYDPDSNQRDLFNAIARGDFPKWKLSIQVMPEEDAKKYRFHPFDV TKIWYLQDYPLMEVGIVELNKNPENYFAEVEQVAFTPANVVPGIGYSPDRMLQGRLFSYGDTHRYRLGVN YPQIPVNKPRCPFHSSSRDGYMQNGYYGSLQNYTPSSLPGYKEDKSARDPKFNLAHIEKEFEVWNWDYRA DDSDYYTQPGDYYRSLPADEKERLHDTIGESLAHVTHKEIVDKQLEHFKKADPKYAEGVKKALEKHQKMM KDMHGKDMHHTKKKK

AAC28879.1 putative secreted protein [Salmonella enterica subsp. enterica serovar Typhimurium str. SL1344] MSSGNILWGSQNPIVFKNSFGVSNADTGSQDDLSQQNPFAEGYGVLLILLMVIQAIANNKFIEVQKNAER ARNTQEKSNEMDEVIAKAAKGDAKTKEEVPEDVIKYMRDNGILIDGMTIDDYMAKYGDHGKLDKGGLQAI KAALDNDANRNTDLMSQGQITIQKMSQELNAVLTQLTGLISKWGEISSMIAQKTYS

AAD04290.1 vacuolating cytotoxin precursor [Helicobacter pylori NCTC 11637 = CCUG 17874 = ATCC 43504] MEIQQTHRKINRPLVSLALVGALVSITPQQSHAAFFTTVIIPAIVGGIATGAAVGTVSGLLSWGLKQAEE ANKTPDKPDKVWRIQAGRGFNNFPHKEYDLYKSLLSSKIDGGWDWGNAARHYWVKGGQWNKLEVDMKDAV GTYKLSGLINFTGGDLDVNMQKATLRLGQFNGNSFTSYKDSADRTTRVDFNAKNILIDNFLEINNRVGSG AGRKASSTVLTLQASEGITSSKNAEISLYDGATLNLASSSVKLMGNVWMGRLQYVGAYLAPSYSTINTSK VTGEVNFNHLTVGDHNAAQAGIIASNKTHIGTLDLWQSAGLNIIAPPEGGYKDKPKDKPSNTTQNNANNN QQNSAQNNNNTQVINPPNSAQKTEIQPTQVINGPFAGGKDTVVNINRINTNADGTIRVGGYKASLTTNAA HLHIGKGGINLSNQASGRSLLVENLTGNITVDGPLRVNNQVGGYALAGSNANFEFKAGTDTKNGTATFNN DISLGRFVNLKVDAHTANFKGIDTGNGGFNTLDFSGVTDKVNINKLITASTNVAIKNFNINELLVKTNGV SVGEYTHFSEDIGSQSRINTVRLETGTRSIFSGGVKFKSGEKLVIDEFYYSPWNYFDARNIKNVEITRKF ASSTPENPWGTSKLMFNNLTLGQNAVMDYSQFSNLTIQGDFINNQGTINYLVRGGKVATLNVGNAAAMMF NNDIDSATGFYKPLIKINSAQDLIKNTEHVLLKAKIIGYGNVSTGTNGISNVNLEEQFKERLALYNNNNR MDTCVVRNTDDIKACGMAIGNQSMVNNPDNYKYLIGKAWKNIGISKTANGSKISVYYLGNSTPTENGGNT TNLPTNTTNNARSANYALVKNAPFAHSATPNLVAINQHDFGTIESVFELANRSKDIDTLYTHSGAKGRDL LQTLLIDSHDAGYARQMIDNTSTGEITKQLNAATTTLNNIASLEHKTSSLQTLSLSNAMILNSRLVNLSR KHTNNIDSFAKRLQALKDQRFASLESAAEVLYQFAPKYEKPTNVWANAIGGASLNNGSNASLYGTSAGVD AYLNGQVEAIVGGFGSYGYSSFSNRANSLNSGANNTNFGVYSRIFANQHEFDFEAQGALGSDQSSLNFKS ALLQDLNQSYNYLAYSAATRASYGYDFAFFKNALVLKPSVGVSYNHLGSTNFKSNSTNKVALSNGSSSQH LFNASANVEARYYYGDTSYFYMNAGVLQEFANFGSSNAVSLNTFKVNAARNPLNTHARVMMGGELQLAKE VFLNLGFVYLHNLISNIGHFASNLGMRYSF
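A note for anyone debugging this: the key in the KeyError above is the full FASTA description line, not just the accession. One possible explanation (an assumption, not verified against the Vaxign-ML source) is that one step of the pipeline stores records under one form of the header while another step looks them up under a different form. The sketch below uses a hypothetical helper, `header_keys`, to show the three forms a tool might derive from the same header line, so you can compare them against the key printed in the error.

```python
def header_keys(header: str) -> dict:
    """Return common ways a tool might derive a record key from a FASTA header.

    Hypothetical helper for illustration; it is not part of Vaxign-ML.
    """
    # Drop the leading '>' marker (if present) and surrounding whitespace.
    text = header.lstrip(">").strip()
    # Many tools keep only the text up to the first whitespace.
    first_token = text.split()[0] if text else ""
    # UniProt-style identifiers look like "sp|P0A326|CATA_BRUME";
    # the accession is the second pipe-separated field.
    parts = first_token.split("|")
    accession = parts[1] if len(parts) >= 3 else first_token
    return {"full": text, "first_token": first_token, "accession": accession}


keys = header_keys("NP_212213.1 hypothetical protein BB0079 [Borrelia burgdorferi B31]")
print(keys["full"])         # the form that appears in the KeyError above
print(keys["first_token"])  # the bare accession
```

If the key in the error matches one form while an intermediate file (for example the PSORTb output) uses another, shortening the headers in the input FASTA to the bare accession may be worth trying.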

FranceCosta commented 1 month ago

I get a similar error:

--------------------- WARNING ---------------------
MSG: Got a sequence without letters. Could not guess alphabet
---------------------------------------------------
[blastall] WARNING: Sequence number 1 had length 0
(<class 'KeyError'>, KeyError('P0A326',), <traceback object at 0x7fa1c700ba88>)

The FASTA file starts with:

>sp|P0A326|CATA_BRUME Catalase OS=Brucella melitensis biotype 1 (strain 16M / ATCC 23456 / NCTC 10094) OX=224914 GN=katA PE=3 SV=2
MTDRPIMTTSAGAPIPDNQNSLTAGERGPILMQDYQLIEKLSHQNRERIPERAVHAKGWG
AYGTLTITGDISRYTKAKVLQPGAQTPMLARFSTVAGELGAADAERDVRGFALKFYTQEG
NWDLVGNNTPVFFVRDPLKFPDFIHTQKRHPRTHLRSATAMWDFWSLSPESLHQVTILMS
DRGLPTDVRHINGYGSHTYSFWNDAGERYWVKFHFKTMQGHKHWTNAEAEQVIGRTREST
QEDLFSAIENGEFPKWKVQVQIMPELDADKTPYNPFDLTKVWPHADYPPIDIGVMELNRN
PENYFTEVENAAFSPSNIVPGIGFSPDKMLQARIFSYADAHRHRLGTHYESIPVNQPKCP
VHHYHRDGQMNVYGGIKTGNPDAYYEPNSFNGPVEQPSAKEPPLCISGNADRYNHRIGND
DYSQPRALFNLFDAAQKQRLFSNIAAAMKGVPGFIVERQLGHFKLIHPEYEAGVRKALKD
AHGYDANTIALNEKITAAE
>sp|P64103|FABA_BRUME 3-hydroxydecanoyl-[acyl-carrier-protein] dehydratase OS=Brucella melitensis biotype 1 (strain 16M / ATCC 23456 / NCTC 10094) OX=224914 GN=fabA PE=3 SV=1
MAEQKSSYGYEELLACGRGEMFGPGNAQLPLPPMLMIHRITEISETGGAFDKGYIRAEYD
VRPDDWYFPCHFQGNPIMPGCLGLDGMWQLTGFFLGWLGEPGRGMALSTGEVKFKGMVRP
HTKLLEYGIDFKRVMRGRLVLGTADGWLKADGELIYQATDLRVGLSKEGSAQ
>sp|P64305|IXTPA_BRUME dITP/XTP pyrophosphatase OS=Brucella melitensis biotype 1 (strain 16M / ATCC 23456 / NCTC 10094) OX=224914 GN=BMEI1772 PE=3 SV=1
MRMLEKGKLIVASHNAGKLREFDGLIGPFGFEVSSVAALGLPEPDETGTTFEENAYIKAL
AAAKATGFPALSDDSGLMVDALDGEPGVYTANWAETEDGKRDFDMAMQKVENLLQEKGAT
TPDKRKARFVSVICLAWPDGEAEYFRGEVEGTLVWPPRGNIGFGYDPVFLPDGYGKTFGE
MTAEEKHGWKPGDASALSHRARAFKLFAEKALNVVSAPAE

But the sequence named in the KeyError (P0A326) does not have length 0!
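Since blastall reports a zero-length sequence while the input record clearly has residues, it may help to rule out problems in the input file itself before looking at the pipeline. The following is a minimal sketch of such a check, not something taken from Vaxign-ML; the function names `read_fasta` and `find_problems` are made up for this example. It flags records with an empty sequence, duplicate identifiers, and sequence data that appears before any `>` header line.

```python
from typing import Iterable, List, Tuple


def read_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs, preserving order."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()  # also removes stray '\r' from Windows line endings
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header line")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def find_problems(records: List[Tuple[str, str]]) -> List[str]:
    """Report empty sequences and duplicate identifiers (first header token)."""
    problems: List[str] = []
    seen = set()
    for header, seq in records:
        rec_id = header.split()[0] if header else ""
        if not seq:
            problems.append(f"{rec_id or '<no id>'}: empty sequence")
        if rec_id in seen:
            problems.append(f"{rec_id}: duplicate identifier")
        seen.add(rec_id)
    return problems
```

To use it, open the FASTA file passed to the pipeline, call `read_fasta` on the file handle, and print the result of `find_problems`. An empty list means the file passes these basic checks, which would suggest the zero-length sequence arises in an intermediate file rather than in the input.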