Rob-murphys closed this issue 4 years ago
Sorry, I am really busy these days because of an important interview. After that I will answer your question.
Because we only give you the intersection of both results.
But why bother removing the domain names that hotpep outputs? That means you can essentially only go to family level with dbcan then, or am I missing something?
I do not quite understand your question. We have the raw output of each of the three tools in the output folder, if you are interested in the unparsed and complete output. For hotpep, you can look at Hotpep.out if you use the standalone run_dbcan package. If you run dbCAN using our website, the link to Hotpep.out is also given on the result page. For example, at http://bcb.unl.edu/dbCAN2/blastation.php?jobid=20200206134124, go to the Hotpep tab -> Download Hotpep output (Frequency > 2.6, Hits > 6).
The raw output dbcan gives from the hotpep run is not the same as that of a solo hotpep run. My question is why. Hotpep gives a domain name, which quickly allows narrowing down to an EC level. The dbcan hotpep output gives signature peptides but no domain names. Why this reduction in information?
For your first question, I got identical raw outputs from a dbCAN run with all three methods (http://bcb.unl.edu/dbCAN2/blastation.php?jobid=20200206134124) and a dbCAN run with only hotpep (http://bcb.unl.edu/dbCAN2/blastation.php?jobid=20200207102306): again, go to the Hotpep tab and check the raw output. If by "solo hotpep run" you mean Peter Busk's original version of hotpep, written in ruby, I do not have an answer for you. But when we rewrote the ruby code in python (working with Peter), we checked very carefully and made sure that the ruby and python versions produce the same outputs. So you had better provide us with some actual data/images so that we can do a test run ourselves.
For your second question, I now understand that you ask for something like figure 5 (col Functions) of https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-017-1625-9. That information can be easily extracted from PPR. We will add that function in our next release of dbCAN.
Yanbin
Hotpep does not give domain names. What you mean are PPR subfamilies, which are associated with EC numbers. The association information is stored in the CAZY_PPR_patterns directory (in both the ruby and python versions) and is continuously updated by Yin Lab (python version). You can get the EC number of a protein by opening Hotpep.out and finding the PPR subfamily number in the second column (for example, GH1 PPR 36), then opening the corresponding file in the CAZY_PPR_patterns directory (in this case, the file GH\GH1\GH1_group_ec.txt). The EC number of this protein is then at line 36: 3.2.1.21. Note that in the latest standalone run_dbcan package, v2.0.3, some scripts were added to directly give the PPR subfamily number (in brackets) in the final overview table. But before running run_dbcan.py, you may need to change the .add() into .update() at line 329 to avoid a type error.
I will try these and look for this information, thanks. But no, I am specifically talking about the domain names given in the summary_all_enxzymes.txt file output by hotpep. Looking at it now, I guess that since it is a summary file, it is not used by db_can. Thank you.
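The manual lookup described above can be scripted. The following is a minimal sketch, not part of run_dbcan: the function names are hypothetical, and it assumes (as the GH1 example suggests) that the class folder is the letter prefix of the family name and that line N of `<family>_group_ec.txt` belongs to PPR subfamily N. The returned line is passed through unparsed, since the exact layout of that file is not shown here.

```python
import re
from pathlib import Path


def ec_file_for(family: str, patterns_dir: str = "CAZY_PPR_patterns") -> Path:
    """Build <patterns_dir>/<class>/<family>/<family>_group_ec.txt.

    Assumption: the class folder is the letter prefix of the family,
    e.g. "GH" for "GH1", matching the GH/GH1/GH1_group_ec.txt example.
    """
    match = re.match(r"[A-Za-z]+", family)
    if match is None:
        raise ValueError(f"cannot derive a CAZy class from {family!r}")
    cazy_class = match.group(0)
    return Path(patterns_dir) / cazy_class / family / f"{family}_group_ec.txt"


def lookup_ec(family: str, ppr_group: int,
              patterns_dir: str = "CAZY_PPR_patterns") -> str:
    """Return the raw text of line `ppr_group` (1-based) of the EC file.

    Assumption: line N of the file describes PPR subfamily N.
    """
    lines = ec_file_for(family, patterns_dir).read_text().splitlines()
    if not 1 <= ppr_group <= len(lines):
        raise IndexError(f"{family} has no PPR subfamily {ppr_group}")
    return lines[ppr_group - 1].strip()
```

With a local copy of the patterns directory, `lookup_ec("GH1", 36)` would return whatever is stored on line 36 of GH1_group_ec.txt, which according to the post above is 3.2.1.21.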
Which version of db_can are you using? The parameters (hit_cut_off and freq_cut_off) of ruby and python Hotpep are different.
I am using the python version
Try these: --hotpep_hits 3 --hotpep_freq 1.0
In the original Hotpep literature, a hit was considered significant if the protein sequence:
However, in the run_dbcan pipeline, the default values of these parameters were once set to 4 and 2.0 and have now been changed to 6 and 2.6 in the latest version.
Also, after looking more into the hotpep output, I found that it provides EC numbers directly. Can this information be accessed when hotpep is run through dbcan?
@Lamm-a Here is the output of Hotpep (python version). There are no EC numbers in our output files. Where did you find the EC numbers?
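To see how the cutoffs change which hits are reported, here is a small sketch. The function is hypothetical and is not the run_dbcan code. It assumes strict "greater than" comparisons, following the "Frequency > 2.6, Hits > 6" label on the website download link, and the example hit (5 hits, frequency 2.2) is made up for illustration.

```python
def passes_cutoffs(hits: int, frequency: float,
                   hit_cut_off: int = 6, freq_cut_off: float = 2.6) -> bool:
    """Keep a hit only if both values exceed their cutoffs.

    Assumption: the comparison is strict (">"), as in the website label.
    """
    return hits > hit_cut_off and frequency > freq_cut_off


# Hypothetical hit, not taken from any real output file.
example_hits, example_frequency = 5, 2.2

settings = {
    "suggested (--hotpep_hits 3 --hotpep_freq 1.0)": (3, 1.0),
    "earlier defaults (4, 2.0)": (4, 2.0),
    "current defaults (6, 2.6)": (6, 2.6),
}
for label, (hit_cut_off, freq_cut_off) in settings.items():
    kept = passes_cutoffs(example_hits, example_frequency,
                          hit_cut_off, freq_cut_off)
    print(f"{label}: {'kept' if kept else 'dropped'}")
```

Under these assumptions the example hit is kept with the suggested values and the earlier defaults, but dropped with the current defaults, which is one way the same sequence can give different results across versions.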
CAZy Family PPR Subfamily Gene ID Frequency Hits Signature Peptides
CE0 8 NP_418729.1 70.0 70 DFQQQG,ALHIAR,PDEDPD,THCPHD,MTSDYA,WKENFP,ATNHQT,AGQSNA,KQLARF,IYGNYQ,FDLMTS,RSSHFS,GQSNAM,CCRGGS,VPCCRG,NHQTQY,DEDPDD,HHPLAT,PDDLST,GEFDLM,SEGTYS,QHFNHM,KENFPH,RTRAAL,APWFCG,LMTSDY,LTHCPH,GLPLPD,AYGEGL,NAMAYG,IPLTHC,CRGGSA,RIKQLA,GASHDA,NIIFVD,DLSTGY,GSAYRS,LATNHQ,DACRWG,HPLATN,TTWYWK,FVDFQQ,TNHQTQ,GNYQNN,RGGSAF,WYWKEN,EFDLMT,THPGGP,HFNHMV,DTPLYQ,GSEGTY,DLMTSD,CGDTTW,DTTWYW,EGTYSE,TPLYQD,IKQLAR,HGASHD,MQGEFD,FVEAIL,HCPHDV,PWFCGD,GQALHI,LHIARK,LYQDLV,HIARKL,YQDLVS,AMAYGE,QALHIA,GDTTWY
CE1 7 NP_418175.3 57.8 69 PQRQVN,IMDNLL,WTPPGY,FVQKLF,KDIAGL,PVLYFY,WLATFS,DIAGLK,LAGLSQ,GGYQAL,ALAGLS,RALAGL,ALVSGM,AEGKIK,SFGWLA,EMDVWR,TGKDIA,TFSGVT,DIIPLI,GRIPQI,IPQIMD,ILVPGS,GKIKPM,QIMDNL,VLYFYH,YHGFGD,YVWTPP,QLRNFT,PQIMDN,VWRPAY,MDNLLA,KIKPML,VTGKDI,QGGYQA,SQGGYQ,QALVSG,LYFYHG,FYPLNA,LAEGKI,GLSQGG,VWTPPG,DNLLAE,PLPVLY,PDTETD,GRALAG,EGKIKP,HEMDVW,GYQALV,QGRIPQ,FYHGFG,GWLATF,LSQGGY,LATFSG,FSGVTT,NLLAEG,WRPAYA,YFYHGF,MDVWRP,AGLSQG,IKPMLV,GKDIAG,FGWLAT,LPVLYF,YQALVS,ATFSGV,DVWRPA,LLAEGK,RIPQIM,KPMLVV
CE4 2 NP_415542.1 12.3 25 LQAFAD,GILFHD,GYYPDN,YGYYPD,TARNIF,ARNIFA,EAWFAQ,SFYTRV,AWFAQN,PESEAW,LFHDDA,QAFADP,RLSPFD,SEAWFA,NYGYYP,LTFDDG,ILFHDD,AWMPVL,LDYVYD,TFDDGY,YYPDNF,APVGSW,RVAWQL,YAWMPV,DLDYVY
CE4 5 NP_414672.1 51.4 64 TFDDGL,IKRHPQ,QFNPHV,EENTRF,AYPVLK,RFRHTS,STTTSV,LSYPFG,SHTHFL,AGFHLA,DEENTR,RTDSLE,HTHFLH,GFHLAV,ITFDDG,SRIKRH,LYILRT,RIKRHP,ENTRFR,YAYPVL,HLAVTT,DGLKSV,LTYHHI,TTSVRA,DLNKPL,KRLYIL,KVKPGD,GKVKPG,SYPFGG,TTTSVR,NLRYPI,NTRFRH,HILRDE,QSHTHF,RDEENT,SLQFMS,FHLAVT,RLYILR,LRTDSL,YILRTD,RSRRAL,GDLNKP,ILRDEE,LKRLYI,ILRTDS,LGDLNK,HTSTTT,YLSYPF,VKPGDN,TRFRHT,HHILRD,FRHTST,RHTSTT,KLKDRL,FDDGLK,LRDEEN,YHHILR,THFLHR,TSVRAF,DDGLKS,TYHHIL,RYAYPV,KPGDNP,LLKRLY
CE8 2 NP_415293.1 63.5 70 LAFGVT,VNILGR,GKPAWY,VTNSGV,PGKPAW,NNGLQL,GAVVFD,AFGVTL,FFVTNS,VFAPAT,QNNGLQ,VAQLGR,TFFVTN,VNSRTQ,QRNLND,INEGFN,WSQNNG,GRQNTF,YVFAPA,KPAWYM,LGRQNT,VVGPAG,FWSQNN,VYVPAA,ALAFGV,VVNSRT,ACSSTP,AVVFDN,NGQVVI,IQAAVD,RQNTFF,WEYNNR,SGRGAV,RTLVTN,EYNNRG,SRLALA,SQNNGL,LVTNSY,QVVIRD,LGDSVD,GTVYVP,FVTNSG,LQLQNL,NILGRQ,RMWEYN,MPGKPA,GQVVIR,TLVTNS,KYMPGK,GKYMPG,RGAVVF,QNTFFV,MWEYNN,TVYVPA,GLQLQN,VSGRGA,ILGRQN,NGLQLQ,NTFFVT,NRMWEY,GRGAVV,AAVDAA,VVFDNT,GDSVDA,LQNLTI,VVIRDS,DSVDAG,QAAVDA,YMPGKP,QLQNLT
CE9 1 NP_415203.1 45.4 69 VQLNGC,LNGCGG,KTIYYR,NGCGGV,FIFAGK,FIDVQL,ADVITK,PARAIG,IFAGKT,YPARAI,DENGTL,VTDATA,VANLTA,KKGTHN,AGKTIY,NAMPYI,TLEIMQ,NADVIT,QALGLH,GCGGVQ,VTLAPE,KVTLAP,TFATHL,RMATLY,GREPGL,AGIVVS,VSAGHS,IDVQLN,TLYPAR,GFIDVQ,CVDENG,FAGKTI,GLCVDE,LCVDEN,DVQLNG,LGLHLE,ANLTAF,ENGTLS,ATLYPA,SAGHSN,KGTHNP,ALGLHL,GIIADG,TAPAGA,LHLEGP,GKTIYY,DATAPA,NLTAFT,GTLSGS,LPTLIT,QLNGCG,GVQFND,ETLEIM,MATLYP,VDENGT,NQALGL,GIALDE,LVTDAT,NGTLSG,GLHLEG,CGGVQF,ATAPAG,ARAIGV,LRMATL,TDATAP,LYPARA,GHSNAT,GGVQFN,AGHSNA
CE11 1 NP_414638.1 20.8 50 MIKQRT,IPIMDG,RARTFG,EFVRHK,PIMDGS,SGHALN,MLDAIG,LDAIGD,ATGVGL,SRARTF,IKQRTL,GDKWAE,RTFGFM,ARTFGF,TLRPAP,RISTVE,LAGLGI,ISRART,DGDKWA,TVEHLN,LNEDGL,VLNEDG,VEDGDK,FMRDIE,AGLGID,HKMLDA,ALAGLG,AYKSGH,GLGIDN,KSGHAL,TFGFMR,LTLRPA,NEDGLR,EIPIMD,APEIPI,ISTVEH,IMDGSA,DAIGDL,PEIPIM,MDGSAA,TGVGLH,STVEHL,FGFMRD,GHALNN,LNNKLL,KQRTLK,KMLDAI,EDGDKW,DEFVRH,GFMRDI
CE11 9 NP_414722.1 3.26 8 EPFFQG,FFQGHF,EAMAQA,FQGHFP,PFFQGH,PFLLVD,QGHFPG,NEPFFQ
CE14 12 NP_414898.4 38.7 55 AHPDDI,GAHPDD,QLNDMI,AVYQAS,FADTRA,LSFMPQ,YQASMV,QILGYE,PQVFES,HPDDIE,DTRAHL,QASMVA,RAHLQL,RVYTMH,CGASLA,YETPST,VMTTGN,AHLQLN,QDHLAV,IPQILG,ALKILG,ADTRAH,LGYETP,PQILGY,LAVYQA,ASLARL,GCGASL,DIELGC,GASLAR,DRHQDH,SMVACR,RHEESR,PDDIEL,LGCGAS,LKILGC,QVFESV,GYETPS,HEESRN,IGAHPD,ASMVAC,AIGAHP,DDIELG,STWLSF,EESRNA,IELGCG,NQIPSD,GILAIG,VYQASM,ELGCGA,IPSDVE,ILAIGA,LAIGAH,TRAHLQ,ILGYET,SVKEEY
GH0 50 NP_416008.1 42.6 50 FNPYRV,NTVFFQ,FDDYFY,RGIWLA,KRGMKV,AHKRGM,SKADWR,WLATVS,VLDPGI,SYADTR,QGLLDY,TVFFQV,EAHKRG,MLDEAH,ASKADW,VSPAGV,ILFRED,KVHAWF,GDRFVL,APQIYW,HAWFNP,YDESYA,LKKQLD,DESYAD,ESYADT,QFMLDE,HKRGMK,GSDTRG,LDWPPV,FFQVKP,LDEAHK,GVPELK,LDYIAP,TVSRLD,VPELKK,MKVHAW,QIYWPF,SPAGVW,GMKVHA,YDPLQF,PGYDPL,ARYDVL,YIAPQI,FREDYL,VWRNRS,RGMKVH,QLDLND,LFREDY,GIWLAT,DGVQFD
GH1 1 NP_417377.1 18.5 39 FIVENG,KYWMTF,WMTFNE,DFLWGG,RTSIAW,TSIAWT,RYGFIY,WGWQID,IAWTRI,DDYRID,AWTRIF,FNEINN,PLFIVE,FLWGGA,YGFIYV,VIASNG,YRIDYL,AEMGFK,GFSYYM,YWMTFN,DYRIDY,KRYGFI,GWQIDP,FAEMGF,GYTPWG,GGAVAA,LFIVEN,VASALA,SIAWTR,LFAEMG,MTFNEI,TFNEIN,SKRYGF,LWGGAV,FRTSIA,VKYWMT,KVKYWM,WTRIFP,WGGAVA
GH1 1 NP_418177.1 16.3 32 GWQIDP,IVENGL,DRYQKP,DFYHRY,PLFIVE,FLWGGA,YGFIYV,VENGLG,WLTFNE,FIVENG,RYGFIY,IDLVSA,FAEMGF,LFIVEN,GFIYVD,SKRYGF,LTFNEI,PNEAGL,TFNEIN,ENGLGA,KPLFIV,ALFAEM,MSKRYG,EDIALF,YQKPLF,DIALFA,IDFYHR,KRYGFI,RYQKPL,AIDFYH,WGWQID,LFAEMG
GH1 1 NP_417196.1 12.0 27 RYQKPL,LTFNEI,RTSIAW,MSKRYG,WGCIDL,YWLTFN,VKYWLT,CIDLVS,FLWGGA,YKEDIA,VIASNG,FYHRYK,AEMGFK,KEDIAL,WLTFNE,DRYQKP,DFYHRY,ENGLGA,IDLVSA,YQKPLF,VASALA,VENGLG,TFNEIN,SKRYGF,FRTSIA,KYWLTF,GCIDLV
GH2 1 NP_414878.1 28.3 64 CEYAHA,AVRCSH,PLLIRG,YGGDFG,RGVNRH,LCEYAH,RHEHHP,NFNAVR,GIFRDV,EDQDMW,LGNESG,NRHEHH,WDWVDQ,PLILCE,LILCEY,PVQYEG,ENYPDR,PRLQGG,ANIETH,IETHGM,RPLILC,HYPNHP,GLYVVD,IRGVNR,YVVDEA,DGSYLE,IWSLGN,VVDEAN,VDEANI,EYAHAM,GSYLED,SLGNES,WSLGNE,YAHAMG,LYVVDE,MSGIFR,AYGGDF,VQYEGG,GGDDSW,PMYARV,AHAMGN,QYEGGG,SGIFRD,DEANIE,HAMGNS,FNAVRC,RCSHYP,YGLYVV,RLQGGF,RPVQYE,GVNRHE,SYLEDQ,YLEDQD,NAVRCS,LIRGVN,VRCSHY,NIETHG,VNRHEH,ILCEYA,LEDQDM,EANIET,IIWSLG,CSHYPN,GGDFGD
GH2 2 NP_416134.1 18.2 51 GIVVID,MWSIAN,YYGWYV,LIARDK,QTIPPG,TSHYPY,DKNHPS,DFFNYA,YTPFEA,PGEGYL,NHPSVV,RYYGWY,HYPYAE,VMWSIA,CLNRYY,RTSHYP,YAGIHR,GEGYLY,NYAGIH,FFNYAG,PSVVMW,WSIANE,WIGANS,IARDKN,NRYYGW,WNFADF,HPSVVM,RDKNHP,GEQVWN,ARDKNH,YPYAEE,YFHDFF,SHYPYA,TVCVNN,GIHRSV,IANEPD,AGIHRS,SIANEP,FHDFFN,VVIDET,VWNFAD,TPFEAD,IVVIDE,KNHPSV,PYAEEM,HDFFNY,CVNNEL,VVMWSI,LNRYYG,WSEEYQ,FNYAGI
GH2 4 YP_026199.1 7.98 26 WEWCDH,LGNESG,HYPNDP,GNESGY,AHAMGN,YAHAMG,KNHPSI,IFRDVY,SLGNES,EYAHAM,KQHNIN,NESGYG,WSLGNE,AMGNGP,YYGRGP,LMKQHN,GIFRDV,VWEWCD,GVNRHD,MGNGPG,HGVNRH,FYELCD,IWSLGN,FRDVYL,HAMGNG,MKQHNI
GH3 1 NP_416636.1 38.3 69 YGLSYT,VLMNGR,LVLVLM,GNAIAD,GAVEGG,NPSGKL,EGFGED,FFAYDV,FGDYNP,YNTVDM,PSGKLP,LSYTTF,SRLKIP,LFGDYN,YGAVEG,ESRLHR,PRSVGQ,ATGKPL,AIADVL,PLVLVL,ALKATG,KIGQLR,QLRLIS,GLSYTT,GKPLVL,YDMGLF,SWSAAG,FGYGLS,GGNAIA,KPLVLV,EGGNAI,TGKPLV,SVGQIP,GTEGGN,YPFGYG,AESRLH,RSVGQI,PFGYGL,NAIADV,IADVLF,YNPSGK,RDPRWG,GFGEDT,DPRWGR,DYNPSG,FPRSVG,NTVDMS,GEDTYL,EKIGQL,LYPFGY,GQLRLI,KYDMGL,MNGRPL,VLFGDY,GDYNPS,GYGLSY,ADVLFG,VLVLMN,SDHGAI,LKATGK,IGQLRL,FGEDTY,SEGFGE,TEGGNA,GSWSAA,LMNGRP,KATGKP,DVLFGD,LVLMNG
GH3 3 NP_415625.1 25.1 55 DISFAP,HFPGHG,PAHVIY,DDLSME,FDGVIF,VIFSDD,FTRLPA,RVQRFR,KHFPGH,DLSMEG,IFSDDL,VQRFRE,TGKHFP,CDMILV,HPLVGG,GFTRLP,LSMEGA,FPGHGA,DGVIFS,GKHFPG,TTGKHF,GRVQRF,GGRVQR,SFAPVL,GCDMIL,SDDLSM,ILFTRN,GGLILF,SMEGAA,PGHGAV,SHKETP,VDQEGG,RFREGF,AGCDMI,DQEGGR,FSDDLS,EGGRVQ,FAPVLD,DMILVC,MPAHVI,DIDISF,QRFREG,AVDQEG,PVMLDV,GVIFSD,IMPAHV,LDAGCD,DAGCDM,IDISFA,GPVMLD,ISFAPV,QEGGRV,MILVCN,AIMPAH,VGGLIL
GH4 1 NP_416248.1 29.1 55 AGMVTE,TTQLRV,GLRTIP,KDADFV,MVTEAV,LRTIPV,GGGSSY,TLDRRE,TQLRVG,DADFVT,QETNGA,FTNPAG,GFIKRY,TNPAGM,TIGGGS,LDRREA,LKDADF,ETNGAG,SYTPEL,EGFIKR,GGAYYS,NPAGMV,DERIPL,ELWLVD,ADFVTT,GGSSYT,GAYYSD,YSDAAC,VTIGGG,GGLFKG,GQETNG,GMVTEA,VTTQLR,AYYSDA,YYSDAA,PAGMVT,GSSYTP,IGGGSS,FVTTQL,DFVTTQ,LFKGLR,LRVGQL,ALKDAD,FKGLRT,GLFKGL,KGLRTI,LFELYK,RTIPVI,SSYTPE,RGGAYY,NFTNPA,INFTNP,SDAACE,GLNHMV,TNGAGG
GH4 3 NP_418543.1 34.5 61 EHFAEY,IGGYEP,IGAGST,RALRTI,GYEPCT,DEYPKR,TIADTL,MDPHTA,EYPKRC,GLCHSV,VGLCHS,TESSEH,VTESSE,ITFIGA,YVNPMA,GYFVTE,FVVVAF,WTGEPS,SSEHFA,DFVVVA,SVQGTA,MMDPHT,SEHFAE,HSVQGT,GGYEPC,HMAFYL,FQIGGY,AGINHM,INHMAF,ESSEHF,TFIGAG,VVAFQI,CHSVQG,FAEYTP,VAFQIG,KITFIG,AFQIGG,FVTESS,QTIADT,HFAEYT,QVGLCH,ADFVVV,GINHMA,FIGAGS,TGEPSV,LGYFVT,YEPCTV,HGDWLP,PLDEYP,VVVAFQ,YFVTES,TNINVQ,DPHTAA,LCHSVQ,QIGGYE,NHMAFY,EPCTVT,IADTLG,PCTVTD,LDEYPK,CTVTDF
GH8 1 NP_417988.1 22.3 46 GWDQHR,KITTSE,RLLLET,RVIDPS,SLLEAG,ITTSEG,ALAAND,ETAPKG,TTSEGQ,LPAWLW,GRVIDP,LLEAGR,GQGWDQ,RFNPSY,GPVGFS,VGFSAA,PKGFSP,APKGFS,FNPSYL,TAPKGF,SEGQSY,GFSPDW,LTLFGQ,EAGRLW,NPSYLP,AGRLWK,DAIRVY,FSPDWV,PAWLWG,EGQSYG,PSYLPP,YDAIRV,LLETAP,MLLPGK,PVGFSA,FGQGWD,TLFGQG,VLTLFG,SYDAIR,LETAPK,WRFNPS,LAANDR,TSEGQS,LEAGRL,LFGQGW,LLLETA
GH13 1 NP_418660.1 15.0 37 EIGMTN,GEEIGM,YIYQGE,SRDNSR,YQGEEI,QGEEIG,YYLHLF,GEMSST,QADLNW,EEIGMT,VGEMSS,QGTPYI,DNSRTP,PYIYQG,IYPKSF,DLNWEN,SRTPMQ,HHLKVD,CNHDQP,FGGSAW,GTPYIY,YQIYPK,NSRTPM,TVGEMS,FNFHHL,QYYLHL,IYQGEE,QIYPKS,RTPMQW,HLKVDY,FHHLKV,WCNHDQ,RDNSRT,TPYIYQ,LNWENP,NFHHLK,NHDQPR
GH13 2 NP_417889.1 20.5 40 NNNAYC,TGCGNT,GYRVHG,NYWGYN,AGIEVI,HVDGFR,GFRFDL,IYEAHV,GNNNAY,AEPWDI,NNAYCQ,GYQVGN,VDGFRF,GIEVIL,IEVILD,AHDGFT,GTPMLL,NEANGE,MLLAGD,FRFDLA,HDGFTL,KLIAEP,KHNEAN,DGFRFD,HNEANG,GGYQVG,IAEPWD,LIAEPW,VKLIAE,PMLLAG,GHRFNP,TAHDGF,NAYCQD,QVGNFP,AYCQDN,YQVGNF,VTAHDG,YGYRVH,LLAGDE,GFTLRD
GH13 3 NP_417890.1 20.2 40 DAVASM,EVVHGK,PLSHDE,ENLEAI,VWAPNA,PGKKLL,RVDAVA,GKKLLF,VILDWV,ILDWVP,EVHLGS,GRENLE,SIYEVH,DEVVHG,SWGYQP,WDGRRH,LSHDEV,GGRENL,YRDYSR,YEVHLG,RENLEA,KFANLR,HDEVVH,RVSVVG,WMHDTL,DGSWGY,WQKFAN,ANLRAY,LPLSHD,LRVDAV,HLGSWR,SHDEVV,WVPGHF,VHLGSW,DWVPGH,KLLFMG,GWMHDT,VDAVAS,GSWGYQ,KKLLFM
GH13 11 NP_414937.2 15.3 42 VVHMLG,KYDTED,QIFPDR,IYYGDE,FNQLDS,DGVFNH,FYGGDL,YGDEVG,QLDSHD,HDTARF,RLDVVH,FPDRFA,ALYLNP,PCIYYG,SVHKYD,WRLDVV,IFPDRF,YYGDEV,DGWRLD,VTALYL,YQIFPD,VHMLGE,GVTALY,CIYYGD,EHFGDA,HKYDTE,TALYLN,LDSHDT,HFGDAR,VFNHSG,LDGVFN,NQLDSH,YDTEDY,DVVHML,DSHDTA,FYQIFP,GWRLDV,VLDGVF,SHDTAR,LGVTAL,LDVVHM,GEHFGD
GH13 13 NP_416437.1 21.9 47 LFDLGE,HDTQPL,YALILL,LDAVKH,LEAPVE,NHDTQP,NGGSVS,DLGEFD,GADEKE,DAVKHI,FDLGEF,FDYLMG,FRLDAV,EFDQKG,GYSVGY,DGFRLD,LFDAPL,GFRLDA,PPAYKG,WFKPLA,LILLRE,WLPPAY,LPPAYK,FKPLAY,YDLFDL,DLFDLG,GGSVSV,DTQPLQ,KPLAYA,GEFDQK,TKYGDK,PWFKPL,NFDYLM,AYALIL,GSVSVW,HIPAWF,FHWYYP,GNFDYL,YSVGYD,RLDAVK,GGYSVG,KHIPAW,LAYALI,SVSVWV,PLAYAL,LGEFDQ,ALILLR
GH13 24 NP_418028.1 55.2 70 GLLLLE,RTDIGD,GTFHGG,PHYAYH,GYATLA,AYHGYY,NVLSYL,DIGDYD,GFDAMI,YAYHGY,YATLAD,LDANMG,GTRSDM,DTRSGT,WHSFND,IDGFRV,IHGWVG,DYGIDG,ATVYFV,WIRTDI,YGIDGF,HSFNDY,DGFRVD,SLAFLP,HYAYHG,QGTRSD,GDFPHY,FWMTGE,IRTDIG,TLADMQ,VLSYLS,GWVGGG,AMINFD,FNVLSY,DFPHYA,SHDTRL,LSSHDT,GIDGFR,FHGGDL,EIGTFH,FRVDTA,PFWMTG,LSYLSS,EQIHGW,TDIGDY,FDAMIN,VDTAKH,RVDTAK,GPTGSD,GFRVDT,PTGSDP,TFHGGD,SSHDTR,SFNDYI,IGTFHG,SYLSSH,YLSSHD,HDTRLF,KGDFPH,LAFLPD,DVVMNH,HGWVGG,FPHYAY,QIHGWV,ATLADM,DAMINF,RSGTPT,NATVYF,TAKHVE,DTAKHV
GH13 37 NP_415825.4 37.0 69 LDAVGF,ITETNV,VVLITY,MVYQFS,FDFVCN,HDGIGL,NRAINR,TNVPHK,IQSILG,DGIGLN,DFVCNH,FPGVPA,VGFMWK,GIGLNP,ITYADQ,DQIDLN,GSRNDY,FNFLAS,LMFDFV,VTRPRA,PPLVLH,SPYEIN,YQFSLP,QFSLPP,SDDGFS,EAHMVY,FSLPPL,MFDFVC,ETNVPH,TETNVP,LGSRND,PYEINV,ASHDGI,AVGFMW,IITETN,FLASHD,ILGSRN,YEINVT,LASHDG,GVPAIY,QSILGS,TYADQF,DGFSVI,VIITET,CNHMSA,FSVIDY,SILGSR,SSDDGF,DAVGFM,NFLASH,LPPLVL,YIQSIL,AHMVYQ,VLITYA,EINVTY,PLLTPF,GFSVID,DDGFSV,FVCNHM,VYQFSL,DVVLIT,LITYAD,VCNHMS,LWTTFS,HMVYQF,VHLLPF,PLVLHA,SHDGIG,RLDAVG
GH18 36 NP_417797.1 64.0 70 FDIEGT,GANNAP,YAPYVD,QNIHGK,PTWGTA,ILPTGL,IKALRE,VDNLNL,GEVFYL,CATSAI,SKIKAL,TEGQNI,GQNIHG,TWGTAY,IHGKCA,VLDFDI,YTLPIL,QHYYDI,VDFTLN,VQGEVF,NNAPLA,GTWVAD,GTKERV,VFYLSD,QYGNPT,KIKALR,AWWVGA,YSKIKA,QYSKIK,ALREAG,YAQYSK,TLPILP,GNPTTC,IVDNLN,YVDFTL,EGQNIH,NAWWVG,LREAGG,DIEGTW,REAGGD,SQYGNP,KALREA,IEGTWV,ANNAPL,HYYDIV,PWKPLG,QGEVFY,MTMDYG,DVQGEV,EGTWVA,IGGANN,RPWKPL,GGANNA,DIVDNL,APYVDF,NAPLAA,YGNPTT,GKCATS,WYTLPI,SIGGAN,DFDIEG,IWYTLP,LDFDIE,KCATSA,HGKCAT,LPTGLT,AQYSKI,NIHGKC,NDVQGE,PYVDFT
GH23 1 NP_417053.2 24.0 55 GKLDYT,SPTGVR,DWRLLA,YVENIR,YGYARG,GYARGH,HMLDAR,EKYLGH,ATSPTG,FDYVDT,VRGMMM,YSVSQQ,ALAAYN,TGVRGM,QATSPT,TYGYAR,GLDYEL,YNMGYA,WRLLAA,LAAYNM,IDWRLL,VSQQLV,YQESHW,RLLAAI,SLVGYL,LDYELA,SQQLVY,PTGVRG,GMMMLT,DYVDTR,YYSVSQ,FADYLG,ADYLGV,QESHWD,ARGHEA,LFDDLD,RGMMML,IWFALA,EEKYLG,FALAAY,AYNMGY,YLGVKL,AAYNMG,GNPDSW,GVRGMM,MGYAHM,SVSQQL,RGHEAY,WFALAA,AYQESH,TSPTGV,YARGHE,RIWFAL,NMGYAH,ERIWFA
GH23 3 NP_418809.4 43.6 69 VNELAR,ESAWNP,FYPMVA,WRYWQA,SAWNPK,VLAYDA,VGASGL,NAGPGR,LSVQAT,YNAGPG,YAMAIA,VAWRLM,QESAWN,AAYNAG,VESIPF,PFSETR,DLSVQA,ERFPLA,YVKNVL,NVLAYD,ARQESA,IGTSYL,IARQES,LLAFSP,ASGLMQ,RELMYW,VRELMY,SAAYNA,RQESAW,DEWRYW,FVNELA,IPFSET,NELARR,GYVKNV,AYNAGP,KDEWRY,AMAIAR,YPLYPY,PLYPYL,RVRELM,PMVAAQ,TRGYVK,SVQATI,FSETRG,ETRGYV,LARYAF,EERFPL,MAIARQ,YPMVAA,RGYVKN,ESIPFS,SYAMAI,KNVLAY,GFYPMV,LEERFP,SETRGY,AIARQE,RGFYPM,SPVGAS,AGPGRV,VKNVLA,ARVREL,GASGLM,SIPFSE,PTLPPA,PVGASG,RFVNEL,GPGRVR,AWRLMG
GH23 4 NP_414747.1 18.9 50 AYNSGE,EPYMYW,TLRAEP,MELVLL,STTAAL,TGRNYG,YLHDVT,SGEGRV,RAEPYM,PIVESA,RKGDSL,MFDGDW,LRAEPY,GDWLLT,YVPKML,AAYNSG,PMELVL,ESAFDP,HDVTLR,IVESAF,RRDVVA,MPMELV,FDGDWL,NSGEGR,VRKGDS,DVVAST,PYMYWI,SYLHDV,LVLLPI,DVMRWN,KSYLHD,LPIVES,DGDWLL,IPSTGR,YNSGEG,DVTLRA,RDVVAS,VLLPIV,LHDVTL,VRSGDT,VESAFD,GRNYGL,ELVLLP,VTLRAE,RNYGLK,VPKMLA,PKMLAL,AEPYMY,ASTTAA,LLPIVE
GH23 7 NP_417438.2 48.3 70 LDKRAH,HLDKRA,YNGGAG,LLMGDD,FLYGQV,SSFNPY,RYAVIT,GAGSVL,LGLMQV,TAYNGG,TSRRYA,SRRYAV,TESSFN,GSVLRV,QTESSF,KDYVKY,LMQVVQ,ITAYNG,KDTNGF,VPNHLD,EPFLYG,SFNPYA,PNHLDK,RRYAVI,MQVVQH,AGPKDY,AYNGGA,RSHINF,PFLYGQ,LILAIM,IAGPKD,DTGTAY,ESSFNP,FNPYAV,SLILAI,NPYAVS,PKDYVK,KRAHKY,MQTESS,LAIMQT,DKRAHK,ILMGQF,SVLRVF,ESRRYL,NGGAGS,IENIWG,AVITAY,ILAIMQ,VLRVFS,YVKYTD,DYVKYT,GGAGSV,TLLMGD,ALGLMQ,LMGQFA,AIMQTE,SHINFD,DILMGQ,IMQTES,NIDTGT,GPKDYV,AGSVLR,TRSHIN,GLMQVV,IDTGTA,FDILMG,YAVITA,VITAYN,DALGLM,NHLDKR
GH23 9 NP_415711.2 24.7 48 ESGGNP,SYANGA,NAIGLM,AGALLR,LLRTFS,HPAPQA,SNAIGL,RTFSSD,LKNPER,AGCSSK,NPERNI,YANGAG,VSKSNA,ALLRTF,VSYANG,LAGCSS,NHPAPQ,PERNIS,QWMPIS,TFSSDR,FSSDRK,APQAPR,GALLRT,ELKNPE,MQWMPI,RAMQWM,AIGLMQ,KNPERN,LMQLKA,PQAPRY,YALVVS,ANGAGA,QAPRYI,AMQWMP,NGAGAL,GLMQLK,RNISMG,GAGALL,PAPQAP,ITAIIA,LRTFSS,ERNISM,QYALVV,QLKAST,MQLKAS,ALVVSY,SELKNP,LITAII
GH24 1 NP_416072.1 49.7 65 DKALAW,ILDQFL,ASFCPY,RRDQES,SFCPYN,NIGPGK,NNCYGQ,EPQKAG,WTICRG,KCFPST,GIASFC,TEPQKA,DQFLDE,FCPYNI,KAGIAS,WWIKDG,RDKALA,FPSTFY,SNNCYG,ACEAIR,TICRGA,KDGGRD,IGPGKC,AIERDK,IASFCP,RWWIKD,RSNNCY,FLDEKE,ICRGAT,AIRWWI,LTEPQK,CEAIRW,GIWTIC,KALAWV,WIKDGG,DQESAL,GPGKCF,PGKCFP,EAIRWW,GKCFPS,YNIGPG,RDQESA,DGGRDC,VPLTEP,ERDKAL,PQKAGI,CFPSTF,GACEAI,LDEKEG,PYNIGP,IERDKA,CPYNIG,PLTEPQ,NAIERD,QKAGIA,IRWWIK,QFLDEK,VNAIER,IRSNNC,AGIASF,GGRDCR,IKDGGR,IWTICR,LSAAVL,DEKEGN
GH24 3 NP_415087.1 48.7 69 RWTYAG,ALLNKD,YSFVYN,RGALYS,NVGAGN,GAIAIA,REIERE,KINQGD,INPYIK,VARQIN,VGVWTV,GALYSF,VLITGP,KGACDQ,INQGDI,GNFRTS,TVARQI,DGLEGV,HTGKDI,GGKQWK,LLNKDL,EIEREV,IEREVC,KQWKGL,LRRWTY,RREIER,SFVYNV,ASVLIT,FVYNVG,LYSFVY,TLLRKI,DQLRRW,RRWTYA,GHTGKD,TRREIE,AGNFRT,GACDQL,STLLRK,QWKGLM,YAGGKQ,SVLITG,LRKINQ,GKQWKG,VYNVGA,ARQINP,GNDGLE,NFRTST,CDQLRR,RKINQG,WTYAGG,LLRKIN,QINPYI,GAGNFR,RQINPY,AIAIAS,YNVGAG,VGAGNF,QLRRWT,TYAGGK,GVWTVC,LNKDLA,ALYSFV,TGKDIM,EREVCL,ACDQLR,NDGLEG,AGGKQW,REVCLW,TSTLLR
GH25 8 NP_416605.2 61.8 70 NFFYST,EERGKL,QVDGIN,RGQVDG,NEYPWW,RPVKSF,ELRKRV,IKATEG,LFLQTV,AFIKAT,FIKATE,RKRVSQ,SDRGQV,AKELRK,IHGIDV,PIIYSG,PAVLDV,VLDVEE,QWLKMV,GAYHYF,KELRKR,YFSPSV,RGKLSA,VNFFYS,DRGQVD,GIDVSR,LQTVDF,VDFNVF,GKLSAK,TIHGID,RGAYHY,GQVDGI,IRLQFA,LRGAYH,YHYFSP,HYFSPS,KRVSQW,HSDRGQ,LDVEER,SAKELR,YQRRPD,YYQRRP,FWQHSD,HGIDVS,RVSQWL,WQHSDR,AVLDVE,LQFAFI,FAFIKA,RFWQHS,KKPIIY,HYYQRR,DVEERG,WVAHYY,DVSRWQ,WRFWQH,AHYYQR,DYIHFY,AYHYFS,KPIIYS,FNEYPW,QHSDRG,YTIHGI,VEERGK,LSAKEL,KLSAKE,ERGKLS,RLFLQT,FFYSTA,FLQTVD
GH31 1 NP_418113.1 55.6 70 GLGERF,SGFGFW,PAWSFG,SRLHGS,CFKTDF,FPVHWG,GFWSHD,GVDCFK,WLTTSF,SHDIGG,WSFGLW,LTTSFT,HDIGGF,PALPPA,FGLLSS,SLRGGL,AWSFGL,MAESLR,AFGLLS,LSSHSR,ESLRGG,WSHDIG,FGFWSH,SHSRLH,FKTDFG,DFGERI,GLWLTT,AESLRG,VFHFDC,YGLGER,GFGFWS,FGLWLT,LWLTTS,CVWINP,HWGGDC,PPAWSF,DRQYML,KNIPFY,SFGLWL,TDFGER,VHWGGD,LRGGLS,DCFWMK,HSRLHG,LFARSA,DIGGFE,SSHSRL,HFDCFW,FWSHDI,GLLSSH,PVHWGG,GERIPT,LDRQYM,VDCFKT,SYRVPW,LLSSHS,LPPAWS,VWINPY,FHFDCF,DCFKTD,WINPYI,LGERFT,VYGLGE,KTDFGE,FDCFWM,FGERIP,RQYMLG,YKNIPF,TTSFTT,ALPPAW
GH31 4 NP_418314.1 55.6 70 GWMADF,LDDGLA,FFMRAG,IGGYTT,LFLHYE,DFGEYL,LLEKLT,RQPELP,ADFGEY,WSLDDG,LWAKCN,GCGEQF,DGLASV,PLFLHY,PQPTFV,PELPDW,QNVDWS,SEQGVG,RTHEGN,QPTFVS,GDQNVD,EYLPTD,AGGDYY,VDWSLD,SLDDGL,FPQPTF,THEGNR,HEGNRP,LLGRQP,MWAGDQ,INPYVA,PPVFYR,WPALWA,QGVGRN,MHNAWP,GLHHSD,WMADFG,WAKCNY,TSFGKR,CGEQFS,PLWTSE,MADFGE,LFFMRA,EQGVGR,DQNVDW,GGYTTL,GEYLPT,IMHNAW,GIWAQD,FPLWTS,DDGLAS,YINPYV,DIGGYT,RPLFLH,HSDIGG,FGEYLP,WAGDQN,SDIGGY,MRTHEG,YGCGEQ,DWSLDD,GVGRNK,LGIQGG,GRQPEL,GIQGGT,AGDQNV,HHSDIG,LHHSDI,IWAQDW,IYGCGE
GH37 1 NP_415715.1 42.0 64 YPLQDG,GWTNGV,LWPVLT,PYVVPG,WAPLQW,PNGWAP,GGEYPL,LSRSQP,LLNRYW,PGGRFR,SLLPLP,QDGFGW,GWAPLQ,YFTMLG,KLVEKY,EYPLQD,WDAPNG,DSYFTM,IVPVDL,LGLAES,WTNGVT,GHIPNG,SQPPFF,HIPNGN,VPVDLN,SGWDFS,VPGGRF,RSQPPF,FGWTNG,YYLSRS,GFGWTN,DAPNGW,GGGGEY,FTMLGL,GGGGGE,WDFSSR,PLQDGF,GGGEYP,NGWAPL,AASGWD,GGRFRE,IPNGNR,GEYPLQ,LQDGFG,WDSYFT,LNRYWD,GWDFSS,MLGLAE,YWDSYF,GLAESG,QQWDAP,YVVPGG,SRSQPP,DFSSRW,APNGWA,YLSRSQ,TMLGLA,VVPGGR,NRYWDD,SYFTML,DGFGWT,YYWDSY,QWDAPN,WPVLTR
GH37 1 NP_417976.1 31.6 47 YYWDSY,MLGLAE,LGLAES,HIPNGN,GLAESG,DSYFTM,NRTYYL,NRYWDD,LQDGFG,YWDSYF,DGFGWT,GGEYPL,AASGWD,LLNRYW,FGWTNG,PLQDGF,YFTMLG,PNGWAP,WDSYFT,SYFTML,LNRYWD,GGGGEY,SRSQPP,LSRSQP,NGWAPL,IPNGNR,GWTNGV,VPGGRF,WPVLTR,YLSRSQ,GHIPNG,TMLGLA,GGGEYP,YPLQDG,GEYPLQ,PNGNRT,NGNRTY,GNRTYY,LWPVLT,FTMLGL,GFGWTN,GWAPLQ,QDGFGW,WAPLQW,EYPLQD,RTYYLS,YYLSRS
GH38 22 NP_415260.1 58.1 64 WNMLNY,GQSRTF,GRPSGI,LLRGVG,PVWNML,SALKKA,KEDLLL,GKEDLL,LLLRPG,MKLNKA,WKEAPV,LTPVQC,LKKAED,QCYNKI,TQFGSL,KMPVPD,GWKEAP,ADTQFG,VGLLGK,LPGQSR,GLREFE,PSGIKM,LREFEV,GLLGKE,DTQFGS,EGLREF,RPGRPS,TLLRGV,RPSGIK,FEVIGE,APVPVW,TVAFSR,PGRPSG,NKIPWD,LLRPGR,KKAEDR,VQCYNK,VLADTQ,LLGKED,WLTPVQ,VPVWNM,ETMMDE,CYNKIP,QSRTFS,ALKKAE,VWNMLN,LADTQF,GVAQQA,LRGVGL,PGQSRT,AWLTPV,EGWKEA,CDATVA,AGVAQQ,DLLLRP,PVPVWN,NQADDH,EFEVIG,EDLLLR,YNKIPW,LRPGRP,DNQADD,DATVAF,LGKEDL
GH43 1 NP_414805.1 18.4 40 TSTFEW,GFDPSL,CPLGRE,DPSLFH,DGKFWL,IATSTF,FDPSLF,GFNPDP,YYIATS,GAFVGL,GEDYYI,DDDGRK,LFHDDD,LSYADG,FFTGAF,EDYYIA,GFFTGA,DYYIAT,STFEWF,APCLSY,TGAFVG,WAPCLS,SGGIWA,FTGAFV,SLFHDD,DMKGNP,FNPDPS,PSLFHD,WFPGVR,TFEWFP,ATSTFE,VARRWQ,YIATST,GGIWAP,HDDDGR,LDMKGN,PLGRET,FHDDDG,FEWFPG,EWFPGV
GH63 2 NP_417551.1 56.6 70 MAHFNP,KLVAYH,RWFSGN,ETLNGN,TLNGNW,WHGHLL,AWHGHL,PGALVQ,AMAHFN,VPLGTA,WSPLFN,QESVDQ,VAYHDW,KPSLAA,AHLYML,PLFNGA,AYHDWW,SWSAAH,VTPSVT,ASYMYS,YHDWWL,PKLVAY,GNWNER,NERNTK,LVAYHD,LGTAAL,QASYMY,FSWSAA,LNGNWR,MYPKLV,WNERNT,YWRGRV,SVDQAS,PEGWSP,GVPEYG,HLYMLY,RVAVKA,NFSWSA,NWNERN,YPKLVA,IYWRGR,NTKPSL,TKPSLA,DIYWRG,EMYPKL,PIVERG,ERNTKP,RNTKPS,AAHLYM,GWSPLF,NGVPEY,WRGRVW,PSLAAW,ANGCAG,LFNGAA,EGWSPL,LQESVD,WSAAHL,SYMYSD,GGNWNE,GAWHGH,SPLFNG,GPEGWS,HGHLLP,DQASYM,VDQASY,VPEYGA,PLGTAA,YMYSDN,ESVDQA
GH65 1 NP_415832.1 10.0 33 GYKGHV,IAAKGL,GLTGEG,GALFPW,KGLTGE,TGEGYK,GNGYLG,VIGPDE,YKGHVF,QILKQA,IHDSSL,EGYKGH,AKGLTG,AAKGLT,GHVFWD,SLSKAI,SSLSKA,GPDEYT,KGHVFW,VFWDTE,GEGYKG,LTGEGY,DSSLSK,DVIGPD,TIHDSS,ILKQAD,LKQADV,FPWESA,PDEYTE,HVFWDT,KQADVV,IGPDEY,HDSSLS
GH73 1 NP_415599.1 13.5 39 LYTSMY,MMLKSM,LESGWG,GYATDP,RLYTSM,FVQMML,MLKSMR,EGMFVQ,TEITTT,QQIAQQ,SMYDQQ,AGYATD,TRLYTS,AALESG,TSMYDQ,LAQAAL,GMFVQM,AQAALE,QVEGMF,QAALES,KSMRDA,ILAQAA,QMMLKS,ESGWGQ,ALESGW,KGLGLA,EITTTE,YTSMYD,TTTEYE,MYDQQI,YDQQIA,SGWGQR,MFVQMM,VEGMFV,ITTTEY,DQQIAQ,VQMMLK,QIAQQM,LILAQA
GH73 5 YP_026230.1 56.6 70 TMAAAE,LRKYPS,AAESGW,NNNLFG,AAAESG,AESGWG,NNLFGM,PDLRKY,RKADQE,WGTSKL,EYSRNS,GTPRKK,NLFGMK,KLKGYS,QDNQRL,GKVKGY,TPRKKA,ERVDII,LKGYST,MVATMA,KVKGYS,NTHPAY,RKYPSG,AMYQDN,RVDIIP,ATMAAA,GWGTSK,LLERVD,GTSKLA,NLNTHP,ADQEVT,KADQEV,SLPDLR,LIAAHM,ESGWGT,PSGTPR,YPSGTP,PRKKAF,SMVATM,LERVDI,QLRKAD,NQRLIA,FAMYQD,SGWGTS,DQEVTA,SGTPRK,LNTHPA,DLRKYP,YLFAMY,DNQRLI,LFAMYQ,QRLIAA,LFGMKC,MIHKLK,HKLKGY,VATMAA,TSKLAR,RLIAAH,LRKADQ,SFRKSR,IHKLKG,KYPSGT,YNNYLF,MYQDNQ,NYLFAM,LPDLRK,RKKAFL,NNYLFA,MAAAES,YQDNQR
GH77 1 NP_417875.1 35.6 69 LRIDHV,IGEDLG,ALRIDH,WLQWLA,ILALES,QPEDWL,ASVGAP,CGALRI,LNPIHA,RIDHVM,GPLGQN,DLAVGV,NWGIGD,VQLYTL,VNIPGT,GIGDFG,ASPYSP,APPDIL,DILGPL,GLYRDL,YPVDDL,YSPSSR,SRRWLN,MVIGED,LPPMDP,QNWGLP,CMVIGE,PSSRRW,RLWWIP,LGLQPE,GLNPIH,NWRRKL,GQNWGL,SSRRWL,SPSSRR,LGPLGQ,EDLGTV,RDLAVG,WGIGDF,GDFGDL,PVDDLL,IGDFGD,PYSPSS,GLPPMD,HDLPTL,GAPPDI,SVGAPP,NPIHAL,THDLPT,VGAPPD,PPDILG,WGLPPM,GALRID,LYRDLA,LYTLRS,PDILGP,VIGEDL,GEDLGT,LAVGVA,YRDLAV,NWGLPP,QLYTLR,LRLWWI,SPYSPS,LGQNWG,ILGPLG,LLRANM,DLGTVP,PLGQNW
GH77 16 NP_417889.1 10.7 12 DGFRFD,TAHDGF,AHDGFT,YWGYNP,TGCGNT,NYWGYN,IAEPWD,GFRFDL,VTAHDG,HDGFTL,VDGFRF,KLIAEP
GH102 2 NP_417293.1 51.5 70 IKGQHF,RGQQYK,FVFFKP,LDVGGA,LIDRGE,HFDIYQ,MVALDV,ASAVPL,NPSFVF,QHFDIY,GQQYKD,AIKGQH,FDIYQG,VASDRS,PTDRGQ,GQHFDI,YRSIGK,IDRGEV,EVPLLD,LMVALD,VQFTGY,NHYGRV,SMQAIR,RLMVAL,HYGRVW,QFTGYY,QQYKDG,LLAEVP,DRGQQY,DNYGNV,ALDVGG,GRVWVL,KVLIDR,GAIKGQ,DMSMQA,VALDVG,DVGGAI,IGKVLI,GASAVP,LLEQNP,LEQNPS,VLIDRG,RVWVLK,NSLMDN,SLMDNF,QNPSFV,GGAIKG,YNHYGR,SIGKVL,ELLEQN,MSMQAI,KGQHFD,ASVASD,GNVQFT,PVKGAS,TGYYTP,GKVLID,VQGSGY,RSIGKV,PSFVFF,VGGAIK,EQNPSF,YGRVWV,VRELLE,NVQFTG,TDRGQQ,SFVFFK,FTGYYT,GYYTPV,SVASDR
GH103 2 NP_417181.1 33.1 64 TPDNVQ,GVETRW,DALATL,TRYNHS,FITPDN,QFMPSS,YGVPPE,GAMGYG,THYAMA,AMGYGQ,VGIIGV,IGSVAN,ETRWGR,GRVMGK,MAVWQL,HYAMAV,AVWQLG,WYGLPN,GQFMPS,YGQFMP,VDAIGS,GSVANY,YAMAVW,IGVETR,EIIVGI,RVMGKT,PEIIVG,RYNHST,MGYGQF,NHSTHY,WGRVMG,SVANYF,GIIGVE,VANYFK,IIVGII,STHYAM,AMAVWQ,SLLRLD,AIGSVA,YWYGLP,TRWGRV,VPPEII,VETRWG,LETFLL,MGKTRI,YQYWYG,AGAMGY,IVGIIG,NVQNGV,HSTHYA,GVPPEI,YGLPNF,RWGRVM,QYWYGL,YNHSTH,DAIGSV,GLPNFY,GDGHIN,ITRYNH,IIGVET,VMGKTR,PPEIIV,GYGQFM,PVDAIG
GH153 1 NP_415542.1 14.0 35 SEAWFA,YYPDNF,KNYGYY,LFHDDA,NYGYYP,QLNGVK,WMSLLQ,NGQHQA,LNGVKN,YGYYPD,AIMAMP,IMAMPY,GILFHD,GVKNYG,PQAKDK,LQLNGV,LKSYDW,QNYADF,GYYPDN,NYADFL,EAWFAQ,ILFHDD,DWTAIM,AWYPKN,LLQLNG,ARNIFA,AWFAQN,VKNYGY,ESEAWF,AQNYAD,WTAIMA,FLKSYD,PESEAW,NGVKNY,TAIMAM
AA2 3 NP_418377.1 70.0 70 GRVDAR,FARAWF,ILAGNV,QTRSPA,EFEKIS,PDPFDP,DGRGGA,KARRLL,AWLTHR,FIRMAW,LFEGRD,EDLIWQ,YVNPEG,FDPSKK,FRGGDK,IRMAWH,EAVDAP,AIQFEA,NDPQAF,LGEDFD,DPSKKR,WLTHRH,GEDFDY,RGQQRF,KAWLTH,PIKQKY,APGRVD,CPFHQG,SGEPLS,PEALAK,TRSPAG,NQLRVD,MGLIYV,DLDVNW,PKEDLI,TLTAPE,ADIIVL,NQHSNR,LENSGF,QLRVDL,ARRLLW,PLNSWP,EPLSAA,AAAAIR,MRYEWK,SGFRTF,FGFGAG,KAQQLT,KEFSKL,KAPLGA,PQRDWD,GSNSVL,NSVLRA,GKCPFH,TFGFGA,WSNYFF,LLEPIA,ARLDVS,KLTHRD,LDMRYE,WPNQLR,GARLAL,FSKLDY,VMNLDR,EFSKLD,ARAWFK,EDVWEP,HPEALA,DYRKEF,YEWVQT
AA3 1 NP_414845.1 11.4 33 IGAGSA,ENLQDH,LPGVGE,LLEAGG,HFLPVA,QEGFGP,GTCKMG,EAGGFI,RGKGLG,NLQDHL,GLGGSS,VDASIM,QMPAAL,MPAALA,GAGSAG,LQDHLE,EAGGPD,LRVVDA,DASIMP,EGFGPM,LLLEAG,GENLQD,VLLLEA,GLRVVD,PGVGEN,VVDASI,LEAGGP,GVGENL,QQEGFG,QYHFLP,RVVDAS,IQYHFL,VGENLQ
AA6 1 NP_415524.1 23.2 62 ADYDAI,QMRTFL,TFLDQT,ELADYD,DYDAII,TGTGGG,GHIETM,GTPTRF,DAIIFG,VLYYSM,TIAGGD,AKVLVL,GTPYGA,GTGGGQ,DQTGGL,IFGTPT,YYSMYG,YGHIET,YSMYGH,KVLVLY,TPYGAT,TLAHHG,PYGATT,LADYDA,YDAIIF,HIETMA,TRFGNM,GGTPYG,TGGGQE,YGATTI,LVLYYS,GGDGSR,FGTPTR,RTFLDQ,SMYGHI,MAKVLV,AGGDGS,AIIFGT,TTLAHH,QTGGLW,TGGLWA,DGSRQP,IAGGDG,TPTRFG,STWTTL,KRVPET,RVPETM,MYGHIE,GDGSRQ,GATTIA,GQMRTF,VLVLYY,WTTLAH,IIFGTP,STGTGG,LYYSMY,AVAEGA,FLDQTG,LDQTGG,GSRQPS,MRTFLD,PTRFGN
AA8 3 NP_414845.1 5.32 12 VDASIM,NLQDHL,DASIMP,GLRVVD,LLEAGG,GAGSAG,VVDASI,LLLEAG,RVVDAS,GTCKMG,VLLLEA,LRVVDA
GT2 3 NP_415567.1 49.8 70 NLMNFR,IFYRRR,LDADSV,FTAGLH,YWGHNA,VWIAYD,VLDADS,VSAGFW,VFAGLR,HDFVEA,DFCRRW,TRVYGP,GNLMNF,VHRAVF,RVYGPL,FVEAAL,RDRRWC,VEAALM,WGHNAI,ICNEDV,WQLGES,FCRRWG,PPNLLD,SAGFWT,AGFWTA,PICNED,RAGWGV,LMRRAG,LGESHY,QLGESH,MPICNE,ILSHDF,YGPLFT,SAPLWF,LMNFRL,HYWGHN,RRAGWG,GPLFTA,EAALMR,ESHYWG,WVSAGF,LSAPLW,HCALAP,LLFLPK,HNAIIR,PNLLDE,KRDRRW,PLFTAG,LSHDFV,AALMRR,FAGLRA,GHNAII,DFVEAA,VYGPLF,GESHYW,LFTAGL,SHYWGH,LAPLPG,MNFRLF,ALMRRA,KRKSGN,DDFCRR,DADSVM,RVFAGL,WIAYDL,NLLDEL,SHDFVE,YLSAPL,AGLRAT,MRRAGW
GT2 6 NP_417990.4 46.7 69 SPDPFE,IGQRIR,TPHHFF,YETVLA,GQRIRW,VQDGND,TFFCGS,CDHVPT,LAWYIA,RWARGM,IFDCDH,LFYGLV,FDCDHV,FFSPDP,AGLATE,ATESLS,LATESL,HVPTRS,VPTRSF,DCDHVP,FLTAPL,FFCGSC,HFFSPD,PHHFFS,ETVTED,TRSFLQ,TFARAD,FCGSCA,HAKAGN,DHVPTR,KAGNIN,PQAAGL,GGIAVE,IAVETV,YGLVQD,RIRWAR,FNVTAK,NVTAKG,VETVTE,PDPFER,HHFFSP,QTPHHF,QRIRWA,PTRSFL,QAAGLA,VTEDAH,ATFFCG,GKFNVT,AKAGNI,FERNLG,LTAPLA,GIAVET,AAGLAT,IPQAAG,DPFERN,AVETVT,TVTEDA,TEDAHT,PFERNL,FYGLVQ,FSPDPF,GLVQDG,CGSCAV,VTAKGG,HIGQRI,IRWARG,LVQDGN,GLATES,VLAWYI
GT2 7 NP_416757.1 42.0 63 LFAVLF,RARPRY,VVIPVY,YGCMLR,TLDADL,NRNYGQ,TCLTTT,PEEIPR,GLLGEY,GCMLRA,GEYIGR,DDGSSD,DADLQN,LQNPPE,HERSTF,LGEYIG,QNPPEE,NPPEEI,GMGLLG,CHERST,LDADLQ,YDVVGT,ADLQNP,LRAYRR,ARPRYF,STFIPI,LINLMY,CMLRAY,TTTPLR,CLTTTP,VRARPR,VIPVYN,GYDVVG,EEIPRL,ITLDAD,PPEEIP,MGLLGE,VSVVIP,FIPILA,RSTFIP,DYGCML,SVVIPV,MLRAYR,DVRARP,EIPRLV,IPILAN,GESKYS,ERSTFI,FIGAQF,VVGTVR,TFIPIL,LLNRNY,RNYGQH,LNRNYG,NLMYDL,AIMAGF,LLGEYI,INLMYD,YIGRIY,EYIGRI,DLQNPP,DVVGTV,IPVYNE
GT2 15 NP_415541.1 34.5 60 ITEDID,SIIGLI,ILMPET,AQGGAE,RWAQGG,GDALLD,EPRALC,GNPRIR,IIGLIK,LWKQRL,IRTRST,RTRSTL,IGLIKR,RWVSPD,AVTGNP,FTVSGV,ISWKLQ,VSPDRG,IKRTQR,WILMPE,WVSPDR,QGGAEV,LIKRTQ,WKQRLR,VCIDGD,KQRLRW,DISWKL,GAVTGN,PRIRTR,DGDALL,RARWVS,RLRWAQ,VGAVTG,WAQGGA,NPRIRT,CWILMP,TVSGVI,SWKLQL,TEDIDI,TGNPRI,EDIDIS,SSIIGL,IDGDAL,SPDRGI,LRWAQG,CIDGDA,QRLRWA,GLIKRT,MITEDI,DMITED,VTGNPR,RIRTRS,LCWILM,DIDISW,LVCIDG,GLWKQR,ARWVSP,PRALCW,IDISWK,LMPETL
GT2 43 NP_415101.1 45.6 69 VCARPG,DVQSLT,KVVCAR,PGPTSK,VPSAGV,DIGFRL,ADCLNN,RVAWDK,KPLAIM,GTCFSR,GIVFQG,HKVVCA,VGTYPN,SAGVGT,GFILHD,NVHKVV,PTSKAD,RPGPTS,ARPGPT,QKSRWI,LRLFNY,YFLWRD,AGVGTC,PLAIMV,TCFSRR,IFVRFP,KADCLN,LHDAED,CARPGP,TSKADC,QVPSAG,LTEDYD,CFSRRA,AIMVPA,GTYPND,GQVPSA,SKADCL,YDIGFR,VVCARP,AWDKTT,NYFLWR,LAIMVP,FLWRDR,EKPLAI,PSAGVG,TEDYDI,EDYDIG,IMVPAW,WDKTTH,GVGTCF,AGFILH,QSLTED,FVRFPV,DAEDVI,FVGTYP,VGTCFS,ILHDAE,ELRLFN,IFVGTY,VAWDKT,GPTSKA,SLTEDY,RQKSRW,HDAEDV,EDVISP,DYDIGF,RLFNYL,TYPNDP,FAGFIL
GT2 59 NP_416559.1 57.6 70 SLAHLA,AKKVQR,YHSLPA,SHQAIF,VSEFSM,EFSMGG,FSMGGV,DAMNKG,LVSEFS,GLVSEF,ALLDFG,VSSDYA,GFWAEL,DYALAA,GDALLD,LFLNSG,AIFFPV,NSGDIF,AKPGWY,HSLPAS,LNSGDI,KVSSDY,SEFSMG,YDAMNK,NGIYDA,DGGSND,HQAIFF,PGWYIY,SFEWIV,FWAELS,GGSNDG,KALYNK,GWYIYH,LDFGDG,ASHQAI,VVDGGS,IYHSLP,IYDAMN,MGGVST,IVVDGG,PGFWAE,AMNKGI,TVAFRN,WIVVDG,WYIYHS,SGDIFH,FLNSGD,GIYDAM,FEWIVV,DAKKVQ,YIYHSL,PASHQA,SDYALA,ITVAFR,LLDFGD,SLPASH,EWIVVD,VPGFWA,GSNDGT,LPASHQ,VDGGSN,DALLDF,LRFVSE,SSDYAL,SMGGVS,DNGIYD,KPGWYI,ALFLNS,YKVSSD,QAIFFP
GT2 72 NP_418072.1 63.0 70 NYQRHY,VWMGVY,HQANAG,SRRWTH,RHYIKI,AGASVA,KITRLL,QCNADW,IAEIFT,RLLHQA,MGVYRR,ITRLLE,RVCHAV,FVDADD,DVAQCN,GASVAR,HVVWMG,QRHYIK,LNYQRH,WTHVVW,RQRMIA,LLEKLN,PDWLRM,VCHAVR,YIKITR,VRLLHQ,VAQCNA,QRMIAE,NDGSTD,TRLLEK,HAVRKE,KNLNYQ,NAGASV,TLMTMA,EALRVC,MIAEIF,WMGVYR,IFTSGM,ANAGAS,ALRVCH,VVWMGV,AFVDAD,RRWTHV,ALEIII,AQCNAD,LDVAQC,IKITRL,GLHHQD,THVVWM,NLNYQR,RLLEKL,LEKLNR,QANAGA,CHAVRK,MYETLM,LRVCHA,GPDWLR,RWTHVV,RMIAEI,AEIFTS,LLHQAN,YQRHYI,ASVARN,LHHQDI,PLYNAG,YETLMT,ETLMTM,HYIKIT,EIFTSG,YVAFVD
GT2 99 NP_416563.1 66.0 70 DDDDEW,LYQIRN,GACAVR,LKAAQD,QIRNKR,AQDYDI,LYANDY,FTLYQI,KAAQDY,GIDDDD,YQLFTL,FYRKHK,ATQILH,YFHFYR,VRNQAI,LAIRAI,FLYAND,AVRNQA,WNRQQL,IDDDDE,KFDRAS,FHFYRK,MPTWNR,SVLRQD,RQQLAI,HGEMQI,HAFLYA,QQLAIR,IVDDCS,RTLLTL,IFLRMV,GEMQIT,WAWRFK,QDYDIF,DYDIFL,NRQQLA,VLRQDY,YGEPWK,DIFLRM,PTWNRQ,FYKRNI,QLAIRA,CAVRNQ,IRNKRM,FLRMVV,QLFTLY,IIVDDC,TWNRQQ,LPLYPK,KYQLFT,AFLYAN,NHGEMQ,YQIRNK,DDDEWT,PLYPKS,ACAVRN,SLPLYP,EATQIL,TLYQIR,GYFHFY,GNQVFT,TGIDDD,LFTLYQ,HFYRKH,WRTLLT,AAQDYD,YMPTWN,KFSGYF,LYPKSP,YDIFLR
GT2 173 NP_416852.1 16.5 35 VDLQDP,IDVDLQ,VFINDG,PIDVDL,FGKEPA,KISLVV,RKTAEW,SLVVPV,KTAEWF,EIVFIN,GKEPAL,EPALFA,MKISLV,IPIDVD,FINDGS,TRNFGK,VPVFNE,INDGSK,LSFTRN,ALFAGL,PVFNEE,PALFAG,KEPALF,PIEVIP,IVFIND,DPIEVI,KRKTAE,FTRNFG,SFTRNF,DVDLQD,DLQDPI,NFGKEP,NDGSKD,TAEWFY,RNFGKE
GT4 18 NP_418088.1 21.5 24 GGLQRD,DFMRIA,GYAHYI,AGLPVL,QRDFMR,LQRDFM,GFNKMP,FGGLQR,FPFGGL,VCGYAH,FNKMPG,NKMPGL,KYFPFG,GLQRDF,LYKYFP,RDFMRI,YKYFPF,PFGGLQ,KMPGLD,CGYAHY,YFPFGG,KGVDRS,MPGLDV,PGLDVY
GT4 21 NP_416554.1 58.7 70 EKQGLE,HIQDYE,IQDYEV,EVDAML,FFPNWS,DAMLGL,NAVITA,IGKYTG,SKLTNI,RCPLYV,VLPSKL,LYSGNI,NIGEKQ,IGEKQG,ILAVGG,LAVGGN,LHIQDY,TNILAV,AMLGLG,DYEVDA,VRVITA,GQGGGK,APPYYP,PELTGI,DAVLPS,LPSKLT,QDYEVD,ADAVLP,SGNIGE,KLTNIL,RVITAP,SSFFPL,GEKQGL,AVLPSK,KYTGEM,VGQGGG,ELTGIG,TAPPYY,FPNWSE,GIGKYT,GINYSP,ITAPPY,VDAMLG,YEVDAM,WRCPLY,GKYTGE,VYGINY,VPTLFC,TELGQL,NYSPEL,INYSPE,CPLYVP,GAADAV,ILVYGI,VITAPP,GNIGEK,VVPTLF,LTGIGK,TGIGKY,AADAVL,LTNILA,SPELTG,YSGNIG,LVYGIN,ELGQLC,YSPELT,VWRCPL,PSKLTN,YGINYS,NILAVG
GT4 22 NP_416548.1 60.8 70 DMEGIP,SVARLT,GFKPSH,AVGIPV,PSHEVK,ARLTEK,FFLLKF,ISVARL,AKLREL,VFIAHF,IPVALM,AIEACR,FLLKFP,YTPEYQ,AAKLRE,GLHVAI,EAMAVG,EKKGLH,EGIPVA,LTEKKG,TAAKLR,FKPSHE,VAIEAC,PAGVTA,DVFIAH,TFVLNQ,KGLHVA,FVLNQI,MAVGIP,HFGPAG,SGIPEL,GDMEGI,KLRELG,VFLLPS,VTAAKL,RLTEKK,VARLTE,GVTAAK,DVFLLP,AGVTAA,AHFGPA,GPWERR,FGPAGV,PWERRL,LRELGV,KKGLHV,SETFVL,ALMEAM,DGDMEG,PGFKPS,LHVAIE,LLKFPL,SRMGVD,IAHFGP,MPGFKP,GIPVAL,LMEAMA,MEAMAV,PVALME,ETFVLN,FIAHFG,ADVFIA,GWLVPE,VSRMGV,MEGIPV,AMAVGI,GPAGVT,VALMEA,HSGIPE,TEKKGL
GT4 29 NP_416561.1 67.3 70 CEALSI,SIGVPV,AGVALD,NYPLIL,SPSQHV,VALDLH,EALSIG,NVRLAE,DNYPLI,SYWLNL,ISPSQH,MLEEYV,HWSVTG,VDNYPL,VWTLHD,SVTGRC,WTLHDH,PSQHVA,VRLAEG,VAHDLR,GGAAGV,SRVDNY,YGKGGK,VTGRCA,HDHWSV,GQQMLE,YGYGKG,HSYWLN,LCEALS,LILCEA,AAGVAL,ILCEAL,FGKFSP,SGQQML,ALSIGV,ILQFNV,EGGAAG,TGRCAF,QFNVRL,HDLRYD,AEGGAA,FISPSQ,LRYDGK,HTFGKF,PLILCE,FNVRLA,GAAGVA,DGCEGW,LAEGGA,WSVTGR,TFGKFS,GKFSPF,RVDNYP,DHWSVT,ATEAIL,INNGID,QMLEEY,GYGKGG,LSIGVP,VLHFHV,RLAEGG,LHDHWS,QQMLEE,RYDGKT,GVALDL,TLHDHW,LQFNVR,AHDLRY,DLRYDG,YPLILC
GT4 149 NP_418085.2 55.3 64 GDGSDF,LRRAKH,PHFSLD,SGFGGM,HFSLDH,FFFCRN,GFGGME,MFFFCR,IAFIGE,QKRVKD,SWPHFS,VSGFGG,EAVSGF,RNDKMD,DHKKHA,FIGEAV,FEGFPM,CRNDKM,ALLLTS,FEKCQA,EGQKRV,GGMETV,EMFFFC,EKCQAY,FLRRAK,ADYHLA,PMTLLE,FPMTLL,AVSGFG,KIAFIG,MTLLEA,AFIGEA,YHLAIS,KARKKS,KFEGQK,KRVKDL,ISSGIK,GEAVSG,WPHFSL,GIPCIS,DGSDFE,ARKKSG,SDFEKC,FSWPHF,FEGQKR,NDKMDK,GFPMTL,EGFPMT,GQKRVK,LAISSG,FSLDHK,SLDHKK,DFEKCQ,HLAISS,FFCRND,HKKHAE,GSDFEK,IGEAVS,LDHKKH,AISSGI,GMETVI,FGGMET,DYHLAI,FCRNDK
GT5 3 NP_417887.1 12.9 37 FEPCGL,SPTYAR,VSPTYA,QKGLDL,LTQLYG,PSRFEP,CGLTQL,HAHDWH,PTYARE,NLAYQG,EPCGLT,HDWHAG,RTGGLA,VRRTGG,GGLADV,SRFEPC,RFEPCG,TGGLAD,TQLYGL,GLTQLY,RRTGGL,AVVSRL,PCGLTQ,TYAREI,VPSRFE,FAVVSR,SFLKAG,GGLADT,KTGGLA,YGTLPL,HNLAYQ,GTLPLV,DWHAGL,TVHNLA,GLADTV,AHDWHA,VHAHDW
GT8 3 NP_418084.1 12.9 25 STKNWT,YNTQFS,SLPSTK,FLFGCG,DQDVLN,PDQDVL,HYIGPT,LYLDAD,PTKPWH,QFSLNY,VLYLDA,YIGPTK,YLDADI,IGPTKP,YFRFVI,KYNTQF,GPTKPW,LPSTKN,PSTKNW,NTQFSL,DKNFLF,KNFLFG,GYFNSG,NFLFGC,TQFSLN
GT8 35 NP_418083.1 52.4 70 LTEKAL,GATKPW,DADVVC,DVVCKG,NYLDGV,VMNVLL,KSELKD,YTIKSE,QYFNSG,NSGVVY,GQYFNS,IKSELK,SELKDK,TKPWHK,PWHKWA,TIYTIK,LYLDAD,WSRAMY,RAMYFR,GVSITS,FKKRYK,LLYLDA,YKHLLV,TQVWSR,HYTGAT,RLLYLD,GVGVSI,KPWHKW,FRLFAF,YFRLFA,SRAMYF,YPDQDV,IYTIKS,YLDGVG,VWSRAM,ADVVCK,KYPDQD,MYFRLF,DQDVMN,EFKKRY,KRYKHL,QVWSRA,TLLIHY,YKYPDQ,FNSGVV,YFNSGV,RYKHLL,LDGVGV,PDQDVM,IHYTGA,AAVVKD,VGVSIT,LIHYTG,YLDADV,NTIYTI,KHLLVQ,ATKPWH,AMYFRL,LNVAYG,TGATKP,TIKSEL,PMQEKA,RLSDPE,LDADVV,YTGATK,KKRYKH,LLIHYT,DGVGVS,VSITSI,LSDPEL
GT9 1 NP_418077.1 33.1 65 HVAAAL,HQSLID,RTGWRG,IGFCPG,SGLMHV,PLGHGA,GWRGEM,DSGLMH,MRYGLL,MSQSLY,PAKRWP,GPSWVG,NSFKSA,GAEFGP,LVPFFA,LGHGAL,NDSGLM,IDVMAP,GLMHVA,CPGAEF,VGDMMM,KRWPHY,MHVAAA,PLVALY,LMHVAA,LPNSFK,GPAKRW,KSALVP,SALVPF,PHYHYA,WCRPLL,AEFGPA,RYGLLN,EMRYGL,SFKSAL,PSWVGD,RGEMRY,WPHYHY,LVALYG,PGAEFG,YVLPNS,GDMMMS,QSLIDI,GYHQSL,YHQSLI,WVGDMM,FTPPLS,YGLLND,TGWRGE,AKRWPH,PNSFKS,IGPSWV,VLPNSF,AYVLPN,GEMRYG,WRGEMR,FGPAKR,MPLGHG,FGSAKD,SWVGDM,EFGPAK,RWPHYH,GFCPGA,ALVPFF,FCPGAE
GT9 2 NP_418078.1 45.1 70 GLSHLT,LSHLTA,TGLSHL,IRRWRK,VKTSSM,VLIVKT,PTDPGL,RVIPVA,KLPWGA,PALTDA,AKSLGY,ALDRPN,EGFAQI,GFAQIP,SMGDVL,LIGGYG,HTLPAL,REPLAS,DPGLIG,QQHAVE,IKLPWG,VIDAQG,SSMGDV,AVERTR,SHLTAA,EEGFAQ,PVAIRR,VFLHAT,GDYAIA,WVVEEG,IDAQGL,MGDVLH,LHATTR,AREPLA,QHAVER,HLTAAL,AALDRP,DWVVEE,KTSSMG,AIRRWR,LFAKSL,TDPGLI,VLHTLP,VVSVDT,VSVDTG,LHTLPA,SVDTGL,IVKTSS,GDVLHT,DTGLSH,VIPVAI,IPVAIR,GPTDPG,HATTRD,FLHATT,LTAALD,DVLHTL,LIVKTS,TSSMGD,VVEEGF,LPALTD,VDTGLS,FAKSLG,TLPALT,VAIRRW,VEEGFA,GLIGGY,FDWVVE,PGLIGG,HAVERT
GT9 3 NP_418089.1 8.58 17 VVIQPT,HGDMLL,IQPTAR,IGVDSA,LTTPVI,DMLLTT,LLTTPV,VIQPTA,ALIDHA,GVDSAP,TTPVIS,MLLTTP,GDMLLT,LFIGVD,QPTARQ,FIGVDS,YVVIQP
GT9 98 NP_418080.1 51.7 69 LHLPVD,IYTVAL,HKKKRF,FDIVLD,NNKLGD,TVALTK,IKINFL,VLDPFE,AKKICR,LCKHLR,YSDFVI,YPSHLI,KRFYIN,ETMPSF,MRLGTF,LGDLIV,LRDLQF,ALVHIA,DLIVLS,PFETMP,INKIKI,IVLDPF,YHPHDE,IHDNNK,GAKKIC,WYKRYY,VISVDT,LGAKKI,SCLIIH,IQIVSP,ISVDTA,PSHLIW,DPFETM,FETMPS,DNNKLG,KKICRL,FIYTVA,EIETLP,GDLIVL,TYTVKD,LDPFET,YTVALT,KIKINF,VSPTYT,TFEQIK,RELYSK,LPQDLL,PSFKHS,KDIDTE,SVDTAL,KINFLS,INFLSF,KLGDLI,TLCKHL,KEHMST,HDNNKL,EFIYTV,TALVHI,DTALVH,LIVLSS,FHKKKR,CLIIHD,NKIKIN,VDTALV,SKGVKI,YSFYHP,NKLGDL,DIVLDP,CRLTFE
GT19 1 NP_414724.1 31.8 61 MVVGYR,VSPSVW,GAGLIR,VSLPNL,PFEKAF,ELAVMG,LLASGT,GDILGA,LASGTA,YVSLPN,WRQKRV,YVSPSV,VWAWRQ,SLPNLL,IDAPDF,EVLGRL,PDVFVG,CRFIGH,FLPFEK,DAPDFN,LPNLLA,TIHYVS,PMVVGY,FIGHTM,PSVWAW,HTMADA,ALECML,LGAGLI,LPFEKA,FVGIDA,IGHTMA,SVWAWR,EKAFYD,LLPGSR,RFIGHT,SPSVWA,ALVAGE,ILGAGL,LAFLPF,SGDILG,GIDAPD,ASGTAA,AWRQKR,SGTAAL,IHYVSP,HYVSPS,WAWRQK,AFLPFE,MEELAV,VLAFLP,VGIDAP,DILGAG,FEKAFY,VEVLGR,ALLPGS,PCRFIG,DVFVGI,VFVGID,GTAALE,GHTMAD,EELAVM
GT20 1 NP_416410.1 16.8 35 MNLVAK,PLRDGM,VTPLRD,RLDYSK,WFGWSG,LHIPFP,AKEYVA,VVVSNR,FFLHIP,GGLAVG,AGGLAV,RDGMNL,TPLRDG,LVAKEY,LVVVSN,GMNLVA,GVLVLS,NLVAKE,NRIGFF,LRDGMN,LDYSKG,FLHIPF,VAKEYV,EYVAAQ,GLVTPL,PGVLVL,RLVVVS,GFFLHI,LVTPLR,DGMNLV,ALIVNP,RIGFFL,DYSKGL,IGFFLH,KEYVAA
GT26 1 NP_418242.1 45.2 64 SRVAGA,YRLLSQ,GGTYDV,FTGHVK,VTVAMG,ADGISV,VAMGSP,MGSPKQ,RLLSQP,GADLWE,EWLYRL,GLEWLY,VNIVGS,KYADGI,GISVVR,PVFLVG,VKRAPK,YDVFTG,TGHVKR,GHVKRA,RVAGAD,ISVVRS,LEWLYR,AINAEK,AMGSPK,IVTVAM,LVAINA,TPVFLV,SQDGYF,YMGVGG,VGSQDG,VAINAE,GSQDGY,DVFTGH,TYDVFT,LGLEWL,LYRLLS,DGISVV,GVGGTY,VFTGHV,WLYRLL,GTPVFL,LYMGVG,GSPKQE,VAGADL,IVGSQD,EGTPVF,GTYDVF,TLVAIN,AGADLW,NIVGSQ,WNVNIV,LVGGKP,TVAMGS,FLVGGK,NVNIVG,GTLVAI,YADGIS,VFLVGG,VSRVAG,VGGTYD,MGVGGT,ALYMGV,HVKRAP
GT28 1 NP_414632.1 25.6 60 VVLGMG,VLVVGG,DVVLGM,GMGGYV,RWLGTA,AAAGLP,WLGTAD,GGHVFP,LGTADR,PDVVLG,EFIDDM,FIDDMA,TGGHVF,AGGTGG,VGNPVR,GSQGAR,PGLAVA,GTGGHV,VLGMGG,AYAWAD,DMAAAY,SGLRGK,GALTVS,MVMAGG,GLAVAH,GPGGLA,GGTGGH,AFPGAF,LVVGGS,MAGGTG,VVGNPV,VSGPGG,SGALTV,GGYVSG,ADRMEA,RSGALT,ALTVSE,LGMGGY,GLRGKG,FPGLAV,GYVSGP,VMAGGT,CRSGAL,YVSGPG,LVPKHG,ISGLRG,PGGLAA,VLHEQN,MAAAYA,MGGYVS,IAGLTN,VFPGLA,HVFPGL,TADRME,LMVMAG,GHVFPG,GGSQGA,VPKHGI,GTADRM,HKDRQQ
GT30 1 NP_418090.1 25.6 56 GSLKFD,MGELML,ETELWP,LILVPR,TPTGSE,LPITVT,LVPRHP,STHEGE,GGHNPL,TTMTPT,RPVWIA,IANARL,PVLMGP,LWPNLI,SVGETL,LLLILV,VPRHPE,TELWPN,RLSARS,ELMLLY,YLPYDL,ILVPRH,MTPTGS,LMLLYG,NQGALQ,LLILVP,RKAPAY,VYLPYD,RGGHNP,VGETLA,GELMLL,GALQRL,GPHTFN,GDTMGE,DTMGEL,VGGSLV,PTGSER,PRHPER,VTTMTP,ELWPNL,IMETEL,HVYLPY,GHNPLE,VSVGET,LPYDLP,AFVGGS,ANARLS,ALQRLL,TMGELM,PHTFNF,TMTPTG,METELW,FVGGSL,GETLAA,NARLSA,RHPERF
GT35 1 YP_026218.1 24.6 40 SGTGNM,ACFLDS,GNGGLG,NGGLGR,ASGTGN,EASGTG,GTLDGA,ALNGAL,TLDGAN,DGANVE,GKEASG,LGNGGL,ALGNGG,QLNDTH,NGVAAL,AGKEAS,SEQIST,RLAACF,EQISTA,EFLIGR,LNDTHP,IQLNDT,GGLGRL,GRLAAC,LGRLAA,LDGANV,LAACFL,GLGRLA,AYTNHT,GTGNMK,HEYKRQ,GANVEI,AACFLD,GVAALH,QISTAG,LNGALT,KEASGT,YTNHTL,KAAPGY,VLYPND
GT35 9 NP_417886.1 29.2 39 YVDCQD,MGYFSS,AMLNIA,RTIKEY,NIANMG,IWHIDP,SYVDCQ,YRSYVD,DCQDKV,DKVDEL,MLNIAN,DHYQVL,ANMGYF,HIDPVR,TIKEYA,WHIDPV,GYFSSD,NMGYFS,YQVLAD,YFSSDR,DYRSYV,LNIANM,LADYRS,VDCQDK,QDKVDE,RSYVDC,IANMGY,HYQVLA,QVLADY,IDPVRL,DRTIKE,CQDKVD,ADYRSY,SSDRTI,SDRTIK,VLADYR,FSSDRT,KVDELY,KAMLNI
GT51 1 NP_417855.4 12.7 38 ARNFFL,EILELY,GGKTGT,FNRATQ,GLPKAP,ASTITQ,AAAQVY,HGYRGP,RHGYRG,LVGGFD,KDEILE,GKTGTT,NKIYLG,ELYLNK,STITQQ,LPKAPS,VGGFDF,DAWFSG,IAGLPK,RRNVVL,TQQLAR,ALVGGF,LYLNKI,KTGTTN,LELYLN,ITQQLA,GASTIT,QQLARN,EDSRFY,NRATQA,ILELYL,RNFFLS,AGLPKA,ATEDSR,GGFDFN,QGASTI,TITQQL,SQGAST
GT51 2 NP_414691.1 24.8 61 LERRNL,LSLDQQ,TVNLGM,PQPAFM,YLTALS,RYSKDR,LLVGMV,STLTQQ,TYLTAL,IRGFPL,NPWRNP,GMVKGA,ELSLDQ,VGMVKG,RPLGVQ,AGRTVQ,VKNLFL,VYLGQS,EVYLGQ,LAGKTG,DRILEL,RGFPLA,AGKTGT,KTGTTN,TQQLVK,LTQQLV,RILELY,SLYYFG,YSKDRI,ALERRN,GSLAKP,LVGMVK,KDRILE,NVPTVN,GFPLAS,ARPLGV,RAMVGG,LYYFGR,VPTVNL,GRTVQG,KNLFLS,NEVYLG,YLGQSG,YFGRPV,QLVKNL,QPAFMQ,ALLVGM,SKDRIL,TLTQQL,LVKNLF,YNPWRN,GKTGTT,ERRNLV,TIASGG,MVKGAS,VRAMVG,PTVNLG,YYFGRP,QQLVKN,RRNLVL,SLDQQA
GT51 3 NP_417014.1 35.6 61 PFGGTL,TGTSYG,LAWKTG,YEDRWF,GRPDGT,SLILGG,PLWRFA,LSLILG,PSRLRP,TLTMQV,RLRPDR,GGTLQG,LEWHLS,LNLPAV,GGSTLT,WKTGTS,AVLPQA,QAPSRL,DYRPGN,LQDVPR,QLEWHL,YRPGNF,QVARLL,LWRFAD,SGGSTL,LLQDVP,EWHLSK,GTPLWR,YLNRAP,PDGTPV,STLTMQ,APSRLR,SRLRPD,SYGYRD,GTSYGY,APFGGT,MQVARL,TSYGYR,GSTLTM,SLLQDV,AWKTGT,HPGVNP,AASWAY,VLPQAP,NLPAVQ,LNRAPF,RAPFGG,TMQVAR,YGYRDA,LTMQVA,GYRDAW,RPDGTP,QDVPRR,NRAPFG,LPQAPS,SLNLPA,FGGTLQ,KTGTSY,YRDAWA,LAVLPQ,PQAPSR
GT51 11 NP_417675.1 26.0 51 NIAEFG,SQQTAK,TVYLNI,IAAEDQ,RILTVY,AAVLPN,RIRGAS,QTAKNL,LTVYLN,PFSAVM,WILRQM,AHSDWV,AVLPNP,IRGAST,LFLWDG,SEAALL,FLWDGR,WDGRSW,GLEAGL,QQTAKN,LWDGRS,LLAAVL,TAKNLF,FPEHWG,AEDQKF,ALLAAV,EDQKFP,YLNIAE,KNLFLW,AAEDQK,EAALLA,LAAVLP,AALLAA,VPFSAV,VYLNIA,NLFLWD,VAHSDW,RKGLEA,AVIAAE,LEAGLT,FGVEAA,YVAHSD,KGLEAG,EHWGFD,LAVIAA,ILTVYL,VIAAED,HWGFDV,LNIAEF,PVPFSA,AKNLFL
GT56 1 YP_026257.1 45.7 61 FATRGD,LWLALL,IWGADL,VGNSGD,LGSDIP,GSDIPH,TILVGN,ARQQGI,WLALLS,DIPHHN,LGYFIF,LIHVLG,PFWQDM,RQQGIG,FARQQG,TLCLLI,HVLGSD,FHGQFN,PTRMDP,CLLIQA,FFHGQF,RFFFHG,IHVLGS,QRFFFH,NPFWQD,WGADLY,HIWGAD,LCLLIQ,LFYPLR,QGIGTL,REAQRQ,WHIWGA,QQFGDT,QQGIGT,PCVLNR,GTLCLL,LVGNSG,EAQRQL,ILVGNS,VLRFFN,NSGDRS,QFGDTV,SDIPHH,FPTRMD,AQRQLA,IGTLCL,VFATRG,GNSGDR,DLGYFI,LALLSG,LRFFND,GADLYE,CDLGYF,VREAQR,VPMGYP,VLGSDI,SGDRSN,FFFHGQ,ATRGDL,TVLRFF,GIGTLC
GT73 2 NP_418081.1 34.2 46 LFKILP,CSCHTI,RREKGG,NLSDDT,RIICSG,YRREKG,FYRREK,PMPSEL,LSDDTA,GLDLTG,IFLSGP,SVPLSK,DLTGSC,DLFKIL,FCKDIS,DVRFLH,KIKFNI,LDLTGS,FKILPF,SFYRRE,TDVRFL,ISVPLS,GFCKDI,KKIKFN,LTDVRF,SGLDLT,IICSGL,AYSLKY,KILPFF,LLISVP,CSGLDL,GYCSCH,SDDTAI,IIFLSG,IAVNGS,RSFYRR,FLSGPT,LISVPL,ICSGLD,REKGGF,MPSELS,IKFNIL,LSGPTS,QIAYSL,SCHTIA,YCSCHT
GT83 1 NP_416760.1 35.5 54 EKPIAG,WVEHIQ,GLRYFE,AKGKLP,KPIAGY,LALAVP,TYAVLD,YFFWVE,AGYWIN,RYFEKP,WHYFFW,LWQPDE,LGLALL,QPDETR,FFWVEH,YFEKPI,FEKPIA,GKLPTY,IGTYAV,KGKLPT,ETRYAE,LLWQPD,KGFLAL,ISREML,GFLALA,AAGLAW,GTYAVL,ALAVPV,PIAGYW,MTKGFL,PLLFFS,FLALAV,VEHIQR,DETRYA,LPTYIL,LLPGAL,RLLWQP,HIQRFA,YAVLDP,RYAEIS,WQPDET,FWVEHI,EHIQRF,APFWYY,AEISRE,EISREM,LRYFEK,PDETRY,SREMLA,TKGFLA,KLPTYI,YAEISR,TRYAEI,LGLRYF
CBM5 17 NP_417797.1 30.0 30 ATAAEI,IYENAW,GKIYEN,KRTATA,WRLKRT,ENAWWV,WVSSTN,TATAAE,NTLSCE,TAAEIS,FDGKIY,AAEISQ,YENAWW,VIFDGK,NDATNP,ATNPWR,SSTNCP,KIYENA,RTATAA,TNPWRL,WWVSST,NPWRLK,LKRTAT,DATNPW,PWRLKR,RLKRTA,QVIFDG,VSSTNC,AWWVSS,IFDGKI
CBM34 1 NP_414937.2 18.2 24 RYSFKL,SGQPRR,QPGVTA,VTAWRA,ARLEQF,WFTPQG,GQPRRR,PRRRYS,TPQGFS,FTPQGF,PARLEQ,RRRYSF,GVTAWR,PPARLE,RRYSFK,YSFKLL,EQFAVD,PQGFSR,SFKLLW,LEQFAV,QPQPGV,QPRRRY,RLEQFA,PGVTAW
CBM48 8 NP_417890.1 19.1 30 LLSEGT,GQQNLI,VWVIEP,RALLPD,THLRPY,QQNLID,GLEVRA,DDPYRF,DPYRFG,NLIDDP,YRFGPL,LLPDAT,WHGQQN,EGTHLR,LSEGTH,WLLSEG,LEVRAL,PYRFGP,HGQQNL,ALLPDA,VRALLP,QNLIDD,EPKTGR,RYQLAV,EVRALL,PKTGRK,GTHLRP,FRYQLA,LIDDPY,HLRPYE
CBM50 4 NP_414747.1 5.18 15 IAKRHG,RVRKGD,RHGVNI,SLSSIA,RKGDSL,SIAKRH,GDSLSS,YRVRKG,KRHGVN,KGDSLS,AKRHGV,DVMRWN,DSLSSI,VMRWNS,VRKGDS
CBM50 5 NP_417151.1 3.07 12 ANKPML,PGQVLR,KIYPGQ,IYPGQV,QVLRIP,GQVLRI,FEANKP,PDKIYP,DKIYPG,IFEANK,YPGQVL,EANKPM
CBM50 9 NP_416370.2 12.3 29 KIGQQL,VVSTGD,WQMDFR,STGDTL,QMDFRK,RETRTY,MDFRKL,RNLKIG,TLSSIL,NLKIGQ,QLSWTL,LNQYGI,DTLSSI,SRRETR,LKIGQQ,YVVSTG,TGDTLS,GQQLSW,QQLSWT,VSTGDT,IGQQLS,QYGIDM,NQYGID,RRETRT,GDTLSS,TRTYDR,ETRTYD,LRNLKI,QWQMDF
CBM50 34 NP_417222.1 3.11 8 VYAGNA,KGIDIA,GNALRG,NKGIDI,GIDIAG,YAGNAL,GNKGID,GGNKGI
CBM50 83 NP_415631.1 25.6 30 VYPIGI,GIGQLG,ELRLYY,GSVLTI,AELRLY,YPIGIG,LEAIAK,GFLALL,LALLQA,SVLTIP,AKKYNV,LLQANP,ALLQAN,AIAKKY,YNVGFL,GSLEAI,LRLYYY,FLALLQ,LQANPG,KKYNVG,IAKKYN,SLEAIA,PIGIGQ,NVGFLA,VTVYPI,DPYVPR,TVYPIG,VGFLAL,GGSLEA,IGIGQL
For example, protein NP_417377.1 was annotated as GH1 PPR1. Open the file CAZY_PPR_patterns\GH\GH1\GH1_group_ec.txt; the first row gives the EC numbers 3.2.1.86 and 3.2.1.21.
Hi there, so this is the output that dbCAN gives for Hotpep, I assume? Ah, I think it may be that the standalone Windows Hotpep provides EC numbers directly and the Python version does not? But the EC numbers can be found within the CAZY family functions directory output.
This output is from the standalone Hotpep Python script. run_dbcan just calls the Hotpep Python script and takes its result from there. You can try our code (run_dbcan.py) with the example data (EscheriaColiK12MG1655.fna) and you will get exactly the same result.
Try this command
run_dbcan.py EscheriaColiK12MG1655.fna meta --out_dir output_EscheriaColiK12MG1655
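For anyone who wants to work with rows like the ones pasted above, here is a minimal Python sketch of reading one row. The field meanings (family, PPR group, protein ID, frequency, hits, peptide list) are inferred from the pasted rows and from the "Frequency > 2.6, Hits > 6" label on the website download link, so treat them as assumptions. The names `HotpepHit`, `parse_hotpep_line`, and `passes_thresholds` are made up for illustration and are not part of run_dbcan.

```python
from typing import List, NamedTuple


class HotpepHit(NamedTuple):
    """One row of Hotpep output (field meanings are assumed, see above)."""
    family: str
    ppr_group: int
    protein_id: str
    frequency: float
    hits: int
    peptides: List[str]


def parse_hotpep_line(line: str) -> HotpepHit:
    # Split into at most six whitespace-separated fields; the last field
    # is the comma-separated list of conserved peptides.
    family, group, protein_id, freq, hits, peptides = line.split(None, 5)
    return HotpepHit(
        family=family,
        ppr_group=int(group),
        protein_id=protein_id,
        frequency=float(freq),
        hits=int(hits),
        peptides=peptides.strip().split(","),
    )


def passes_thresholds(hit: HotpepHit,
                      min_frequency: float = 2.6,
                      min_hits: int = 6) -> bool:
    # Defaults mirror the "Frequency > 2.6, Hits > 6" label on the website.
    return hit.frequency > min_frequency and hit.hits > min_hits


# A row copied from the output pasted earlier in this thread.
row = ("CBM50 34 NP_417222.1 3.11 8 "
       "VYAGNA,KGIDIA,GNALRG,NKGIDI,GIDIAG,YAGNAL,GNKGID,GGNKGI")
hit = parse_hotpep_line(row)
print(hit.family, hit.ppr_group, hit.protein_id, len(hit.peptides))
```

In the pasted rows the number in the fifth column matches the number of peptides listed, which is why the sketch treats that column as the hit count.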
Okay, so the difference comes from the Python vs. Windows scripts. It is a shame they are not consistent :( Thank you anyway.
Yes, the Python version of Hotpep only gives the PPR information, but the EC numbers of each PPR can be easily extracted and summarized with a few lines of code.
This is true, although it sometimes gives multiple EC numbers for a PPR. Still useful though!
Yes, when there are multiple EC numbers, each is followed by a colon and a number, which is the sum of the number of conserved peptides in each characterized protein in the PPR group. The higher this number, the more proteins in the PPR group have the enzymatic activity represented by the EC number. So just use the EC number with the largest count.
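A rough Python sketch of that rule, for a row that lists several EC numbers. It assumes only what is described above, namely that each EC number is followed by a colon and a count. The separator between entries and the example counts below are made up for illustration, and `parse_ec_counts` and `best_ec` are hypothetical names, not functions from run_dbcan or Hotpep.

```python
import re
from typing import List, Optional, Tuple

# An EC number (four dot-separated parts, "-" allowed for unassigned levels)
# followed by a colon and an integer count.
EC_ENTRY = re.compile(r"(\d+(?:\.(?:\d+|-)){3}):(\d+)")


def parse_ec_counts(row: str) -> List[Tuple[str, int]]:
    """Return every (EC number, count) pair found in one row."""
    return [(ec, int(count)) for ec, count in EC_ENTRY.findall(row)]


def best_ec(row: str) -> Optional[str]:
    """Pick the EC number with the largest count, or None if none is found."""
    counts = parse_ec_counts(row)
    if not counts:
        return None
    return max(counts, key=lambda pair: pair[1])[0]


# Hypothetical row: the EC numbers are the ones mentioned in this thread,
# but the counts and the space separator are invented for the example.
example_row = "3.2.1.86:12 3.2.1.21:5"
print(best_ec(example_row))
```

Because the regular expression looks only for the `EC:count` pattern, it should not matter whether the real file separates entries with spaces, tabs, or commas. If two EC numbers tie, `max` returns the one listed first.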
Ah okay, that makes sense. Thank you very much for your help.
You're welcome :)
@Li-Dongyao-ancore Thank you, this problem has been solved in V2.0.5 (revised add() to update()).
@Lamm-a I understand your question now. This is a new requirement that was not needed for our paper. If you really want us to add this information, just tell us; there is no need for offensive words.
Dr. Yin told you that the EC numbers can be extracted from the PPR. Or you can wait for the next release of dbCAN, because we will not use the EC numbers predicted by Hotpep in our next version. Yes, this is not useful for our current version, nor for the future version.
For your second question, I now understand that you ask for something like figure 5 (col Functions) of https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-017-1625-9. That information can be easily extracted from PPR. We will add that function in our next release of dbCAN.
Again, thanks to @Li-Dongyao-ancore for the suggestion and the patient answer.
It would be awesome to have it in the next dbCAN release :) It was never my intention to offend; I was just curious why there seemed to be a difference in information. But as @Li-Dongyao-ancore has helped show, the difference is between the Python and Windows versions of Hotpep.
@Lamm-a If you're really interested in the prediction of EC numbers for protein sequences, you can take a look at the Ensemble Enzyme Prediction Pipeline (E2P2, https://gitlab.com/rhee-lab/E2P2/tree/master).
@Li-Dongyao-ancore Oh this looks very cool! Do you know how well it performs when looking for CAZYs?
@Lamm-a I'm not very familiar with it, but E2P2 was used in the dbCAN-seq database to predict enzyme EC numbers (for example, http://bcb.unl.edu/dbCAN_seq/sequence.php?genome=GCF_000242595.2&fam=GH13&pro=WP_014455184.1, the Enzyme Prediction tag). So I think it would work well for CAZymes.
Welcome to try eCAMI, a new tool for EC prediction:
On Thursday, February 20, 2020, Lamm-a wrote:
@Li-Dongyao-ancore Oh this looks very cool! Do you know how well it performs when looking for CAZYs?
-- Yanbin Yin, PhD Associate Professor Computational Biologist Department of Food Science and Technology Nebraska Food for Health Center Quantitative Life Sciences Initiative University of Nebraska-Lincoln Office: Food Innovation Center 253 Lab: FIC 208/317 Tel: 402-472-4303 Email: yyin@unl.edu Web: https://foodsci.unl.edu/yin; http://bcb.unl.edu
The output information a dbCAN run gives you for Hotpep and HMMER is less than what a solo run of both would give you. Why is this the case?