sacdallago / bio_embeddings

Get protein embeddings from protein sequences
http://docs.bioembeddings.com
MIT License
463 stars 65 forks

bio_embeddings gives me an "Invalid fasta sequence" error, but I think the sequences in my FASTA file are correct. How can I solve this error? (I use '>' in my FASTA file. Please give me some guidance, dear Dallago Sir.) #152

Open faruk17035 opened 3 years ago

faruk17035 commented 3 years ago

image

These are the sequences in my FASTA file:

3VH0A
MHLRHLFSSRLRGSLLLGSLLVVSSFSTQAAEEMLRKAVGKGAYEMAYSQQENALWLATSQSRKLDKGGVVYRLDPVTLEVTQAIHNDLKPFGATINNTTQTLWFGNTVNSAVTAIDAKTGEVKGRLVLDDRKRTEEVRPLQPRELVADDATNTVYISGIGKESVIWVVDGGNIKLKTAIQNTGKMSTGLALDSEGKRLYTTNADGELITIDTADNKILSRKKLLDDGKEHFFINISLDTARQRAFITDSKAAEVLVVDTRNGNILAKVAAPESLAVLFNPARNEAYVTHRQAGKVSVIDAKSYKVVKTFDTPTHPNSLALSADGKTLYVSVKQKSTKQQEATQPDDVIRIAL

1OE6A
TESPADSFLKVELELNLKLSNLVFQDPVQYVYNPLVYAWAPHENYVQTYCKSKKEVLFLGMNPGPFGMAQTGVPFGEVNHVRDWLQIEGPVSKPEVEHPKRRIRGFECPQSEVSGARFWSLFKSLCGQPETFFKHCFVHNHCPLIFMNHSGKNLTPTDLPKAQRDTLLEICDEALCQAVRVLGVKLVIGVGRFSEQRARKALMAEGIDVTVKGIMHPSPRNPQANKGWEGIVRGQLLELGVLSLLTG

2ETWA
GPLGSMNEMENTDPVLQDDLVSKYERELSTEQEEDTPVILTQLNEDGTTSNYFDKRKLKIAPRSTLQFKVGPPFELVRDYCPVVESHTGRTLDLRIIPRIDRGFDHIDEEWVGYKRNYFTLVSTFETANCDLDTFLKSSFDLLVEDSSVEGRLRVQYFAIKIKAKNDDDDTEINLVQHTAKRDKGPQFCPSVCPLVPSPLPKHQTIREASNVRNITKMKKYDSTFYLHRDHVNYEEYGVDSLLFSYPEDSIQKVARYERVQFASSISVKKPSQQNKHFSLHVILGAVVDPDTFHGENPGIPYDELALKNGSKGMFVYLQEMKTPPLIIRGRSPSNYASSQRITVR

1TN9A
EKRRDNRGRILKTGESQRKDGRYLYKYIDSFGEPQFVYSWKLVATDRVPAGKRDAISLREKIAELQKDI

Config.yml file is:

    global:
      sequences_file: deeploc_data.fasta
      prefix: simple
    bert_embeddings:
      type: embed
      protocol: word2vec
      reduce: True
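The thread does not record what actually caused or fixed the error. As a rough aid for anyone hitting the same message, here is a minimal sketch of a standalone check for a FASTA file before it is passed to the pipeline. The function name `check_fasta` and the set of accepted residue letters are assumptions made for illustration; they are not part of bio_embeddings and may not match the validation it performs.

```python
# Hypothetical helper for illustration only; not part of bio_embeddings.
# Assumption: sequences use the standard amino-acid letters plus a few
# common ambiguity/extension codes. The library may accept a different set.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYXBZUO")


def check_fasta(text):
    """Return a list of human-readable problems found in FASTA text."""
    problems = []
    header = None   # name of the record currently being read
    seq_len = 0     # residues seen so far for that record

    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are ignored

        if line.startswith(">"):
            # A new record starts; the previous one must have had residues.
            if header is not None and seq_len == 0:
                problems.append(f"record '{header}' has no sequence")
            header = line[1:].strip()
            if not header:
                problems.append(f"line {number}: empty header")
            seq_len = 0
        else:
            if header is None:
                problems.append(
                    f"line {number}: sequence data before any '>' header"
                )
            # Spaces, digits, and other symbols in a sequence line are flagged.
            bad = sorted(set(line.upper()) - VALID_RESIDUES)
            if bad:
                problems.append(f"line {number}: unexpected characters {bad}")
            seq_len += len(line)

    if header is not None and seq_len == 0:
        problems.append(f"record '{header}' has no sequence")
    return problems


if __name__ == "__main__":
    # Assumes the file named in the config above is in the working directory.
    with open("deeploc_data.fasta") as handle:
        for problem in check_fasta(handle.read()):
            print(problem)
```

An empty result means every record has a `>` header on its own line followed by at least one line containing only the assumed residue letters. A line that mixes an identifier and a sequence, separated by a space, would be reported.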

faruk17035 commented 3 years ago

I solved this error, dear Sir. Thanks a lot for helping me. I am very grateful to you.