sacdallago / bio_embeddings

Get protein embeddings from protein sequences
http://docs.bioembeddings.com
MIT License
460 stars 65 forks

Add support for ESM-2 and ESMFold #218

Open fedorn opened 1 year ago

yxnyu commented 1 year ago

How do I use bio_embeddings.embed.ESM1bEmbedder() if I just want to get the trained amino acid vectors?

yxnyu commented 1 year ago

I tried installing it several times, but when I run

from bio_embeddings.embed import ESM1bEmbedder

embedder = ESM1bEmbedder()

embedding = embedder.embed("SEQVENCE")

it fails with this error: unexpected EOF, expected 4802126 more bytes. The file might be corrupted.

9las commented 1 year ago

You need to delete the file ~/.cache/bio_embeddings/esm1b/model_file and try again. Be patient, because it takes a long time to run. If you cancel the job, you will get the same error the next time you try to run it, since the model_file has only been partially created.
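
If you would rather do the cleanup from Python, something like the sketch below should work. It only assumes the cache path mentioned above; the helper name remove_cached_model is made up for illustration and is not part of bio_embeddings.

```python
from pathlib import Path

# Path taken from the comment above; adjust it if your cache lives elsewhere.
DEFAULT_MODEL_FILE = Path.home() / ".cache" / "bio_embeddings" / "esm1b" / "model_file"


def remove_cached_model(model_file: Path = DEFAULT_MODEL_FILE) -> bool:
    """Delete a (possibly truncated) cached model file.

    Hypothetical helper, not a bio_embeddings function.
    Returns True if a file was removed, False if there was nothing to remove.
    """
    model_file = Path(model_file).expanduser()
    if model_file.is_file():
        model_file.unlink()
        return True
    return False


if __name__ == "__main__":
    removed = remove_cached_model()
    print("removed" if removed else "nothing to remove:", DEFAULT_MODEL_FILE)
```

After running it, create the embedder again and let it finish without interrupting it.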