agemagician / ProtTrans

ProtTrans provides state-of-the-art pretrained language models for proteins. ProtTrans was trained on thousands of GPUs from Summit and hundreds of Google TPUs using Transformer models.
Academic Free License v3.0

Speed up via xformers #120

Open mortonjt opened 1 year ago

mortonjt commented 1 year ago

Just in case you weren't familiar with this: there is an xformers library that can provide a >4x speed-up on transformer operations: https://github.com/facebookresearch/xformers
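A minimal sketch of the xformers attention call being suggested here (not ProtTrans code; it assumes xformers is installed with CUDA support and uses toy tensors):

```python
# Sketch of xformers' memory-efficient attention as a drop-in replacement for the
# quadratic softmax(QK^T)V computation. Shapes are [batch, seq_len, n_heads, head_dim].
import torch
import xformers.ops as xops

q = torch.randn(2, 512, 16, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 512, 16, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 512, 16, 64, device="cuda", dtype=torch.float16)

# Output has the same shape as q; the full attention matrix is never materialized.
out = xops.memory_efficient_attention(q, k, v)
```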

It could be low-hanging fruit for speeding up the operations in this library :)

mheinzinger commented 1 year ago

Hi Jamie,

thanks for reaching out! I wanted to try this before answering, but obviously it took me way too long. I already gave it a shot a few weeks ago but failed to reach a significant speed-up; maybe I did something wrong (I used it for translation with the new ProstT5 model).

Have you had positive experiences with this on any protein language models?

mortonjt commented 1 year ago

Sorry to hear that. I haven't tried this out yet for protein LLMs (I've only tested it on Stable Diffusion), but it is on my radar. I'm hoping it could be useful for inference and speed up the embedding calculations (which we're noticing are a bottleneck for protein annotation).

mheinzinger commented 1 year ago

Hm, how many proteins are you trying to label? From my experience, the ProtT5-XL-U50 encoder-only model in half precision, using batching as described here, reaches around 0.1 s/protein on average for the ~20k proteins of the human proteome (so around 30 minutes for human).
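For context, a condensed sketch of the batched, half-precision, encoder-only setup described above; the checkpoint name and preprocessing are assumptions based on the public ProtT5-XL-U50 fp16 encoder checkpoint, not copied from the linked example:

```python
# Sketch: extract per-protein embeddings with the fp16 encoder-only ProtT5-XL-U50 model.
import re
import torch
from transformers import T5EncoderModel, T5Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
ckpt = "Rostlab/prot_t5_xl_half_uniref50-enc"  # assumed fp16 encoder-only checkpoint
tokenizer = T5Tokenizer.from_pretrained(ckpt, do_lower_case=False)
model = T5EncoderModel.from_pretrained(
    ckpt, torch_dtype=torch.float16 if device == "cuda" else torch.float32
).to(device).eval()

seqs = ["MKTAYIAKQR", "GSHMSLFDFFK"]  # toy sequences; batch size depends on GPU memory
# ProtT5 expects space-separated residues, with rare amino acids mapped to X.
seqs = [" ".join(re.sub(r"[UZOB]", "X", s)) for s in seqs]

batch = tokenizer(seqs, padding=True, return_tensors="pt").to(device)
with torch.no_grad():
    emb = model(**batch).last_hidden_state  # [batch, max_len, 1024]

# Mean-pool over non-padding positions to get one vector per protein.
mask = batch["attention_mask"].unsqueeze(-1)
per_protein = (emb * mask).sum(dim=1) / mask.sum(dim=1)
```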

mheinzinger commented 1 year ago

I had a brief look and stopped once I hit the following error: AttributeError: 'FeatureExtractionPipeline' object has no attribute 'enable_xformers_memory_efficient_attention' (I tried to extract embeddings from the ProtT5-XL-U50-fp16 model from my link in the post above). So I'm not sure whether it is as plug-and-play as I had hoped. In case you find some example/tutorial that shows how this should be done for plain Transformers (no diffusion etc.), please send it my way and I can give it a try. So far, I have only found tutorials on how to use this with diffusion models in Hugging Face (but most likely I just missed the right source).
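For contrast, the pattern those tutorials cover applies to diffusers pipelines, which do expose this toggle; the sketch below (checkpoint chosen only for illustration) shows why the same call fails on a transformers FeatureExtractionPipeline:

```python
# Sketch only: diffusers pipelines provide enable_xformers_memory_efficient_attention(),
# while transformers' FeatureExtractionPipeline does not, hence the AttributeError above.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",  # example checkpoint, an assumption
    torch_dtype=torch.float16,
).to("cuda")
pipe.enable_xformers_memory_efficient_attention()  # exists on diffusers pipelines only
```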

mortonjt commented 1 year ago

Regarding examples, I first saw xformers being used in https://github.com/Stability-AI/stablediffusion, so yes, I have only seen it used in diffusion models.

We were trying to embed all of UniRef at one point, but had to resort to just a subset. We are trying to embed proteins in microbial metagenomes, and those reference databases often contain >50M proteins.

mheinzinger commented 1 year ago

Yeah, I see your point. We also ran UniRef50 at one point, but only to make predictions, not for embedding extraction (especially as storing those embeddings becomes expensive quickly). The only things I can recommend (probably obvious, but still):