mortonjt opened this issue 1 year ago
Hi Jamie,
thanks for reaching out! I wanted to try this before answering, but it obviously took me way too long. I already gave it a shot a few weeks ago but failed to reach any significant speed-up; maybe I did something wrong (I used it for translation with the new ProstT5 model).
Have you had positive experiences with this on protein language models?
Sorry to hear about that. I haven't tried this out yet for protein LLMs (only tested it on stable-diffusion), but it is on my radar. I'm hoping it could be useful for inference and speed up the embedding calculations (which we're noticing are a bottleneck for protein annotation).
Hm, how many proteins are you trying to label? From my experience, the ProtT5-XL-U50 encoder-only model in half precision, using batching as described here, reaches around 0.1 s/protein on average for the ~20k proteins of the human proteome (so around 30 min for human).
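For reference, here is a minimal sketch of that setup (fp16 encoder-only model plus simple batching). The checkpoint name, batch size, and mean-pooling are illustrative assumptions; the embedding script linked above is the authoritative version:

```python
# Minimal sketch: batched embedding extraction with the fp16 ProtT5 encoder.
# Checkpoint name, batch size, and pooling are illustrative, not prescriptive.
import re
import torch
from transformers import T5EncoderModel, T5Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
ckpt = "Rostlab/prot_t5_xl_half_uniref50-enc"  # assumed fp16 encoder-only checkpoint
tokenizer = T5Tokenizer.from_pretrained(ckpt, do_lower_case=False)
model = T5EncoderModel.from_pretrained(ckpt, torch_dtype=torch.float16).to(device).eval()

def embed(seqs, batch_size=16):
    """Return one mean-pooled embedding per protein sequence."""
    embeddings = []
    for i in range(0, len(seqs), batch_size):
        batch = seqs[i:i + batch_size]
        # ProtT5 expects space-separated residues; map rare amino acids to X
        spaced = [" ".join(re.sub(r"[UZOB]", "X", s)) for s in batch]
        enc = tokenizer(spaced, padding=True, return_tensors="pt").to(device)
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state       # (B, L_max, 1024)
        for j, s in enumerate(batch):
            per_residue = hidden[j, :len(s)]              # drop padding and </s>
            embeddings.append(per_residue.mean(dim=0).float().cpu())
    return embeddings
```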
I had a brief look and I stopped once I hit the following error: `AttributeError: 'FeatureExtractionPipeline' object has no attribute 'enable_xformers_memory_efficient_attention'`
(tried to extract embeddings from the ProtT5-XL-U50-fp16 model from my link in the post above).
So I'm not sure whether it is as plug-and-play as I had hoped. In case you find an example/tutorial that shows how this should be done for plain Transformers models (no diffusion etc.), please send it my way and I can give it a try. So far I have only found tutorials on how to use this with diffusion models in Hugging Face (but most likely I just missed the right source).
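For what it's worth, `enable_xformers_memory_efficient_attention()` is a method that diffusers adds to its pipelines, which is why the transformers `FeatureExtractionPipeline` doesn't have it. A minimal sketch of what that method ultimately dispatches to, i.e. calling the xformers kernel directly (toy shapes, nothing ProtT5-specific):

```python
# Sketch of calling the xformers memory-efficient attention kernel directly.
# Shapes are illustrative; requires a CUDA GPU with xformers installed.
import torch
import xformers.ops as xops

B, L, H, D = 4, 512, 16, 64          # batch, seq length, heads, head dim
q = torch.randn(B, L, H, D, device="cuda", dtype=torch.float16)
k = torch.randn(B, L, H, D, device="cuda", dtype=torch.float16)
v = torch.randn(B, L, H, D, device="cuda", dtype=torch.float16)

# Equivalent to softmax(q @ k^T / sqrt(D)) @ v, but without materializing the
# full (L x L) attention matrix, which is where the memory/speed gains come from.
out = xops.memory_efficient_attention(q, k, v)   # (B, L, H, D)
```

To get this into a plain transformers T5 encoder one would have to patch the attention module's forward to route through this kernel, which is probably why there is no one-liner for non-diffusion models.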
Regarding examples, I first saw xformers being used in https://github.com/Stability-AI/stablediffusion — so yes I only saw this used in diffusion models
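For comparison, in the diffusers library (as opposed to the Stability-AI repo, which wires xformers into its attention modules directly) it is exposed as a one-liner on the pipeline object; that is where the method name from the error above comes from. The checkpoint below is just an example:

```python
# How xformers is typically enabled for diffusion models via diffusers.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",   # example checkpoint
    torch_dtype=torch.float16,
).to("cuda")
pipe.enable_xformers_memory_efficient_attention()  # swaps in the xformers attention kernel
```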
We were trying to embed all of UniRef at one point, but had to resort to just a subset. We're trying to embed proteins in microbial metagenomes, and those reference databases are often >50M proteins.
Yeah, I see your point. We also ran UniRef50 at one point, but only to make predictions, not for embedding extraction (especially as storing those embeddings gets expensive quickly). The only things I can recommend (probably obvious, but still):
Just in case you weren't familiar with this, there is an xformers library that can allow for a >4x speed-up on transformer attention operations: https://github.com/facebookresearch/xformers
Could be low-hanging fruit to speed up the operations in this library :)
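A quick way to sanity-check that speed-up claim before touching any model code is to micro-benchmark the xformers kernel against naive attention on protein-sized inputs. Sizes below are illustrative, and it assumes a CUDA GPU with xformers installed:

```python
# Micro-benchmark sketch: naive attention vs. xformers memory-efficient attention.
import time
import torch
import xformers.ops as xops

B, L, H, D = 8, 1024, 16, 64
q, k, v = (torch.randn(B, L, H, D, device="cuda", dtype=torch.float16) for _ in range(3))

def naive(q, k, v):
    # Plain matmul attention in (B, H, L, D) layout
    q_, k_, v_ = (t.transpose(1, 2) for t in (q, k, v))
    attn = torch.softmax(q_ @ k_.transpose(-1, -2) / D ** 0.5, dim=-1)
    return (attn @ v_).transpose(1, 2)

def timeit(fn, iters=20):
    torch.cuda.synchronize()
    t0 = time.time()
    for _ in range(iters):
        fn(q, k, v)
    torch.cuda.synchronize()
    return (time.time() - t0) / iters

print("naive   :", timeit(naive))
print("xformers:", timeit(xops.memory_efficient_attention))
```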