A sequence of length 4700 on a 16 GB GPU simply won't fit, I'm afraid. We're looking into more memory-efficient versions of ESMFold, but the timeline for release is unclear.
Where are the main memory bottlenecks coming from? Is it just the O(N^2) self-attention in the transformers? I am trying to run some of the smaller ESM-2 models on larger proteins, and I am running out of memory on a 32 GB GPU.
For the LM, yes, it's the O(N^2) self-attention. For ESMFold there's also the O(N^3) cost in the axial attention, but that computation can be chopped into independent chunks to circumvent it; see model.set_chunk_size(128) and the instructions in the front-page README.
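For anyone landing here, a minimal sketch of that chunked setup, following the usage shown in the front-page README of facebookresearch/esm; the sequence is a placeholder and the chunk size should be tuned to your GPU:

```python
import torch
import esm

# Load ESMFold and move it to the GPU for inference.
model = esm.pretrained.esmfold_v1()
model = model.eval().cuda()

# Chunk the axial attention so its O(N^3) memory cost is paid in
# independent pieces; smaller chunks save memory at the cost of speed.
model.set_chunk_size(128)

sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAP"  # placeholder

with torch.no_grad():
    pdb_str = model.infer_pdb(sequence)

with open("result.pdb", "w") as f:
    f.write(pdb_str)
```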
How should I deal with this? Can I use 2 GPUs?
Hi, I'm facing the same issue for a sequence of length 2180, using 2x NVIDIA L4 GPUs (48 GB). I have already tried multiple chunk sizes but am still getting out-of-memory errors. Is there any other way, or will this GPU size simply not suffice for this sequence length?
Reference code:

```python
import torch
from transformers import AutoTokenizer, EsmForProteinFolding

tokenizer = AutoTokenizer.from_pretrained("facebook/esmfold_v1")
model = EsmForProteinFolding.from_pretrained("facebook/esmfold_v1")
model = model.cuda()

# Run the ESM-2 language-model stem in fp16 and allow TF32 matmuls.
model.esm = model.esm.half()
torch.backends.cuda.matmul.allow_tf32 = True

# Chunk the axial attention in the folding trunk to reduce peak memory.
model.trunk.set_chunk_size(8)

tokenized_input = tokenized_input.cuda()

model.eval()
with torch.no_grad():
    output = model(tokenized_input)
```

Can you please help with this?
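One note on the snippet above: tokenized_input is used before it is defined. Per the Hugging Face ESMFold docs, it would be built roughly like this (the sequence here is a placeholder):

```python
# Placeholder sequence; substitute the real 2180-residue protein.
sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAP"

# ESMFold takes raw token ids without BOS/EOS special tokens.
tokenized_input = tokenizer(
    [sequence], return_tensors="pt", add_special_tokens=False
)["input_ids"]
```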
When I try to run the ESM model with large sequences (> 4700), I get this error:
```
RuntimeError: CUDA out of memory. Tried to allocate 746.00 MiB (GPU 0; 15.78 GiB total capacity; 12.68 GiB already allocated; 718.75 MiB free; 13.63 GiB reserved in total by PyTorch)
```
I have tried setting the chunk size all the way down to 1 with no improvement. I'm wondering if there are any other ways to reduce memory usage for large sequences.
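For completeness, a hedged sketch of the chunk-size sweep described above, assuming the fairseq esm API; the sequence is a placeholder, and catching RuntimeError (rather than torch.cuda.OutOfMemoryError) keeps it compatible with older PyTorch versions:

```python
import torch
import esm

# Assumed setup: the fairseq esm package with ESMFold weights available.
model = esm.pretrained.esmfold_v1().eval().cuda()
sequence = "M" * 4700  # placeholder; substitute the real sequence

# Try progressively smaller axial-attention chunks until one fits in memory.
for chunk_size in (128, 64, 32, 16, 8, 4, 2, 1):
    model.set_chunk_size(chunk_size)
    try:
        with torch.no_grad():
            pdb_str = model.infer_pdb(sequence)
        print(f"chunk_size={chunk_size} fit")
        break
    except RuntimeError as err:
        if "out of memory" not in str(err):
            raise
        # Release the partially allocated tensors before retrying.
        torch.cuda.empty_cache()
        print(f"chunk_size={chunk_size} OOM")
else:
    print("even chunk_size=1 does not fit on this GPU")
```

If even chunk_size=1 still runs out of memory, the peak is likely dominated by allocations that chunking cannot shrink, such as the O(N^2) pairwise representation, so a larger GPU or the more memory-efficient ESMFold variants mentioned above would be the remaining options.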