Open akashbahai opened 1 year ago
Did you solve this issue? Even with a 40 GB NVIDIA GPU, the problem persists.
same issue
Hi, @ChrisLou-bioinfo How long is the sequence that you are trying to predict? Is it working for shorter sequences?
@GanQiao1990 Are you able to run it now? I could get it working for shorter sequences, but it didn't work for longer ones (>1000 residues).
The example sequences work.
RNA (1485 nt), protein (348 AA)
I attempted to set the max_split_size_mb parameter to 24, but it appears to be ineffective.
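For reference, `max_split_size_mb` is read from the `PYTORCH_CUDA_ALLOC_CONF` environment variable, and it must be set before PyTorch makes its first CUDA allocation. A minimal sketch (the 128 MB value is only an illustrative choice, not a recommendation from the authors):

```python
import os

# Must be set before torch initializes the CUDA caching allocator,
# e.g. at the very top of predict.py or in the shell before launching.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# import torch  # import torch only after the variable is set
```

Note that this only mitigates fragmentation; it cannot help when a single requested block (here 4.01 GiB) is larger than the free memory.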
I am trying to use cpu.
Yep, I could run it with a shorter sequence, but it runs out of memory with the long sequence.
I am running it on CPU without errors; it may just take longer. I hope the authors can add an automatic fallback to CPU after an OOM.
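The OOM-to-CPU fallback isn't built into the tool, but a generic wrapper is easy to sketch; `run_fn` here is a hypothetical callable standing in for the prediction step, not part of the RoseTTAFold2NA API:

```python
def run_with_cpu_fallback(run_fn, *args, **kwargs):
    """Try the GPU first; rerun on CPU if CUDA runs out of memory.

    `run_fn` is a hypothetical callable accepting a `device` keyword.
    """
    try:
        return run_fn(*args, device="cuda", **kwargs)
    except RuntimeError as err:
        # PyTorch CUDA OOM surfaces as a RuntimeError whose message
        # contains "out of memory"; re-raise anything else.
        if "out of memory" not in str(err):
            raise
        return run_fn(*args, device="cpu", **kwargs)
```

One caveat: rerunning on CPU after an OOM only works if the partially allocated GPU state is released first, so in practice the retry may need to clear caches or restart the process.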
@GanQiao1990 In that case, you'll probably need a GPU with larger memory.
@ChrisLou-bioinfo I don't think there's a way for the method to know beforehand which mode it should use. It uses the default mode unless you specify CPU/GPU explicitly. You could select the mode up front based on the length of the prediction sequence, e.g. with an if condition.
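That if condition could be as simple as thresholding on total complex length; `pick_device` and the 1000-residue cutoff below are hypothetical, chosen only to match the lengths reported in this thread:

```python
def pick_device(total_len, gpu_available=True, max_gpu_len=1000):
    """Route long complexes to CPU, short ones to GPU.

    `max_gpu_len` would need tuning to the VRAM actually available.
    """
    if not gpu_available or total_len > max_gpu_len:
        return "cpu"
    return "cuda"
```

For example, `pick_device(1485 + 348)` returns `"cpu"` for the RNA-protein complex above, while the 348-residue protein alone stays on the GPU.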
OK, I even tried on an A100 with 80 GB VRAM; it does not work for the ~5k nt RNA and ~500 AA protein. If we have to chunk the sequences, do we also have to redo the sequence prep?
Hi, thanks for creating this tool and providing the code. I can run the prediction on the example sequences, but when I try my own use case, I run into CUDA out-of-memory errors.
Here's the entire error message:
```
Traceback (most recent call last):
  File "/home/akash.bahai/RoseTTAFold2NA/network/predict.py", line 346, in <module>
    pred.predict(inputs=args.inputs, out_prefix=args.prefix, ffdb=ffdb)
  File "/home/akash.bahai/RoseTTAFold2NA/network/predict.py", line 226, in predict
    self._run_model(Ls, msa_orig, ins_orig, t1d, t2d, xyz_t, xyz_t[:,0], alpha_t, "%s_%02d"%(out_prefix, i_trial))
  File "/home/akash.bahai/RoseTTAFold2NA/network/predict.py", line 270, in _run_model
    logit_s, logit_aa_s, logit_pae, init_crds, alpha_prev, _, pred_lddt_binned, msa_prev, pair_prev, state_prev = self.model(
  File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/akash.bahai/RoseTTAFold2NA/network/RoseTTAFoldModel.py", line 93, in forward
    pair, state = self.templ_emb(t1d, t2d, alpha_t, xyz_t, pair, state, use_checkpoint=use_checkpoint)
  File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/akash.bahai/RoseTTAFold2NA/network/Embeddings.py", line 190, in forward
    templ = self.templ_stack(templ, xyz_t, use_checkpoint=use_checkpoint) # (B, T, L, L, d_templ)
  File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/akash.bahai/RoseTTAFold2NA/network/Embeddings.py", line 132, in forward
    templ = self.block[i_block](templ, rbf_feat)
  File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/akash.bahai/RoseTTAFold2NA/network/Track_module.py", line 95, in forward
    pair = pair + self.drop_row(self.row_attn(pair, rbf_feat))
  File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/akash.bahai/RoseTTAFold2NA/network/Attention_module.py", line 453, in forward
    pair = self.norm_pair(pair)
  File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/normalization.py", line 189, in forward
    return F.layer_norm(
  File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/functional.py", line 2503, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: CUDA out of memory. Tried to allocate 4.01 GiB (GPU 0; 31.75 GiB total capacity; 22.47 GiB already allocated; 3.67 GiB free; 27.22 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
This error happens in the last step of the pipeline, i.e. the end-to-end prediction. I am using a V100 with 32 GB of memory. The RNA is ~1300 nt and the protein is ~700 amino acids. Can you please estimate the GPU memory required for such a use case?
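For a rough feel of the scaling: the pair representation grows as O(L²) in the total complex length L, so a back-of-envelope estimate for a single float32 pair tensor (the pair-feature dimension of 128 below is an assumption for illustration, not a number taken from the code) is:

```python
def pair_tensor_gib(L, d_pair=128, bytes_per_elem=4):
    # One (L, L, d_pair) float32 tensor. The network holds several such
    # tensors plus intermediate activations, so the real footprint is a
    # multiple of this.
    return L * L * d_pair * bytes_per_elem / 2**30

# ~1300 nt RNA + ~700 aa protein -> L ~= 2000
print(round(pair_tensor_gib(2000), 2))  # ~1.91 GiB per pair tensor
```

Doubling L quadruples this number, which is why the ~5k nt cases above fail even on an 80 GB A100.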