OOM issue when working with gradient

Description

Hello, I encountered an issue when I wanted to train on your model. I simply remove the python decorator @torch.no_grad(), and this makes 2 problems:

The softmax function with the argument in_place=True is not differentiable. I fixed this problem by inputting in_place=False explicitly.
Out Of Memory. I tried to truncate the amino acid sequence, but it is always OOM until there are only 15 residues.

For more details, I am using a V100 32G GPU. Would you like to share how you solved these problems during training? Especially how many resources have been used to train OmegaFold, and what is the necessary GPU RAM to fit in the whole model?

To Reproduce

Remove the python decorator @torch.no_grad() in omegafold/__main__.py
Execute python main.py INPUT_FILE.fasta OUTPUT_DIRECTORY

HeliXonProtein / OmegaFold

OOM issue when working with gradient #48

Description

To Reproduce