Open WANG-CR opened 2 years ago
Hi,
This code is specialized for inference only. As you can see, we have tried hard to make it run under a moderate GRAM requirement, at least for inference. For training, we used a couple of hundred NVIDIA A100 GPUs with 80 GB of GRAM each, so it is indeed costly, and even then we still needed gradient rematerialization.
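The gradient rematerialization mentioned here (also called activation checkpointing) can be sketched with PyTorch's built-in `torch.utils.checkpoint`. This is a generic illustration, not OmegaFold's actual code; the `Block` module and sizes are made up:

```python
import torch
from torch.utils.checkpoint import checkpoint


class Block(torch.nn.Module):
    """A hypothetical transformer-style sub-block."""

    def __init__(self, dim: int):
        super().__init__()
        self.lin = torch.nn.Linear(dim, dim)

    def forward(self, x):
        return torch.relu(self.lin(x))


blocks = torch.nn.ModuleList(Block(64) for _ in range(4))
x = torch.randn(8, 64, requires_grad=True)

h = x
for blk in blocks:
    # Activations inside blk are NOT stored during the forward pass;
    # they are recomputed during backward, trading compute for memory.
    h = checkpoint(blk, h, use_reentrant=False)

h.sum().backward()  # gradients still flow to every block and to x
```

With many large blocks, this keeps peak activation memory roughly proportional to one block instead of the whole stack, which is why it helps fit training into limited GRAM.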
Description
Hello, I encountered an issue when I tried to train your model. I simply removed the Python decorator `@torch.no_grad()`, and this causes two problems: `in_place=True` is not differentiable. I fixed this problem by passing `in_place=False` explicitly. For more details, I am using a V100 32 GB GPU. Would you be willing to share how you solved these problems during training? In particular, how many resources were used to train OmegaFold, and how much GPU RAM is necessary to fit the whole model?
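For context on why `in_place=True` breaks training: autograd refuses to backpropagate through a tensor that a later in-place operation has modified, because the saved value is needed for the backward pass. A minimal generic PyTorch reproduction (not OmegaFold-specific):

```python
import torch

x = torch.ones(3, requires_grad=True)
y = x.sigmoid()   # sigmoid saves its output y for the backward pass
y.relu_()         # in-place op bumps y's version counter

try:
    y.sum().backward()
except RuntimeError as e:
    # "one of the variables needed for gradient computation has been
    # modified by an inplace operation"
    print("autograd error:", e)
```

Under `@torch.no_grad()` no backward graph is recorded, so in-place ops are harmless there; once the decorator is removed, the out-of-place variant is required.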
To Reproduce
Remove `@torch.no_grad()` in `omegafold/__main__.py`, then run:

```shell
python main.py INPUT_FILE.fasta OUTPUT_DIRECTORY
```