Open cpplyqz opened 1 week ago
I didn't encounter this issue with my RTX 4090
I also experienced a similar error
This is weird, since I used an A100 with 40 GB of GPU memory.
I see the GPUs people mention above (2080 Ti → 11 GB memory, RTX 4090 → 24 GB).
I think something uncertain is going on (either I misused my GPU, or there is a race-condition bug/mismatch).
OK, at least for me, using the a100_shared partition (which allocates, and charges me for, only the number of GPUs I actually request via sbatch) caused this memory issue.
Using the a100 partition (which allocates all 8 GPUs to the job) did not cause the GPU memory issue.
I confirmed this with repeated checks on different input files at different times.
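For reference, a hedged sketch of the two submission scripts being compared. The partition names (`a100_shared`, `a100`) come from this thread; the GPU counts, memory request, and script name are placeholders for whatever your cluster and job actually use:

```shell
#!/bin/bash
# Shared partition: only the GPUs you request are allocated to the job,
# so other jobs may be competing for memory on the same node.
#SBATCH --partition=a100_shared
#SBATCH --gres=gpu:1          # one A100 for this job
#SBATCH --mem=64G             # placeholder host-memory request

# Alternative (the variant that did NOT hit the memory issue):
# the full partition allocates the whole 8-GPU node to you.
##SBATCH --partition=a100
##SBATCH --gres=gpu:8

nvidia-smi                    # sanity-check visible GPUs and free memory
python predict_structure.py   # placeholder for the actual run command
```

Comparing `nvidia-smi` output at job start between the two partitions is a quick way to see whether another job is already holding memory on the shared node.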
Hello, I really appreciate chai_lab, which you and your team developed. I have tried the server and compared it with AF3, and now I want to deploy it locally. I installed it successfully, but there is a problem: my GPU memory is not enough, and I get `torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 100.00 MiB`. I have four 2080 Ti graphics cards with 11 GB of memory each. Does it support multi-GPU parallel processing? Can I modify the `cuda:` parameter in predict_structure.py to achieve this? Thanks!
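Not an answer on model parallelism, but a related workaround: if the immediate problem is that the process lands on a card that is already busy, you can at least steer it to the GPU with the most free memory before anything touches CUDA. This is a minimal standard-library sketch (it does not split one model across several GPUs); the `nvidia-smi` query flags are real, but the helper names are my own:

```python
import os
import subprocess

def pick_freest_gpu(csv_text: str) -> int:
    """Given the output of
    `nvidia-smi --query-gpu=index,memory.free --format=csv,noheader,nounits`,
    return the index of the GPU with the most free memory (in MiB)."""
    best_index, best_free = 0, -1
    for line in csv_text.strip().splitlines():
        index, free = (int(field) for field in line.split(","))
        if free > best_free:
            best_index, best_free = index, free
    return best_index

def select_gpu() -> None:
    # Query per-GPU free memory; requires the nvidia-smi CLI on PATH.
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,memory.free",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    # Restrict this process (PyTorch, chai_lab, ...) to the least-loaded
    # card. Must run BEFORE the first CUDA call to take effect.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(pick_freest_gpu(out))
```

With `CUDA_VISIBLE_DEVICES` set this way, `cuda:0` inside the script refers to the chosen card, so no code change in predict_structure.py is needed. It won't help if a single 2080 Ti simply cannot hold the model, though.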