Closed glorioushonor closed 8 months ago
In my conda env, PyTorch-Lightning==2.1.0.
As for the GPUs, I use two Quadro RTX 5000 cards with 16 GB of memory each (32 GB total).
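Pinning the exact version, e.g. in a `requirements.txt`, makes this environment easy to reproduce (a minimal fragment; the repo's full dependency list will contain more packages):

```
pytorch-lightning==2.1.0
```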
That's indeed strange. The combined memory of two RTX 3090 GPUs is 2 × 24 GB, yet I could only set the batch size to 2. Did you perhaps optimize the code further in later updates?
The training was conducted either on my local workstation with an RTX 5000 or on an MPI cluster with an A100. The batch size of 4 was meant for cluster training; use a smaller one on a local workstation.
In any case, the batch size does not affect the final performance much. You can also train IF-Net++ at a smaller resolution (256 → 128) without significantly hurting the final performance, and since the network is fully convolutional, an IF-Net++ trained at 128 resolution can be applied directly to 256^3 volumetric shapes.
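The resolution-independence claimed above can be sketched in a few lines. This is not the IF-Net++ code, just a toy fully convolutional 3D network (and much smaller grids than 128^3/256^3, for speed) showing that the same weights apply to any input resolution:

```python
# A fully convolutional 3D network has no resolution-dependent layers
# (no Linear/Flatten), so weights trained on one grid size transfer
# directly to another grid size.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv3d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv3d(8, 1, kernel_size=3, padding=1),
)

small = torch.randn(1, 1, 32, 32, 32)  # stands in for a 128^3 training grid
large = torch.randn(1, 1, 64, 64, 64)  # stands in for a 256^3 test grid

print(net(small).shape)  # torch.Size([1, 1, 32, 32, 32])
print(net(large).shape)  # torch.Size([1, 1, 64, 64, 64])
```

The output grid tracks the input grid, which is exactly why a model trained at 128 resolution can be evaluated on 256^3 volumes.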
Hello, I have attempted to retrain ECON_IF and have a few questions:

1. Approximately how much GPU memory is needed to train the model? I used two RTX 3090 GPUs, but even with the batch size reduced to 4 I still got an out-of-memory error. Only after reducing it further to 2 did training start, and barely at that.
2. While getting the program to run, I noticed that an incorrect pytorch_lightning version leads to numerous errors. I resolved the issues on my own, but I suggest you document the version you used to make future replication easier.

*The following is the out-of-memory error message.
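If batch size 4 does not fit in memory, gradient accumulation is a common workaround: two micro-batches of size 2 give an effective batch size of 4 at the memory cost of batch size 2. A hedged sketch in plain PyTorch (not the ECON training code; the toy model below is a stand-in):

```python
# Gradient accumulation: gradients from successive backward() calls add up,
# so stepping the optimizer once per N micro-batches emulates an N-times
# larger batch without the extra GPU memory.
import torch
import torch.nn as nn

model = nn.Linear(16, 1)  # toy model standing in for the real network
opt = torch.optim.SGD(model.parameters(), lr=0.01)
micro_batches = [torch.randn(2, 16) for _ in range(2)]  # 2 x size-2 = effective 4

opt.zero_grad()
for x in micro_batches:
    # divide so the accumulated gradient matches the mean over the full batch
    loss = model(x).pow(2).mean() / len(micro_batches)
    loss.backward()  # gradients accumulate across micro-batches
opt.step()           # one update with the effective batch-size-4 gradient
```

PyTorch Lightning exposes the same idea via the Trainer's `accumulate_grad_batches` argument.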