Closed: hannanyi closed this issue 7 months ago
Hello! When I run train_pix2pix_turbo.py with a batch size of 2, GPU memory usage reaches about 34 GB on an A6000. Is that normal? Looking at your code, there are not many trainable parameters, so I don't understand why it uses so much GPU memory. Thank you!

Hi,
Thank you for your question. Even though the number of trainable parameters is low, the GPU memory requirement is high because we still need to compute gradients throughout the network, which means the intermediate activations of the frozen layers must be kept in memory for the backward pass. I hope this answers your question!
-Gaurav
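A minimal PyTorch sketch of this effect (hypothetical names, not code from this repo): a small trainable adapter placed before a frozen backbone still forces the backward pass through every frozen layer, so activation and gradient memory scale with the full network rather than with the trainable parameter count.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Small trainable layer early in the network (stand-in for LoRA-style adapters).
adapter = nn.Linear(4096, 4096).to(device)

# Deep frozen stack (stand-in for the pretrained backbone).
backbone = nn.Sequential(
    *[nn.Sequential(nn.Linear(4096, 4096), nn.ReLU()) for _ in range(12)]
).to(device)
for p in backbone.parameters():
    p.requires_grad_(False)

opt = torch.optim.Adam(adapter.parameters(), lr=1e-4)

x = torch.randn(2, 4096, device=device)  # batch size 2, as in the question
if device == "cuda":
    torch.cuda.reset_peak_memory_stats()

# Because the adapter output requires grad, every downstream op saves the
# tensors it needs for backward, even though the backbone itself is frozen.
loss = backbone(adapter(x)).sum()
loss.backward()
opt.step()

trainable = sum(p.numel() for p in adapter.parameters())
total = trainable + sum(p.numel() for p in backbone.parameters())
print(f"trainable: {trainable:,} of {total:,} parameters")
if device == "cuda":
    print(f"peak GPU memory: {torch.cuda.max_memory_allocated() / 2**30:.2f} GiB")
```

In a setup like this, reducing the batch size or enabling gradient checkpointing (e.g. `torch.utils.checkpoint`) trades extra compute for lower activation memory.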
Thank you for your answer, I understand now.