tianxingyzxq closed this issue 2 years ago
I get "RuntimeError: CUDA out of memory" after several steps. Have you tested it?
My torch version is 1.8.1 and the optimizer is AdamW. SGD has the same problem.
It is unlikely to be a memory leak, since I only use native PyTorch operators (I did not write any CUDA kernels for it). Besides, I am training my own model with it right now, and I observe no such problem.
It is also unlikely to be a problem with the optimizer. Could you please provide the detailed configuration of your platform and describe how to reproduce it?
Have you tried a smaller batch size, and did you observe memory usage increasing on every iteration?
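One way to check whether memory grows on every iteration is to record the allocated memory after each step and see if the readings keep rising. The following is only a minimal sketch, not code from this repo: `track_memory` and `grows_every_step` are hypothetical helper names, and the demo uses a plain Python list as a stand-in for GPU memory so it runs without a GPU. In a real run, you would presumably pass `torch.cuda.memory_allocated` as `read_memory` and one training iteration as `step_fn`.

```python
from typing import Callable, List


def track_memory(step_fn: Callable[[], None],
                 read_memory: Callable[[], int],
                 num_steps: int = 20,
                 warmup: int = 3) -> List[int]:
    """Run step_fn num_steps times and record read_memory() after each
    step once the warm-up steps are over (allocators usually need a few
    iterations to settle)."""
    readings = []
    for i in range(num_steps):
        step_fn()
        if i >= warmup:
            readings.append(read_memory())
    return readings


def grows_every_step(readings: List[int]) -> bool:
    """True if every reading is strictly larger than the previous one,
    which is the pattern a per-iteration leak would produce."""
    return len(readings) >= 2 and all(
        later > earlier for earlier, later in zip(readings, readings[1:])
    )


if __name__ == "__main__":
    # Simulated leak: each step keeps a reference alive, so "memory" grows.
    held = []
    leaky = track_memory(lambda: held.append(object()), lambda: len(held))
    print("leaky step grows every iteration:", grows_every_step(leaky))

    # Simulated healthy step: usage stays flat after warm-up.
    steady = track_memory(lambda: None, lambda: 1024)
    print("steady step grows every iteration:", grows_every_step(steady))

    # With PyTorch (assumption: a CUDA device and a train_one_step function):
    #   import torch
    #   readings = track_memory(train_one_step, torch.cuda.memory_allocated)
```

If the readings flatten out after warm-up, the out-of-memory error is more likely caused by peak usage (for example, batch size) than by a leak. If they rise on every step, something in the training script is probably holding references across iterations.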
I tried to reproduce it and found there is no memory leak at all. My original training script was wrong. Sorry to bother you.
partial_fc_amsoftmax memory leak @CoinCheung