minyoung90 opened this issue 3 years ago (status: Open)
Not sure why CUDA does not release the memory. For me, it works: when I kill a process, CUDA performs garbage collection and frees the memory. This appears to be an issue with your setup / CUDA.
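For anyone who wants to check this from inside a running process, here is a minimal sketch in plain PyTorch (nothing sentence-transformers specific): `memory_allocated` tracks live tensors, `memory_reserved` tracks what the caching allocator keeps around.

```python
import gc
import torch

# What the current process still holds on the GPU: allocated counts live
# tensors, reserved counts blocks the caching allocator keeps even after
# those tensors are freed.
print(f"allocated: {torch.cuda.memory_allocated() / 1e6:.1f} MB")
print(f"reserved:  {torch.cuda.memory_reserved() / 1e6:.1f} MB")

# Dropping unused Python references and clearing the cache hands the
# reserved blocks back to the driver, which is what nvidia-smi reports.
gc.collect()
torch.cuda.empty_cache()
print(f"reserved after empty_cache: {torch.cuda.memory_reserved() / 1e6:.1f} MB")
```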
Regarding the question: this is expected. AMP uses float16 instead of float32 where possible, so the corresponding tensors need only half the memory.
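For context, `use_amp=True` essentially wraps the forward pass in `torch.cuda.amp.autocast` with a `GradScaler` for the backward pass. A minimal sketch with a stand-in model and random data (not the actual training code of this repo):

```python
import torch
import torch.nn as nn

# Stand-in model, optimizer, and batch just to illustrate the AMP pattern.
model = nn.Linear(768, 2).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
scaler = torch.cuda.amp.GradScaler()

features = torch.randn(64, 768, device="cuda")
labels = torch.randint(0, 2, (64,), device="cuda")

optimizer.zero_grad()
with torch.cuda.amp.autocast():
    # Inside autocast most ops run in float16, which is where the
    # memory saving comes from.
    loss = nn.functional.cross_entropy(model(features), labels)
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```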
Thanks for the quick answer! Some more information: I ran my source code in a Docker container.
@minyoung90 Do you mean when you are using the GPU locally? What GPU do you have? I experience similar things with rather old GPUs.
@tide90 I used an AWS EC2 instance (g4dn.xlarge), which has a T4.
This is relevant for CPU as well: after model.encode(query), the memory was not cleared (see the sketch below).
This was in AWS SageMaker.
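If the goal is just to release memory after encoding, a sketch along these lines usually works (the model name is only a placeholder, and it assumes no other references to the model or the embeddings are kept):

```python
import gc
import torch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model
embeddings = model.encode(["example query"])

# The caching allocator only returns memory for tensors that are actually
# freed, so drop the Python references before clearing the cache.
del embeddings
del model
gc.collect()
if torch.cuda.is_available():
    torch.cuda.empty_cache()
```

Note that on CPU the freed memory goes back to the Python/C allocator rather than the OS, so tools like `top` may still show a high resident size even though the memory is reusable by the process.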
This happened to me once; my mistake was installing a newer PyTorch version inside SageMaker containers that already ship with a version that works fine.
It is better not to add "torch" as a dependency in that case.
I found that when I interrupted training (e.g. with Ctrl+Z), the GPU memory was not cleared, so whenever I restarted training it raised 'CUDA out of memory'. But when an exception occurred in the source code, the memory was cleared. In this case, should I free the memory by hand? (I ended up killing the Python process.)
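Ctrl+Z sends SIGTSTP, which only suspends the process; the suspended process still owns its CUDA context, so the memory stays allocated until that process is killed. An exception, by contrast, usually terminates the process, which releases everything. If you want to clean up by hand inside a long-lived process (e.g. a notebook), a rough sketch, assuming `model`, `train_dataloader`, and `train_loss` are already set up as in the usual training example:

```python
import gc
import torch

try:
    model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)
except KeyboardInterrupt:
    # Ctrl+C lands here; Ctrl+Z never reaches Python at all, it just
    # suspends the process, so no cleanup code can run.
    print("training interrupted")
finally:
    # Only needed when the same process keeps running afterwards; a
    # process that exits releases its GPU memory anyway.
    del model
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```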
One more question: I found that with the same batch size (e.g. 64), turning off mixed precision (use_amp=False) consumes more memory. That is expected, right?
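One way to check this on your own hardware is to compare the peak allocation of one forward/backward pass with and without autocast. A rough sketch with a stand-in model (the sizes are arbitrary):

```python
import torch
import torch.nn as nn

def peak_memory_mb(use_amp: bool) -> float:
    """Run one forward/backward pass and report peak GPU memory in MB."""
    model = nn.Sequential(nn.Linear(768, 3072), nn.GELU(), nn.Linear(3072, 768)).cuda()
    x = torch.randn(64, 512, 768, device="cuda")
    torch.cuda.reset_peak_memory_stats()
    with torch.cuda.amp.autocast(enabled=use_amp):
        out = model(x)
        loss = out.float().mean()
    loss.backward()
    peak = torch.cuda.max_memory_allocated() / 1e6
    # Free everything so the second run starts from a clean slate.
    del model, x, out, loss
    torch.cuda.empty_cache()
    return peak

print("fp32 peak (MB):", peak_memory_mb(False))
print("amp  peak (MB):", peak_memory_mb(True))
```

With autocast enabled, the activations saved for the backward pass are float16, so the peak should come out noticeably lower for the same batch size.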