Closed: yw155 closed this issue 4 years ago
Hi @yw155,
Thanks for reaching out! Could you please post the complete output of nvidia-smi? In particular, it is possible that an unrelated process (a notebook, for example) is running and has cached memory on GPU 1.
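For reference, a per-process view of GPU memory can be obtained directly from the standard nvidia-smi CLI (a sketch; the snippet is guarded so it is a no-op on machines without an NVIDIA driver, and exact query fields may vary by driver version):

```shell
if command -v nvidia-smi >/dev/null 2>&1; then
  # One row per compute process, with its PID, name and GPU memory use.
  GPU_PROCS="$(nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv)"
else
  GPU_PROCS="nvidia-smi not found (no NVIDIA driver on this machine)"
fi
echo "$GPU_PROCS"
```

Comparing this per-process list against the plain `nvidia-smi` table shows which process holds the memory on each GPU.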
Pierre
Hi @pierrestock, thanks for your reply.
The GPU output is here:
You can see there are two processes with the same PID, 39998, while their GPU IDs are 0 and 1, respectively.
I checked, and there is no unrelated project running. The memory consumption on GPU 1 is very large (around 12 GB), so it does not seem to be just the model size.
Thanks for your reply!
This memory is used when quantizing the layers. Indeed, the compute_distances step in em.py:80 broadcasts the distance computation across all available GPUs (see distance.py) for precision. You can disable this parallel computation if it bothers you.
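The actual implementation lives in em.py and distance.py; as a rough stdlib-only sketch of the pattern (the vectors are split into one chunk per worker, threads standing in for GPUs, and each chunk's distance matrix is computed in parallel; all names here are illustrative, not the repository's API):

```python
from concurrent.futures import ThreadPoolExecutor

def sq_dist(x, c):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))

def distances_chunk(vectors, centroids):
    # Distance matrix for one chunk of vectors against all centroids.
    return [[sq_dist(v, c) for c in centroids] for v in vectors]

def compute_distances(vectors, centroids, n_workers=2):
    # Split the vectors into one chunk per worker (one per GPU in the
    # real code), compute each chunk in parallel, then concatenate.
    size = (len(vectors) + n_workers - 1) // n_workers
    chunks = [vectors[i:i + size] for i in range(0, len(vectors), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = pool.map(distances_chunk, chunks, [centroids] * len(chunks))
    return [row for part in parts for row in part]

vectors = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
centroids = [(0.0, 0.0), (5.0, 5.0)]
D = compute_distances(vectors, centroids)
# Nearest centroid per vector, by arg-min over each row.
assignments = [row.index(min(row)) for row in D]
```

Disabling the parallelism then just means keeping a single chunk on a single worker; every GPU that receives a chunk will allocate memory for its share of the distance matrix, which is why GPU 1 fills up during quantization.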
Thus, when you are fine-tuning, only GPU 0 is active and the memory on GPU 1 is not actually in use; it is just cached memory that has not been cleared. When you are quantizing, however, both GPUs should show utilization above 0%.
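The "cached but unused" behavior comes from PyTorch's CUDA caching allocator: freed tensors return their memory to a process-local cache rather than to the driver, so nvidia-smi keeps reporting it until torch.cuda.empty_cache() is called. A toy stdlib-only model of that bookkeeping (illustrative only, not PyTorch's actual allocator):

```python
class CachingAllocator:
    """Toy model of a caching GPU allocator: freed blocks stay reserved
    (and thus visible to nvidia-smi) until the cache is emptied."""

    def __init__(self):
        self.allocated = 0   # bytes held by live tensors
        self.cached = 0      # bytes freed but still held by the process

    def malloc(self, size):
        # Reuse cached bytes first; only grow the reservation if needed.
        reuse = min(size, self.cached)
        self.cached -= reuse
        self.allocated += size

    def free(self, size):
        # Freed memory goes to the cache, not back to the driver.
        self.allocated -= size
        self.cached += size

    @property
    def reserved(self):
        # What nvidia-smi reports for this process on the GPU.
        return self.allocated + self.cached

    def empty_cache(self):
        # Analogue of torch.cuda.empty_cache(): release cached bytes.
        self.cached = 0

alloc = CachingAllocator()
alloc.malloc(12 << 30)        # quantization grabs ~12 GB on GPU 1
alloc.free(12 << 30)          # quantization done, tensors freed...
after_free = alloc.reserved   # ...but nvidia-smi still shows ~12 GB
alloc.empty_cache()
after_empty = alloc.reserved  # reservation released
```

So a large nvidia-smi figure on GPU 1 with 0% utilization after quantization is cached memory, not live usage.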
Hope this helps!
Pierre
Thanks for your detailed explanation.
Hi @pierrestock, I would like to ask about something I observed during training. My machine has two GPUs, and I use the first GPU for training, but the second GPU shows large memory consumption while its utilization stays at 0%. Is this consumption necessary, and how can I avoid it? Thank you.