facebookresearch / kill-the-bits

Code for: "And the Bit Goes Down: Revisiting the Quantization of Neural Networks"

Why does the second GPU have memory consumption? #18

Closed: yw155 closed this issue 4 years ago

yw155 commented 4 years ago

Hi @pierrestock, I would like to ask about something I observed during training. My computer has two GPUs, and I use the first GPU for training, but the second GPU shows a large memory consumption while its utilization is 0%. Is this consumption necessary, and how can I avoid it? Thank you.

pierrestock commented 4 years ago

Hi @yw155,

Thanks for reaching out! Could you please post the complete output of nvidia-smi? In particular, it is possible that an unrelated process (a notebook, for example) is running and holding cached memory on GPU 1.

Pierre

yw155 commented 4 years ago

Hi @pierrestock, thanks for your reply.

The nvidia-smi output is here: [screenshot: gpus]

You can see there are two processes with the same PID 39998, one on GPU 0 and one on GPU 1.

I checked and there is no unrelated project running. The memory consumption on GPU 1 is very large (around 12 GB), so it does not seem to be just the model size.

pierrestock commented 4 years ago

Thanks for your reply!

This memory is used when quantizing the layers. Indeed, the compute_distances step in em.py (line 80) broadcasts the distance computation across all available GPUs (see distance.py for details). You can disable this parallel computation if it bothers you.
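Roughly, the idea looks like the sketch below. This is a simplified illustration under my own assumptions (function name, tensor shapes, chunking strategy), not the actual distance.py implementation: chunks of activation columns are dispatched round-robin over all visible GPUs, which is why GPU 1 allocates memory during quantization.

```python
import torch

def compute_distances_sketch(centroids, activations, chunk_size=4096):
    """Simplified illustration: pairwise squared L2 distances between
    centroids (k x d) and activation columns (d x n), with chunks of
    columns dispatched round-robin over all visible GPUs."""
    n_gpus = torch.cuda.device_count()
    use_cuda = n_gpus > 0
    out = []
    for i, chunk in enumerate(activations.split(chunk_size, dim=1)):
        device = torch.device(f"cuda:{i % n_gpus}") if use_cuda else torch.device("cpu")
        c = centroids.to(device)   # (k, d)
        x = chunk.to(device)       # (d, chunk_size)
        # ||c - x||^2 = ||c||^2 - 2 * c @ x + ||x||^2, broadcast to (k, chunk_size)
        d = (c.pow(2).sum(dim=1, keepdim=True)
             - 2 * c @ x
             + x.pow(2).sum(dim=0, keepdim=True))
        out.append(d.to(centroids.device))  # gather results back on the input device
    return torch.cat(out, dim=1)            # (k, n)
```

Restricting the visible devices before launching the script, for example with CUDA_VISIBLE_DEVICES=0, keeps this kind of computation on a single GPU.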

Thus, when you are finetuning, only GPU 0 is active and the memory on GPU 1 is not actually used; it is just cached memory that has not been cleared. When you are quantizing, however, both GPUs should show utilization above 0%.
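If the cached memory on GPU 1 bothers you, it can also be released manually once quantization is done; a minimal sketch:

```python
import torch

# Release cached blocks held by PyTorch's allocator on GPU 1 so that
# nvidia-smi reports the memory as free again; live tensors are unaffected.
with torch.cuda.device(1):
    torch.cuda.empty_cache()
```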

Hope this helps!

Pierre

yw155 commented 4 years ago

Thanks for your detailed explanation.