1adrianb / face-alignment

:fire: 2D and 3D Face alignment library built using pytorch
https://www.adrianbulat.com
BSD 3-Clause "New" or "Revised" License

GPU memory consumption #228

Open moncio opened 3 years ago

moncio commented 3 years ago

Hi everyone, I'm testing your library and I'm really interested in estimating facial landmarks. I have a computer with a GPU, and I'm using the blazeface detector, which is the fastest one according to the documentation. After several tests, GPU memory usage climbs to ~6.5 GB when I extract 2D landmarks and ~8.5 GB when I extract 3D landmarks. I've also noticed that the network size used is always "Large" while the other 3 sizes are commented out, so I'm wondering how this relates to GPU performance and memory consumption, because I need to reduce the memory usage somehow.

Thanks in advance.

1adrianb commented 3 years ago

The memory consumption is likely dominated by the face detection network. You can partially address this by downsampling your images appropriately (note: make sure you don't downsample to the point where faces are missed). For the latter two networks there is unfortunately not much you can do without modifying the code. If you are willing to adjust the pipeline, you could run each step separately (i.e. do all the detection, then all the 2D face alignment, then get all the z coordinates, dropping each network from memory after its step). In that case your total usage will be the max of the three. Of course, depending on your application this may or may not be suitable. I will have a look at this again.
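A minimal sketch of the staged idea, assuming the public face_alignment API (the FaceAlignment constructor with face_detector='blazeface' and get_landmarks) plus PyTorch's cache-clearing call. It does not split the detector from the alignment network (the constructor loads both together), but it shows how dropping each model before loading the next keeps peak usage at the max over stages rather than the sum. The image paths and the LandmarksType spelling (_2D vs TWO_D across releases) are assumptions.

```python
import gc

import torch
import face_alignment
from skimage import io

image_paths = ["img_0001.jpg", "img_0002.jpg"]  # hypothetical input files
images = [io.imread(p) for p in image_paths]

# Stage 1: 2D landmarks for every image, then drop the model from GPU memory.
fa_2d = face_alignment.FaceAlignment(
    face_alignment.LandmarksType._2D,   # spelled LandmarksType.TWO_D in newer releases
    face_detector='blazeface',
    device='cuda')
landmarks_2d = [fa_2d.get_landmarks(img) for img in images]
del fa_2d
gc.collect()
torch.cuda.empty_cache()  # release cached blocks so the next model can reuse them

# Stage 2: 3D landmarks, loaded only after the 2D model is gone.
fa_3d = face_alignment.FaceAlignment(
    face_alignment.LandmarksType._3D,   # spelled LandmarksType.THREE_D in newer releases
    face_detector='blazeface',
    device='cuda')
landmarks_3d = [fa_3d.get_landmarks(img) for img in images]
del fa_3d
gc.collect()
torch.cuda.empty_cache()
```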

Kitty-sunray commented 2 years ago

You can put the detector in a separate file and run it as `python detector.py input.jpg output.npy`; the garbage collector (and the process exiting) will take care of freeing the memory. Note that this requires an additional file write/read, but no other solution seems as easy.
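A minimal sketch of that subprocess approach, under the same API assumptions as above; the script name and the input.jpg/output.npy file names are just the hypothetical ones from the command line suggested in the comment.

```python
# detector.py -- one process per image; all GPU memory is released when the
# process exits, so nothing accumulates across images.
import sys

import numpy as np
import face_alignment
from skimage import io


def main():
    input_path, output_path = sys.argv[1], sys.argv[2]  # e.g. input.jpg output.npy
    fa = face_alignment.FaceAlignment(
        face_alignment.LandmarksType._2D,  # spelled LandmarksType.TWO_D in newer releases
        face_detector='blazeface',
        device='cuda')
    landmarks = fa.get_landmarks(io.imread(input_path))
    # get_landmarks returns a list of (68, 2) arrays (one per detected face) or None.
    np.save(output_path, np.array(landmarks if landmarks is not None else []))


if __name__ == "__main__":
    main()
```

On the caller side you would invoke it once per image, e.g. `subprocess.run(["python", "detector.py", "input.jpg", "output.npy"], check=True)`, and load the result with `np.load`.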