Open Wenlong0913 opened 5 years ago
@Wenlong0913 Hello, I also want to improve the speed, but there was little improvement after using CUDA. Have you found a better way to accelerate it? If so, could you please tell me? Thank you very much.
Nope. Perhaps you can try this implementation. It's faster. https://github.com/Seanlinx/mtcnn
@Wenlong0913 I am facing the same problem. Could you give some hints on how much faster it is than this one? And do you think the key cause of the low GPU usage is MTCNN itself or an inefficient implementation? Also, the MTCNN you provided has no landmark prediction. How do you use it to align faces?
Check out this fork: https://github.com/innerlee/face.evoLVe.PyTorch
In [1]: from PIL import Image
...: from detector import detect_faces
In [2]: img = Image.open('../disp/Fig1.png').convert('RGB')
In [3]: %time detect_faces(img)
CPU times: user 2.85 s, sys: 172 ms, total: 3.02 s
Wall time: 610 ms
In [1]: from PIL import Image
In [2]: from evolveface import detect_faces, show_results
In [3]: img = Image.open('disp/Fig1.png').convert('RGB')
In [4]: %time detect_faces(img)
CPU times: user 255 ms, sys: 6.05 ms, total: 261 ms
Wall time: 42.3 ms
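The `%time` magic used in the two sessions above only exists in IPython. As a rough sketch, the same wall-clock and CPU-time measurement can be reproduced in a plain script with the standard library. `time_call` and `busy_work` are hypothetical names introduced here for illustration; in practice `busy_work` would be replaced by a call such as `detect_faces(img)`.

```python
import time


def time_call(fn, *args, repeats=5, **kwargs):
    """Run fn several times; return (best wall seconds, best CPU seconds, last result)."""
    best_wall = float("inf")
    best_cpu = float("inf")
    result = None
    for _ in range(repeats):
        wall_start = time.perf_counter()   # monotonic wall-clock timer
        cpu_start = time.process_time()    # user + sys CPU time of this process
        result = fn(*args, **kwargs)
        cpu = time.process_time() - cpu_start
        wall = time.perf_counter() - wall_start
        best_wall = min(best_wall, wall)
        best_cpu = min(best_cpu, cpu)
    return best_wall, best_cpu, result


def busy_work(n):
    """Hypothetical stand-in for detect_faces(img)."""
    return sum(i * i for i in range(n))


wall, cpu, value = time_call(busy_work, 100_000)
print(f"wall: {wall * 1000:.1f} ms, cpu: {cpu * 1000:.1f} ms")
```

When the CPU time is larger than the wall time, as in the first session (2.85 s user against 610 ms wall), the work was spread across several threads.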
@innerlee so it's all about pillow-simd?
There are lots of code optimizations as well.
@innerlee Does it affect model performance?
Purely speed changes. The bottleneck is not model inference.
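One way to check where the time goes is to profile a single call. The following is a minimal sketch using the standard-library `cProfile`; `profile_call` and the `fake_*` functions are hypothetical stand-ins, not code from either repository. Passing the real `detect_faces` and an image instead would list the functions that dominate the cumulative time.

```python
import cProfile
import io
import pstats


def profile_call(fn, *args, top=10, **kwargs):
    """Profile one call of fn; return (result, text report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args, **kwargs)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buf.getvalue()


# Hypothetical stand-ins for the stages of a detection pipeline.
def fake_preprocess(n):
    return [i * 0.5 for i in range(n)]


def fake_inference(xs):
    return sum(xs)


def fake_pipeline(n=200_000):
    return fake_inference(fake_preprocess(n))


result, report = profile_call(fake_pipeline)
print(report)
```

If image resizing or other preprocessing sits at the top of the report rather than the network's forward pass, moving the model to the GPU alone is unlikely to help much.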
@innerlee I'm checking it out. Great work
@innerlee Hello, can you share the estimated time for training this repo on the full Celeb-1M dataset? I tried using 4 GPUs, but my estimated time is far too long, like 200 days for a single batch. I cannot believe it. After ten hours of training using only 1/3 of the data, it's still in the first epoch, at batch 1920/411750. Can you share your training status? Thanks.
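The progress figures in this comment allow a rough extrapolation. The sketch below assumes constant throughput and that 411750 is the number of batches in one epoch; both are assumptions, and `estimate_epoch_time` is a hypothetical helper.

```python
def estimate_epoch_time(elapsed_hours, batches_done, batches_per_epoch):
    """Extrapolate seconds per batch and days per epoch from progress so far."""
    seconds_per_batch = elapsed_hours * 3600 / batches_done
    epoch_days = seconds_per_batch * batches_per_epoch / 86400
    return seconds_per_batch, epoch_days


# Figures reported in the comment: 10 hours, batch 1920 of 411750.
sec_per_batch, days = estimate_epoch_time(10, 1920, 411750)
print(f"{sec_per_batch:.2f} s/batch, about {days:.1f} days per epoch")
# prints "18.75 s/batch, about 89.4 days per epoch"
```

Under these assumptions each batch takes close to 19 seconds, so the time spent per batch is the number worth investigating first.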
I use the provided weights for inference. Haven't tried training :shrug:
I also have this issue. I'm running the dataset on a Tesla K80.
To whom it may concern,
This repo provides really amazing tools. Thanks for the great work. I tried face alignment and feature extraction using this library. I found that face alignment may take 1.3 s to process an image. After reading the code, I realized that mtcnn is not running on the GPU. I made a few small changes, e.g., torch.FloatTensor => torch.cuda.FloatTensor, Pnet() => Pnet().cuda(), etc. This reduced the face alignment time per image from 1.3 s to 0.8 s. It works; however, I am not satisfied with the result. Is there a way to make face detection/alignment run faster?
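For reference, a device-agnostic way to write the kind of change described above is to pick a `torch.device` once and move both the model and its inputs with `.to(device)`, instead of switching tensor types by hand. This is only a sketch: `TinyNet` is a hypothetical stand-in, not the repository's PNet, and `run_batch` is a name introduced here for illustration.

```python
import torch
import torch.nn as nn

# Use the GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class TinyNet(nn.Module):
    """Hypothetical stand-in for one small convolutional stage."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3)
        self.act = nn.PReLU(8)

    def forward(self, x):
        return self.act(self.conv(x))


model = TinyNet().to(device).eval()


def run_batch(images):
    """Run a whole batch on the selected device and return the result on the CPU."""
    with torch.no_grad():            # inference only, no gradient bookkeeping
        batch = images.to(device)    # one host-to-device copy per batch
        out = model(batch)
    return out.cpu()


images = torch.rand(16, 3, 12, 12)   # 16 random 12x12 RGB crops
out = run_batch(images)
print(out.shape)  # torch.Size([16, 8, 10, 10])
```

Sending many crops in one batch, rather than one tensor at a time, gives the GPU more work per call. Whether that helps here depends on how much of the 0.8 s is spent outside the network.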
Another thing confuses me: the GPU usage is very low, 1%~2%. Please see the attachments.
I'm not sure whether this is because I didn't configure the GPU properly or whether it is just one of the advantages of this library. The installed CUDA version is 9.2, and the cuDNN version is 7.4. The graphics card is an RTX 2070. It reports an error after I run the Python code. Can anyone tell me how to fix it?
Again, many thanks for the great work!