serengil / deepface

A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
https://www.youtube.com/watch?v=WnUVYQP4h44&list=PLsS_1RYmYQQFdWqxQggXHynP1rqaYXv_E&index=1
MIT License

running deepface on gpu #251

Closed ashishlal closed 3 years ago

ashishlal commented 3 years ago

I have installed tensorflow-gpu instead of tensorflow after doing pip install deepface. How can I run deepface on a GPU, or be sure that deepface is running on a GPU? DeepFace.verify with 'VGG-Face' takes close to 6 seconds. I would like to reduce this by using a GPU.
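One way to check the "is it running on a GPU" part of this question is to ask TensorFlow which GPUs it can see. The sketch below is a minimal check, assuming TensorFlow 2.1 or later (where `tf.config.list_physical_devices` is available). The helper name `visible_gpus` is made up for illustration. It only shows that TensorFlow detects a GPU, not that a particular deepface call uses it.

```python
import importlib.util


def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is not installed."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf  # imported lazily so the check degrades gracefully

    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment")
elif gpus:
    print("TensorFlow sees GPU(s):", gpus)
else:
    print("No GPU visible; TensorFlow will fall back to the CPU")
```

An empty list usually points to a CPU-only TensorFlow build or a CUDA/driver mismatch rather than to deepface itself.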

serengil commented 3 years ago

You need to install tensorflow-gpu first and deepface second. The trick here is that you should install deepface with no dependencies; otherwise, it will install regular tensorflow instead of tensorflow-gpu.

pip install deepface --no-deps

BTW, 6 seconds is too long even on a CPU. How do you call the verify function? Do you pass a pre-trained model to the verify function? This will speed you up.

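from deepface import DeepFace

# Build the model once and reuse it; img1 and img2 are image paths or numpy arrays.
model = DeepFace.build_model("VGG-Face")
for i in range(100):
    DeepFace.verify(img1, img2, model_name="VGG-Face", model=model)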
ashishlal commented 3 years ago

It takes close to 6 seconds on google colab pro and also on my local GPU. I am using a pretrained model with weights downloaded. This is how I am installing deepface.

!pip install tensorflow-gpu
!pip install gdown --no-deps
!pip install mtcnn --no-deps
!pip install retinaface --no-deps
!pip install utilpack --no-deps
!pip install slack_sdk --no-deps
!pip install pdfrw --no-deps
!pip install pycryptodome --no-deps
!pip install pymysql --no-deps
!pip install deepface --no-deps
img_path1 = '/content/fr/pics_db/a1.png'
img_path2 = '/content/fr/pics_db/a2.png'
img_path3 = '/content/fr/a3.jpg'

model = DeepFace.build_model(model_name='VGG-Face')

%%time 

resp = DeepFace.verify(img1_path = img_path1, img2_path = img_path3, model=model)
CPU times: user 8.83 s, sys: 190 ms, total: 9.02 s
Wall time: 7.62 s
serengil commented 3 years ago

Everything seems OK. Could you try Facenet or ArcFace? They might be faster.

ashishlal commented 3 years ago

ArcFace takes around 4.5 seconds

miladrasooli commented 3 years ago

I also have the same problem. I have an Ubuntu 16 machine with a 24-core CPU, and I built TensorFlow for it. It takes about 4 seconds to verify faces with Facenet and dlib. Is there a way to boost the performance?

alvarobasi commented 3 years ago

I'm getting the same problem with processing times: with ArcFace and mtcnn, I'm getting around 5 seconds per execution.

serengil commented 3 years ago

Try passing the images as an array to the verify function:

resp = DeepFace.verify(["img1.jpg", "img2.jpg", ...], model=model)

alvarobasi commented 3 years ago

I personally use a preloaded image as a numpy array (cv Mat imdecode) instead of image paths.

serengil commented 3 years ago

@alvarobasi So store them in a Python list; this will speed you up, because each call of the verify function builds the mtcnn model in the background.

resp = DeepFace.verify([
    [img1, img2],
    [img1, img3],
    [img2, img3],
], model=model)

here img1, img2, img3 are numpy arrays
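For more than a handful of images, the pair list can be generated rather than typed out. This is a small sketch using `itertools.combinations`. The string placeholders stand in for the numpy arrays or image paths, and the final verify call is left commented out because it depends on the deepface version discussed in this thread.

```python
from itertools import combinations

# Placeholder names; in practice these would be image paths or numpy arrays.
images = ["img1", "img2", "img3"]

# Each pair is converted to a list to match the nested-list form shown above.
pairs = [list(pair) for pair in combinations(images, 2)]
print(pairs)  # [['img1', 'img2'], ['img1', 'img3'], ['img2', 'img3']]

# resp = DeepFace.verify(pairs, model=model)  # as described in this thread; not executed here
```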

alvarobasi commented 3 years ago

I'm still getting 6 s of processing time when passing a Python list... As far as I understood, commons.functions.initialize_detector(detector_backend = 'mtcnn') initializes the mtcnn detector. Doesn't this apply to the verify function?

results = DeepFace.verify([[selfie_img_np, id_img_np]], model_name= 'ArcFace', model= loaded_model, distance_metric='cosine')

serengil commented 3 years ago

It is initialized once per call of the verify function. It seems that the fastest usage takes 6 seconds in your environment.

BTW, I can run it in 1 second on my MacBook.
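The cost of rebuilding a detector on every call, as discussed above, is commonly avoided by caching the built object. The sketch below shows that general pattern with `functools.lru_cache`. It is not deepface's implementation: `get_detector`, `build_count` and the dictionary standing in for a detector are all hypothetical.

```python
from functools import lru_cache

build_count = 0  # counts how many times the expensive build actually runs


@lru_cache(maxsize=None)
def get_detector(backend):
    """Build the (hypothetical) detector for `backend` once and reuse it afterwards."""
    global build_count
    build_count += 1
    return {"backend": backend}  # placeholder for an expensive detector object


# Three requests for the same backend trigger only one build.
for _ in range(3):
    detector = get_detector("mtcnn")

print(build_count)
```

With a cache like this, only the first request pays the build cost and later calls reuse the same object.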

alvarobasi commented 3 years ago

I don't get it... I am using a V100 GPU instance for running this and I'm getting the same performance as with CPU only. Which resolution are you using for your images?

serengil commented 3 years ago

Here, you can find my testing images: https://github.com/serengil/deepface/tree/master/tests/dataset

alvarobasi commented 3 years ago

Even with these shapes (shape: (65, 105, 3) shape: (170, 130, 3)) I'm getting 2.5-3.5 seconds of processing... I confirmed I am using the GPU as I forced the GPU memory to be full and the program triggered a CUDA OOM error.

serengil commented 3 years ago

I just published deepface 0.0.60. Many production-driven performance issues are handled in this release. Please update the package and re-try.