serengil / deepface

A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
https://www.youtube.com/watch?v=WnUVYQP4h44&list=PLsS_1RYmYQQFdWqxQggXHynP1rqaYXv_E&index=1
MIT License
14.34k stars, 2.2k forks

No or little improvements with tensorflow using GPU vs non-gpu #333

Closed kyuhyong closed 3 years ago

kyuhyong commented 3 years ago

Hello,

import time
import cv2
import tensorflow as tf
print("TensorFlow version: {}".format(tf.__version__))
from tensorflow import keras
print("Keras version:{}".format(keras.__version__))
from deepface import DeepFace
print("GPU device: {}".format(tf.test.gpu_device_name()))
cap = cv2.VideoCapture(0)
# Check if the webcam is opened correctly
if not cap.isOpened():
    raise IOError("Cannot open webcam")
while True:
    ret, frame = cap.read()
    time_last = time.time()
    if ret:
        frame = cv2.resize(frame, None, fx=1.0, fy=1.0, interpolation=cv2.INTER_AREA)
        obj = DeepFace.analyze(frame, actions = ['age', 'gender', 'race', 'emotion'], enforce_detection = False)
        fps = 1/(time.time() - time_last)
        print("FPS : {}".format(fps))
        time_last = time.time()
        print(obj)
        cv2.imshow('Input', frame)
        c = cv2.waitKey(1)
        if c == 27:
            break
cap.release()
cv2.destroyAllWindows()

This short script opens a webcam and passes each frame to deepface. With tensorflow 2.2.0 it runs at about 2.3 frames per second.

Then I realized it was not using my installed GPU (RTX 2060), so I removed deepface and pip-installed tensorflow-gpu==2.5.0. I chose 2.5.0 because it is the latest version that supports the CUDA 11.4 I installed. I checked that import tensorflow as tf reports Successfully opened dynamic library libcudart.so.11.0, and print(tf.test.gpu_device_name()) then shows:

2021-09-06 15:49:23.539582: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/device:GPU:0 with 4345 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
/device:GPU:0

After reinstalling deepface, I ran the code again, but I am still getting around 2.5 FPS, which is not much of an improvement.

I also checked nvidia-smi, which shows over 60% GPU utilization:
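
To compare the CPU and GPU runs fairly, it may help to time the analysis call on its own and to discard the first few calls, which can include one-off costs such as model loading and GPU initialization. The following is a minimal sketch using only the standard library. `benchmark` and `fake_analyze` are hypothetical names introduced here for illustration. In the real script the workload would be the `DeepFace.analyze` call on a captured frame.

```python
import time
from statistics import mean


def benchmark(fn, warmup=2, runs=10):
    """Return the mean seconds per call of fn(), ignoring `warmup` initial calls."""
    # Warm-up calls are executed but not timed, so one-off setup cost
    # does not distort the steady-state figure.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return mean(samples)


def fake_analyze():
    """Stand-in workload (hypothetical); replace with the real analysis call."""
    time.sleep(0.01)


if __name__ == "__main__":
    seconds_per_call = benchmark(fake_analyze)
    print("Mean latency: {:.4f} s, about {:.1f} FPS".format(
        seconds_per_call, 1.0 / seconds_per_call))
```

Running the same helper once per action (for example only `'age'`, then only `'emotion'`) would show which stage dominates the per-frame time.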

$ nvidia-smi
Mon Sep  6 15:53:39 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   61C    P2    78W /  N/A |   5536MiB /  5934MiB |     63%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1179      G   /usr/lib/xorg/Xorg                157MiB |
|    0   N/A  N/A      1679      G   /usr/bin/gnome-shell               66MiB |
|    0   N/A  N/A      2040      G   ...AAAAAAAAA= --shared-files       30MiB |
|    0   N/A  N/A      7101      G   ...AAAAAAAAA= --shared-files       26MiB |
|    0   N/A  N/A      9952      C   python3                          5250MiB |
+-----------------------------------------------------------------------------+

What am I doing wrong here? I appreciate all your hard work!

serengil commented 3 years ago

It actually shows that you have already allocated GPU memory:

5536MiB / 5934MiB

However, this is not enough for the analyze tasks. That is why I would not recommend using a GPU with low memory.
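
A note on reading the memory figure: by default TensorFlow reserves nearly all of the memory on each visible GPU when it initializes, so a high Memory-Usage value in nvidia-smi does not by itself show how much the models need. The sketch below, assuming TensorFlow 2.x, asks for on-demand allocation instead so that the reported usage tracks actual demand more closely. `enable_gpu_memory_growth` is a hypothetical helper name. It has to be called before any model is loaded.

```python
def enable_gpu_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand.

    Returns the names of the GPUs for which the setting was applied,
    or an empty list if TensorFlow or a GPU is not available.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return []

    enabled = []
    for gpu in tf.config.list_physical_devices("GPU"):
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            enabled.append(gpu.name)
        except (RuntimeError, ValueError):
            # Raised when the GPU has already been initialized or the
            # device is invalid; the setting cannot be changed then.
            pass
    return enabled


if __name__ == "__main__":
    print("Memory growth enabled for:", enable_gpu_memory_growth())
```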

BTW, the stream function does this with a single line of code. Why do you prefer to write your own implementation?
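
For reference, a call to the stream function might look like the sketch below. The `db_path` value is a hypothetical placeholder for a folder of reference face images, and the wrapper name `run_stream` is introduced here only for illustration. The keyword arguments are assumed to match the installed deepface version.

```python
def run_stream(db_path, source=0):
    """Run deepface's built-in webcam loop (requires deepface and a camera)."""
    # Imported lazily so that defining this helper does not load TensorFlow.
    from deepface import DeepFace

    # enable_face_analysis=True overlays the facial attribute analysis;
    # source=0 selects the default webcam.
    DeepFace.stream(db_path=db_path, enable_face_analysis=True, source=source)


if __name__ == "__main__":
    run_stream("my_face_db")  # hypothetical folder name
```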