cosanlab / py-feat

Facial Expression Analysis Toolbox
https://py-feat.org/

Benchmarking Py-Feat #184

Open · ljchang opened 11 months ago

ljchang commented 11 months ago

Many py-feat users have reported slow inference when detecting facial expression features. We would like to find ways to speed this up.

We were hoping users might be able to share some data to help us identify where we should focus our efforts.

First, initialize a detector and load our test images:

import os
import torch
import feat
from torchvision.io import read_image
from feat.detector import Detector
from feat.utils.io import get_test_data_path

detector = Detector(verbose=False, device='cpu')

single_face = os.path.join(get_test_data_path(), "single_face.jpg")
single_face_img = read_image(single_face)
multi_face = os.path.join(get_test_data_path(), "multi_face.jpg")
multi_face_img = read_image(multi_face)

print(detector.info['face_model'],
      detector.info['landmark_model'],
      detector.info['facepose_model'],
      detector.info['au_model'],
      detector.info['emotion_model'],
      detector.info['identity_model'])

print(f'pyfeat_version: {feat.__version__}')
print(f'torch_version: {torch.__version__}')

Next, run each individual detector on the single_face test and record each run time in milliseconds (one way to capture these timings is sketched after the code block):

out = detector.detect_image(single_face, batch_size=1)
faces = detector.detect_faces(single_face_img)
landmarks = detector.detect_landmarks(single_face_img, detected_faces=faces)
poses = detector.detect_facepose(single_face_img)
aus = detector.detect_aus(single_face_img, landmarks)
emotions = detector.detect_emotions(single_face_img, faces, landmarks)
embeddings = detector.detect_identity(single_face_img, faces)
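
The calls above don't record timings themselves; a minimal way to capture per-call run times is with `time.perf_counter()`. The `time_ms` helper here is just a sketch, not part of the py-feat API:

import time

def time_ms(fn, *args, **kwargs):
    """Run fn once and return its result plus the elapsed time in milliseconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, (time.perf_counter() - start) * 1000

faces, t = time_ms(detector.detect_faces, single_face_img)
print(f"detect_faces: {t:.1f} ms")
landmarks, t = time_ms(detector.detect_landmarks, single_face_img, detected_faces=faces)
print(f"detect_landmarks: {t:.1f} ms")

For more stable numbers, consider repeating each call a few times and reporting the median, since the first call can include one-off model loading.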

Finally, run each individual detector on the multi_face test and record the run times in milliseconds the same way:

out = detector.detect_image(multi_face, batch_size=1)
faces = detector.detect_faces(multi_face_img)
landmarks = detector.detect_landmarks(multi_face_img, detected_faces=faces)
poses = detector.detect_facepose(multi_face_img)
aus = detector.detect_aus(multi_face_img, landmarks)
emotions = detector.detect_emotions(multi_face_img, faces, landmarks)
embeddings = detector.detect_identity(multi_face_img, faces)

Please post your results in this Google Sheet along with some information about your system.

We are hoping to get data from a wide variety of computing systems, with both CPUs and GPUs.
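
If it helps, here's a quick way to pull together the system details using only the standard library and torch; treat it as a sketch rather than a required format:

import platform
import torch
import feat

print(f"pyfeat_version: {feat.__version__}")
print(f"torch_version: {torch.__version__}")
print(f"python_version: {platform.python_version()}")
print(f"platform: {platform.platform()} ({platform.machine()})")
print(f"cuda_available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"gpu: {torch.cuda.get_device_name(0)}")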

ejolly commented 11 months ago

@ljchang I've also added some more detailed profiling instructions here using snakeviz.
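
For anyone who hasn't profiled with snakeviz before, the basic pattern (assuming the benchmark code above is defined at module level) looks roughly like this:

import cProfile

# Write a cProfile trace of a single detection call to disk.
# cProfile.run executes the statement in the __main__ namespace,
# so `detector` and `single_face` need to be defined there as above.
cProfile.run("detector.detect_image(single_face, batch_size=1)", "detect.prof")

# Then, from a shell, open the trace in the browser:
#   pip install snakeviz
#   snakeviz detect.prof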

One immediate optimization: our default AU model (XGB) reloads its model weights on every detection call instead of loading them once during initialization, the way the SVM model does. This should be a pretty quick fix.
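
I don't know exactly what the final fix will look like, but the pattern is just hoisting the weight load out of the per-call path. A hypothetical sketch (the helper and class names are illustrative, not py-feat's actual internals):

# Hypothetical stand-ins; py-feat's real loading/feature code differs.
def load_au_weights(): ...
def extract_features(frame, landmarks): ...

class XGBAUDetectorBefore:
    """Weights reloaded on every call (current XGB behavior)."""
    def detect_aus(self, frame, landmarks):
        model = load_au_weights()       # disk I/O on every detection
        return model.predict(extract_features(frame, landmarks))

class XGBAUDetectorAfter:
    """Weights loaded once at init (matches the SVM model)."""
    def __init__(self):
        self.model = load_au_weights()  # disk I/O happens once
    def detect_aus(self, frame, landmarks):
        return self.model.predict(extract_features(frame, landmarks))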