Closed connormeaton closed 3 years ago
Sorry for delay, the 30fps were not including the detector and the use case is very suboptimal.
The first forward pass will be significantly slower as the network will initialize, load the models etc. You should:
a) Create the fa model a single time, then reuse it in the subsequent calls to .get_landmarks().
Example of pseudo-code:

import time
import face_alignment

# face_detector / face_detector_kwargs are placeholders; see (b) below
fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._3D, device='cuda', flip_input=False,
                                  face_detector=face_detector, face_detector_kwargs=face_detector_kwargs)

# warmup: the first call is slower (model loading, CUDA initialization)
_ = fa.get_landmarks(all_images[0])

start = time.time()
for img in all_images:
    preds = fa.get_landmarks(img)
end = time.time()
print((end - start) / len(all_images))
Note that the time library is not ideal for measuring this, but it should be sufficiently accurate in this case.
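The warmup-then-average pattern above can be factored into a small helper. This is just a sketch: mean_latency is a hypothetical name, and it uses time.perf_counter rather than time.time, since perf_counter is designed for interval measurement.

```python
import time

def mean_latency(fn, inputs, warmup=1):
    # Run a few warmup calls first so one-time costs (model loading,
    # CUDA initialization) are not counted in the average.
    for x in inputs[:warmup]:
        fn(x)
    start = time.perf_counter()
    for x in inputs:
        fn(x)
    # Average seconds per call over the full pass.
    return (time.perf_counter() - start) / len(inputs)
```

Usage would then be, e.g., mean_latency(fa.get_landmarks, all_images).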
b) Depending on the resolution of your images and the face detector used, the speed will vary; for example, 'sfd' is relatively slow, so you could try using 'blazeface' instead.
Thank you for open sourcing this code. I am interested in the landmark detection. I am using it on an AWS EC2 instance (p2.xlarge with 8 NVIDIA K80 GPUs). All I'm running is the face detector and the landmark function. It looks like this:
When I print the time, I'm getting between 9 and 12 seconds to predict landmarks on one image. I have tried changing the device='' parameter to 'cpu' and 'cuda', but it's not going any faster. I see from other issues that you expect to be able to predict landmarks on 30 images per second on a GPU. How can I do this?
Thanks.