frankier / skelshop

📺 📰 🧑‍💼 Toolkit for skeleton & face analysis of talking heads (e.g. news) videos 🧑‍💼 📰 📺
https://frankier.github.io/skelshop
MIT License

Face-Recognition: cnn_face_detector returns empty lists #18

Closed cstenkamp closed 3 years ago

cstenkamp commented 3 years ago

When running the face command, I noticed that the cnn_face_detector at https://github.com/frankier/skelshop/blob/master/skelshop/face/pipe.py#L76 takes quite some time on my CPU, but always returns empty lists. However, if I import face_recognition.face_locations and run face_recognition.face_locations(frames[0]) in the same place, it works just fine and returns face locations (and runs a lot quicker).
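For reference, the quick check that does work looks roughly like this (the frame file here is just illustrative; in my case the frames come from skelshop's own decoding):

```python
import face_recognition

# Illustrative only: load one frame as an RGB array and run the default
# (HOG-based) detector from the face_recognition package.
frame = face_recognition.load_image_file("frame0.jpg")
locations = face_recognition.face_locations(frame)
print(locations)  # list of (top, right, bottom, left) boxes; empty if no face is found
```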

frankier commented 3 years ago

I've just pushed a fix. The problem was that the frames were BGR rather than RGB, which the CNN detector seems to be more sensitive to.
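The gist of the fix is just a colour conversion before detection; a minimal sketch (the actual code in pipe.py is structured differently):

```python
import cv2
import numpy as np

def to_rgb(frame_bgr: np.ndarray) -> np.ndarray:
    # Frames decoded with OpenCV come out as BGR; dlib's CNN detector is
    # much more sensitive to channel order than the HOG detector, so
    # convert to RGB before handing frames to it.
    return cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
```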

We could use the HOG detector rather than the CNN detector for faster CPU detection. This would work well as an optional argument to the face command. (I would like to avoid automatically choosing based on CPU/GPU availability as face_detection does; people should have to opt in to a fast/low-quality detector.)
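As a sketch of how that opt-in could look (this is not the current skelshop interface, just face_recognition's existing model argument):

```python
import face_recognition

def detect_faces(frame_rgb, detector="cnn"):
    # detector="hog" is much faster on CPU but lower quality;
    # detector="cnn" is the slower, higher-quality option.
    return face_recognition.face_locations(frame_rgb, model=detector)
```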

However, I've moved on to using the 68 keypoint skeleton from OpenPose to get face locations. This way, our faces are already aligned with the skeletons from the beginning.

I am currently working on making it possible to get face detections using only the face keypoints in BODY_25, so that we don't have to run the (slow) OpenPose face detector. Using either this or the OpenPose face keypoints is the preferred way of getting face embeddings, rather than dlib's face detectors. See https://github.com/frankier/skelshop/issues/15
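For illustration, deriving a rough face box from the BODY_25 head keypoints could look something like this (indices follow OpenPose's BODY_25 layout: 0 nose, 15/16 eyes, 17/18 ears; this is only a sketch, not the skelshop implementation):

```python
import numpy as np

HEAD_IDXS = [0, 15, 16, 17, 18]  # nose, right eye, left eye, right ear, left ear

def face_box_from_body25(pose, conf_thresh=0.1, pad=1.5):
    """pose: (25, 3) array of (x, y, confidence). Returns (x0, y0, x1, y1) or None."""
    pts = np.asarray(pose)[HEAD_IDXS]
    pts = pts[pts[:, 2] > conf_thresh]          # keep confidently detected head points
    if len(pts) < 2:
        return None
    cx, cy = pts[:, 0].mean(), pts[:, 1].mean()
    half = pad * max(np.ptp(pts[:, 0]), np.ptp(pts[:, 1])) / 2
    return (cx - half, cy - half, cx + half, cy + half)
```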

cstenkamp commented 3 years ago

Huh, interesting. I also thought that might be the reason and tested it with a few frames after converting to RGB, but that didn't help. Maybe those frames just happened not to contain any faces. What you're talking about here is behind the --from-skels argument of the face command, right?

frankier commented 3 years ago

That is a bit odd, but if it's a random video with some front matter, you will usually be waiting quite a while on a CPU before you get to a frame that contains faces.

Yes, exactly. Currently it only works with dumps that are BODY_25_ALL, i.e. body + hands + face. So it should work with the Ellen videos if you managed to import them from Peter's zip using skelshop conv single-zip foo.zip foo.h5 (see https://frankier.github.io/skelshop/cli/).
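Roughly, the workflow would be as follows; the conv line is the command above, while the face invocation is only indicative, so check the CLI docs for the exact arguments:

```
skelshop conv single-zip foo.zip foo.h5
skelshop face --from-skels ...   # exact arguments per the CLI docs
```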

Just in case you have access to the Case Midwestern computing resources, you can use the ones in my "gallina home", but you have to set the environment variable LEGACY_SKELS=1, since they use the old 135-point BODY_25_ALL keypoints rather than the new 138 keypoints.

Using these already-dumped skeletons should give a fairly significant speedup.

frankier commented 3 years ago

Closing as concluded. Feel free to reopen.