luxai-qtrobot / QA

Virtual repository for Questions & Answers system
http://luxai-qtrobot.github.io

Nuitrack face detection requires subject to be at least ~2 feet away #16

Open RMichaelSwan opened 4 years ago

RMichaelSwan commented 4 years ago

As mentioned in the title, we need to be able to detect faces that are closer to QT (say ~1 foot away) in order to simulate face-to-face communication with a person (you don't usually stand 2+ feet away from people when talking to them). Deep-learning-based face detectors and even older techniques (Haar cascades) don't have this limitation. Are there any plans to update /qt_nuitrack_app/faces to be more robust?

apaikan commented 4 years ago

I deleted my previous comments because I had wrongly understood that you wanted to filter out faces that are further away than some specific distance!

Coming back to your question: you have access to the robot camera images, so you can feed this image stream to any deep-learning or other type of face recognition library and use it with your application. For example, you may take a look at the Intel® Movidius™ Neural Compute SDK, which has plenty of examples, including face recognition with different neural networks, to be used with or without the Intel compute stick.
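As a minimal sketch of that idea, something like the following should work with OpenCV's stock Haar cascade. The image topic name here is an assumption; use whichever RGB topic your robot actually publishes.

```python
#!/usr/bin/env python
# Sketch: feed the robot's RGB stream into an off-the-shelf face detector.
# "/camera/color/image_raw" is an assumed topic name -- substitute the image
# topic your QTrobot actually publishes.
import rospy
import cv2
from cv_bridge import CvBridge
from sensor_msgs.msg import Image

bridge = CvBridge()
# OpenCV ships this frontal-face Haar cascade; it has no minimum-distance limit.
# The cv2.data path works for pip-installed OpenCV; adjust if yours differs.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def on_image(msg):
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        rospy.loginfo("face at x=%d y=%d w=%d h=%d", x, y, w, h)

rospy.init_node("simple_face_detector")
rospy.Subscriber("/camera/color/image_raw", Image, on_image, queue_size=1)
rospy.spin()
```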

RMichaelSwan commented 4 years ago

I am already looking into some deep learning model solutions; I was just hoping that some code or library might be readily available, or that we could get some support from the Nuitrack team. One thing to note is that most deep learning models use only 2D RGB data and do not make use of depth (it would be handy if we could identify upper body joints, but joint tracking only seems to work when the torso is visible). There is also the issue of inference time: Nuitrack runs very fast (~80% CPU usage on one core while maintaining 30 FPS at a good resolution), while some of the most efficient face detectors I have found struggle to get more than 5-10 FPS (with maxed-out CPU usage on the NUC) at low resolutions on our CPU-bound setup.

The Movidius resource (especially the app zoo) is one I wasn't aware of though, I'll try to see how it performs.

andrewpbstout commented 3 years ago

Hi. I'm re-animating this thread just to ask @RMichaelSwan what you settled on, and ask the wider community (if anyone else is listening) what folks are using for face reco on QT.

I just started playing with face detection about half an hour ago. So far I've just done a rostopic echo /qt_nuitrack_app/faces and moved around in front of the robot. I can see that it's working, but detection seems a little spotty. Maybe because I'm in a small room and can only get about 2 feet away from the robot, and the usual distance is more like 18 inches?
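In case it's useful to anyone else poking at the same topic, here is roughly the equivalent of that rostopic echo in code. I'm guessing the message and field names from the echo output, so double-check them against the message definitions in qt_nuitrack_app.

```python
#!/usr/bin/env python
# Logs Nuitrack face detections instead of `rostopic echo /qt_nuitrack_app/faces`.
# Message/field names (Faces, faces, age, emotion_angry) are inferred from the
# echo output and may not match the actual definitions -- verify before use.
import rospy
from qt_nuitrack_app.msg import Faces

def on_faces(msg):
    for face in msg.faces:
        rospy.loginfo("age=%s emotion_angry=%s", face.age, face.emotion_angry)

rospy.init_node("nuitrack_face_logger")
rospy.Subscriber("/qt_nuitrack_app/faces", Faces, on_faces, queue_size=1)
rospy.spin()
```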

I do appreciate that when it does see me it thinks I'm younger than I actually am, but I also seem to have resting emotion_angry face. ;-)

Just wondering if most folks are using the stock face detection from qt_nuitrack, or if there's another package that works well out of the box.

RMichaelSwan commented 3 years ago

@andrewpbstout there are a bunch of neural networks and algorithms for face recognition/detection and emotion detection, though it's definitely a challenge to find ones that run efficiently when CPU-bound (as many rely on a GPU). Due to Nuitrack's limitations, my group decided to roll our own solution using open-source libraries. The one we settled on for face detection (only) was dlib, which runs quite efficiently and doesn't have the distance problem. We also tried FaceNet, though I don't recall the results there (I believe it's a little slower, but more robust).
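For anyone curious, the basic shape of a dlib-based detector loop looks like this. This is just an illustration of the approach, not our actual HARMONI code, and the webcam capture is a stand-in for whatever image source you use:

```python
#!/usr/bin/env python
# Illustration of a dlib HOG frontal-face detector loop; it runs on CPU and
# does not have the close-range limitation. Not the HARMONI implementation.
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
cap = cv2.VideoCapture(0)  # stand-in for your actual image source

while True:
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    # Second argument = number of upsampling passes; 0 keeps it fast on CPU.
    for r in detector(rgb, 0):
        cv2.rectangle(frame, (r.left(), r.top()), (r.right(), r.bottom()),
                      (0, 255, 0), 2)
    cv2.imshow("faces", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```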

Our focus was on face detection so that QT's head/eyes could track people, so we haven't gotten around to deploying a facial gesture/emotion detection system yet. We have used OpenFace to process some video in the past. One challenge is that a lot of these solutions do not make use of depth data, which one would assume should improve accuracy. I have some hope that TensorFlow 3D may encourage development of such solutions in the near future.

In any case, our code is open source, it runs on QTrobot, and we are in the process of publishing a paper about it at the next ICRA. We opted for Python 3, which is a little harder to set up initially on ROS Kinetic, but it works. We also put together some Docker containers which can get you started without much setup. Here's our dlib detector and our codebase (feel free to make use of it/contribute!): https://github.com/interaction-lab/HARMONI

andrewpbstout commented 3 years ago

@RMichaelSwan Thanks for the reply! HARMONI sounds like it could be really useful--I'm eager to check it out in more detail, although I might have limited bandwidth to do so in the short term. (I wish it had existed/I had known about it a year and a half ago!) I'd love to read your paper.

RMichaelSwan commented 3 years ago

Heh, we started development on it about a year and a half ago, so that might have been difficult ;). I'll double check with the other authors, but it should be okay to share the paper at this point.

In the near term, if you just want to get something going quickly, you could probably tear out the detector package we have and shift some code around to work with whatever image source you have, though I would have to make a biased recommendation to adopt HARMONI entirely instead :)