This is a WebRTC, WebSockets and OpenCV experiment developed during an Athega hackday.
Frames are captured from the web camera via WebRTC and sent to the server over WebSockets. On the server the frames are processed with OpenCV and a JSON response is sent back to the client.
Sample JSON response:
{
  "face": {
    "distance": 428.53381034802453,
    "coords": {
      "y": "39",
      "x": "121",
      "height": "137",
      "width": "137"
    },
    "name": "mike"
  }
}
Everything except distance is pretty self-explanatory.
name is the predicted name of the person in front of the camera.
coords is where the face was found in the image.
distance is a measure of how accurate the prediction is; lower is better.
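As a rough illustration of how a consumer might read this response, here is a minimal Python sketch. The function name `parse_face` is hypothetical, and the handling of a response without a "face" key is an assumption, since the sample only shows the case where a face was found. Note that the coords values arrive as strings in the sample, so they are converted to integers here.

```python
import json

# The sample response from above, as a single JSON string.
SAMPLE = (
    '{"face": {"distance": 428.53381034802453, '
    '"coords": {"y": "39", "x": "121", "height": "137", "width": "137"}, '
    '"name": "mike"}}'
)


def parse_face(message):
    """Return (name, distance, (x, y, width, height)), or None if no face.

    Assumption: a response without a "face" key means no face was detected.
    """
    face = json.loads(message).get("face")
    if not face:
        return None
    coords = face["coords"]
    # coords are strings in the sample response, so convert them to ints.
    box = tuple(int(coords[key]) for key in ("x", "y", "width", "height"))
    return face["name"], float(face["distance"]), box


if __name__ == "__main__":
    print(parse_face(SAMPLE))
```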
If we can't get a reliable prediction (10 consecutive frames that each contain a face with a distance lower than 1000), we switch over to training mode. In training mode we capture 10 images and send them, together with a name, back to the server for retraining. Once training has completed we switch back to recognition mode and will hopefully get a more accurate result.
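The reliability rule can be sketched roughly as follows. This is not the project's client code, only an illustration in Python: the streak length (10) and the distance threshold (1000) come from the text, while the class name `ReliabilityTracker` and the `patience` parameter (how many frames to wait before giving up and switching to training mode) are assumptions made for the example.

```python
REQUIRED_STREAK = 10   # from the text: 10 consecutive frames
MAX_DISTANCE = 1000    # from the text: distance must be lower than 1000


class ReliabilityTracker:
    """Track consecutive good frames and decide when to switch modes."""

    def __init__(self, patience=50):
        # patience is a hypothetical frame budget; the text does not say
        # how long the client waits before switching to training mode.
        self.patience = patience
        self.streak = 0
        self.seen = 0

    def update(self, face):
        """Feed one response's "face" dict (or None if no face was found).

        Returns "recognized", "train" or "waiting".
        """
        self.seen += 1
        if face is not None and face["distance"] < MAX_DISTANCE:
            self.streak += 1
        else:
            # A missing face or a poor match breaks the streak.
            self.streak = 0
        if self.streak >= REQUIRED_STREAK:
            return "recognized"
        if self.seen >= self.patience:
            return "train"
        return "waiting"
```

A caller would feed every server response into `update` and start capturing training images once it returns "train".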
Make sure the dependencies are met.
Create the database by running the following in the data folder:
sqlite3 images.db < ../db/create_db.sql
Download the AT&T face database and extract it to data/images before the server is started. This is needed to build the initial prediction model.
cd data
wget http://www.cl.cam.ac.uk/Research/DTG/attarchive/pub/data/att_faces.tar.Z
tar zxvf att_faces.tar.Z
mv att_faces images
Copy haarcascade_frontalface_alt.xml from <path to opencv source>/data/haarcascades/ to the data folder.
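Since the server depends on several files being in place, a small pre-flight check can help catch a missed step. The sketch below is not part of the project; the function name `missing_setup_files` is hypothetical, and it only checks that the three items named in the steps above exist inside the data folder.

```python
from pathlib import Path

# Items the setup steps above place in the data folder.
REQUIRED = ("images.db", "images", "haarcascade_frontalface_alt.xml")


def missing_setup_files(data_dir="data"):
    """Return the names of required files or folders that are missing."""
    base = Path(data_dir)
    return [name for name in REQUIRED if not (base / name).exists()]


if __name__ == "__main__":
    missing = missing_setup_files()
    if missing:
        print("Missing from data folder:", ", ".join(missing))
    else:
        print("Setup looks complete.")
```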
Run with python server.py and browse to http://localhost:8888 once the model has been trained.