Consider newer SSD Mobilenet model versions

jimmyadaro commented 4 years ago

I see the current version only supports ssdMobilenetv1, but it seems v2 is already available.

How hard would it be to use the newer version instead? Also, what's needed to make it work in this project as a new model?

Edit: I've seen some v3 (here) but I don't know how stable it is.

justadudewhohacks commented 4 years ago

Over the last couple of months I have been working hard on bringing better and faster face detectors to face-api.js. In fact, SSD is no longer the best method for face detection; FPNs (feature pyramid networks) achieve state-of-the-art results nowadays.

I am trying out different backbones for that purpose, and mobilenetv3 is one of them. I have not yet decided which backbone to use and still have to do some evaluation.

I do not want to commit to an ETA yet, but the latest models I have come up with are much more lightweight and already achieve higher accuracy with far fewer parameters than the SSDMobilenetV1 currently provided by face-api.js. I want to make sure that the new models are as close to state-of-the-art performance as possible while being as small and lightweight as possible.

So the new model (or models) I will be releasing will deprecate the currently provided face detection models. That's one of the reasons I do not want to rush things.

jimmyadaro commented 4 years ago

That's awesome, thanks for your response!

Infinitay commented 4 years ago

Any updates on this? No rush of course, just curious.

jimmyadaro commented 4 years ago

This was tagged with "solution provided" but there is still no news. Is there anything we can help with, @justadudewhohacks?

no-1ne commented 4 years ago

https://github.com/tensorflow/tfjs-models/tree/master/blazeface is also another alternative

adityapatadia commented 4 years ago

One reason to add MobilenetV2 is that a face detection model trained on Open Images V4 is available here: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md

It could increase the accuracy of face detection thanks to the larger dataset.

adityapatadia commented 4 years ago

@justadudewhohacks I mentioned the above because, in my experience, models trained on Open Images V4 have much better accuracy and are more robust. The only hurdle to using them is this library's lack of support for MobileNetV2.

adityapatadia commented 4 years ago

The model I am talking about is this one: http://download.tensorflow.org/models/object_detection/facessd_mobilenet_v2_quantized_320x320_open_image_v4.tar.gz

justadudewhohacks commented 4 years ago

> Any updates on this? No rush of course, just curious.

Still working on this. I finally have first versions of some models that I am quite satisfied with. But a lot of work still has to be done, since I want these models not only to detect faces but also to regress 5-point facial landmarks at the same time for face alignment.

Unfortunately, training such a model until convergence takes one to two weeks or more, which slows down the entire process.
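For readers unfamiliar with why 5-point landmarks help with face alignment, here is a minimal sketch of one common use: rotating the landmarks so that the eyes sit on a horizontal line. This is not the training or alignment code of face-api.js. The `Point` type, the function names, and the landmark order (left eye first, right eye second) are assumptions made for illustration.

```typescript
// Hypothetical types and helpers, for illustration only.
interface Point {
  x: number;
  y: number;
}

// Angle (in radians) of the line from the left eye to the right eye.
function rollAngle(leftEye: Point, rightEye: Point): number {
  return Math.atan2(rightEye.y - leftEye.y, rightEye.x - leftEye.x);
}

// Rotate point p around center by the given angle.
function rotateAround(p: Point, center: Point, angle: number): Point {
  const cos = Math.cos(angle);
  const sin = Math.sin(angle);
  const dx = p.x - center.x;
  const dy = p.y - center.y;
  return {
    x: center.x + dx * cos - dy * sin,
    y: center.y + dx * sin + dy * cos,
  };
}

// Assumption: landmarks[0] is the left eye and landmarks[1] is the right eye.
// Returns the landmarks rotated so that both eyes share the same y coordinate.
function alignLandmarks(landmarks: Point[]): Point[] {
  if (landmarks.length < 2) {
    throw new Error("need at least the two eye landmarks");
  }
  const leftEye = landmarks[0];
  const rightEye = landmarks[1];
  const center: Point = {
    x: (leftEye.x + rightEye.x) / 2,
    y: (leftEye.y + rightEye.y) / 2,
  };
  // Rotate by the negative roll angle to undo the head tilt.
  const angle = -rollAngle(leftEye, rightEye);
  return landmarks.map((p) => rotateAround(p, center, angle));
}
```

In practice the same rotation would be applied to the image crop before it is passed to a recognition network; the sketch only transforms the points.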

justadudewhohacks commented 4 years ago

> https://github.com/tensorflow/tfjs-models/tree/master/blazeface is also another alternative

@startupgurukul Yes, I recently saw this one. It seems quite good at first glance, but it ran very laggy on my phone, and I couldn't find any detailed evaluation of its mAP / AP compared to state-of-the-art methods.
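As background on the AP metric mentioned above, here is a minimal sketch of how average precision can be computed for a face detector on a single image. It uses the simple non-interpolated form of AP with greedy matching at an IoU threshold; benchmarks typically use interpolated variants and aggregate over a whole dataset. All type and function names here are hypothetical.

```typescript
// Hypothetical types, for illustration only.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface Detection {
  box: Box;
  score: number;
}

// Intersection over union of two axis-aligned boxes.
function iou(a: Box, b: Box): number {
  const x1 = Math.max(a.x, b.x);
  const y1 = Math.max(a.y, b.y);
  const x2 = Math.min(a.x + a.width, b.x + b.width);
  const y2 = Math.min(a.y + a.height, b.y + b.height);
  const inter = Math.max(0, x2 - x1) * Math.max(0, y2 - y1);
  const union = a.width * a.height + b.width * b.height - inter;
  return union > 0 ? inter / union : 0;
}

// Non-interpolated average precision: the mean of the precision values
// measured at each true positive, divided by the number of ground-truth boxes.
function averagePrecision(
  detections: Detection[],
  truths: Box[],
  iouThreshold: number = 0.5
): number {
  if (truths.length === 0) return 0;
  // Visit detections from most to least confident.
  const sorted = detections.slice().sort((p, q) => q.score - p.score);
  const matched: boolean[] = truths.map(() => false);
  let truePositives = 0;
  let precisionSum = 0;
  sorted.forEach((det, rank) => {
    // Find the best still-unmatched ground-truth box for this detection.
    let best = -1;
    let bestIou = iouThreshold;
    truths.forEach((truth, i) => {
      const overlap = iou(det.box, truth);
      if (!matched[i] && overlap >= bestIou) {
        best = i;
        bestIou = overlap;
      }
    });
    if (best >= 0) {
      matched[best] = true;
      truePositives += 1;
      precisionSum += truePositives / (rank + 1);
    }
  });
  return precisionSum / truths.length;
}
```

A comparison between detectors would run this kind of computation over the same annotated dataset for each model, which is the evaluation the comment above says was missing.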

justadudewhohacks commented 4 years ago

> The model I am talking about is this one: http://download.tensorflow.org/models/object_detection/facessd_mobilenet_v2_quantized_320x320_open_image_v4.tar.gz

This one is 125 MB in size, so it is very unlikely that anyone will use it in a web application.

Infinitay commented 4 years ago

> > The model I am talking about is this one: http://download.tensorflow.org/models/object_detection/facessd_mobilenet_v2_quantized_320x320_open_image_v4.tar.gz
>
> This one is 125mb in size, so very unlikely that someone will use this in a web application.

Out of curiosity, if we wanted to use such a model, how would we go about doing so? I'm not well versed in ML and facial recognition, but with the help of your API I have gotten by and learned a bit by doing some other research where appropriate. Currently, I am just testing here and there, focusing on the accuracy of facial recognition and working primarily with a predominantly Asian facial data set.

I was reading some other issues and, if I'm remembering correctly, you need to quantize the respective model (for example a NasNet or mobilenet_v3 model) in order to get the shards to use with face-api. That being said, I came across your inflatable-unicorns repo, which has a project for quantizing. However, no instructions are provided, so I'm having difficulty running it, and an error is being thrown.

I never posted an issue in the other repo because I wasn't sure whether you were providing guidance on it, since you decided to move it from this repo into one of its own. However, if you're willing to assist me, I'll create the issue.
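To illustrate what "quantizing" a model generally means in this context, here is a minimal sketch of linear 8-bit weight quantization: each float weight is mapped to one byte plus a shared `min` and `scale`, which cuts the stored size to roughly a quarter. This shows the general idea only; it is not the code of the quantization project mentioned above, and the names `QuantizedWeights`, `quantize`, and `dequantize` are hypothetical.

```typescript
// Hypothetical container for quantized weights, for illustration only.
interface QuantizedWeights {
  data: Uint8Array; // one byte per weight instead of four
  min: number; // smallest original weight
  scale: number; // step size between adjacent uint8 levels
}

// Map float weights onto the 256 levels between their min and max.
function quantize(weights: Float32Array): QuantizedWeights {
  if (weights.length === 0) {
    return { data: new Uint8Array(0), min: 0, scale: 1 };
  }
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < weights.length; i++) {
    if (weights[i] < min) min = weights[i];
    if (weights[i] > max) max = weights[i];
  }
  // Guard against a constant tensor, where max equals min.
  const scale = max > min ? (max - min) / 255 : 1;
  const data = new Uint8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    data[i] = Math.round((weights[i] - min) / scale);
  }
  return { data, min, scale };
}

// Recover approximate float weights at load time.
function dequantize(q: QuantizedWeights): Float32Array {
  const out = new Float32Array(q.data.length);
  for (let i = 0; i < q.data.length; i++) {
    out[i] = q.data[i] * q.scale + q.min;
  }
  return out;
}
```

The round trip is lossy: each restored weight can differ from the original by up to about half of `scale`, which is the trade-off for the smaller download.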

no-1ne commented 4 years ago

Give blazeface a try; it now has an option to just detect faces without wasting CPU cycles on landmarks.

PS: with eyes closed, the landmarks model isn't doing great on Indian faces. Google may release a facemesh model soon: https://sites.google.com/view/perception-cv4arvr/facemesh

justadudewhohacks / face-api.js

Consider newer SSD Mobilenet model versions #487