Closed kylemcdonald closed 6 years ago
Hi Kyle,
I am not sure it is trivial to port the dlib 5 point face landmark model to tfjs. As far as I know it is not CNN based (correct me if I am wrong). But in theory it should be enough to come up with a simple CNN for 5 point face landmarks, like the 68 point face landmark CNN, just with fewer conv layers / conv params, to reduce the overall size.
Regarding the quantization, I am already working on it. The quantized 68 point landmark model is 7MB. The face detection and recognition weights will be ~5MB each. For running the full pipeline, this will be ~17MB.
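For readers unfamiliar with the idea, here is a minimal sketch of 8-bit weight quantization, assuming a simple per-tensor min/max scheme. The names `QuantizedWeights`, `quantize` and `dequantize` are hypothetical and are not taken from this library or from tfjs; the scheme actually used for the models above may differ.

```typescript
// Hypothetical types and helpers for illustration only.
interface QuantizedWeights {
  data: Uint8Array; // one byte per weight instead of four
  scale: number;    // step size between adjacent quantization levels
  min: number;      // value represented by level 0
}

// Map float32 weights onto 256 evenly spaced levels between min and max.
function quantize(weights: Float32Array): QuantizedWeights {
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < weights.length; i++) {
    if (weights[i] < min) min = weights[i];
    if (weights[i] > max) max = weights[i];
  }
  // Guard against a constant (or empty) tensor, where max <= min.
  const scale = max > min ? (max - min) / 255 : 1;
  const data = new Uint8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    data[i] = Math.round((weights[i] - min) / scale);
  }
  return { data, scale, min };
}

// Recover approximate float32 weights at load time.
function dequantize(q: QuantizedWeights): Float32Array {
  const out = new Float32Array(q.data.length);
  for (let i = 0; i < q.data.length; i++) {
    out[i] = q.data[i] * q.scale + q.min;
  }
  return out;
}
```

Storing one byte per weight instead of four is where the size reduction comes from; the cost is a rounding error of at most half a quantization step per weight.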
If everything works out, I will get this done today.
Sorry, you're right about the 5 point detector being non-CNN. I missed this.
I suppose in theory it would be possible to prune the last layer of the 68-landmark model to only produce 5 outputs instead of 68, but without knowing the exact architecture of the 68-landmark model it's hard to say how much that would actually save.
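As a rough sketch of what that pruning could look like: the code below assumes the final layer is fully connected and emits an (x, y) pair per landmark in adjacent output units. That layout, the `DenseLayer` type and the `pruneOutputs` helper are all assumptions for illustration, not the actual architecture of the 68-landmark model.

```typescript
// Hypothetical representation of a fully connected output layer.
interface DenseLayer {
  weights: Float32Array; // row-major, shape [inFeatures, outUnits]
  bias: Float32Array;    // shape [outUnits]
  inFeatures: number;
  outUnits: number;
}

// Keep only the output units belonging to the selected landmarks.
// Assumes landmark k occupies output units 2k (x) and 2k + 1 (y).
function pruneOutputs(layer: DenseLayer, keepLandmarks: number[]): DenseLayer {
  const keepUnits: number[] = [];
  for (let i = 0; i < keepLandmarks.length; i++) {
    keepUnits.push(2 * keepLandmarks[i], 2 * keepLandmarks[i] + 1);
  }
  const outUnits = keepUnits.length;
  const weights = new Float32Array(layer.inFeatures * outUnits);
  const bias = new Float32Array(outUnits);
  for (let j = 0; j < outUnits; j++) {
    bias[j] = layer.bias[keepUnits[j]];
    // Copy the column of the weight matrix that feeds this output unit.
    for (let r = 0; r < layer.inFeatures; r++) {
      weights[r * outUnits + j] = layer.weights[r * layer.outUnits + keepUnits[j]];
    }
  }
  return { weights, bias, inFeatures: layer.inFeatures, outUnits };
}
```

Under these assumptions, going from 68 to 5 landmarks removes 126 of the 136 output units, which saves 126 × (inFeatures + 1) parameters in that layer only. All earlier layers are untouched, which is why the overall saving depends on the architecture.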
Super exciting to have the models be smaller! I have something I'm building now that relies on that. I'm going to close this issue since it was sort of misguided :)
It would be great to port the dlib 5-point landmark detector to this framework.
https://github.com/davisking/dlib-models/blob/master/shape_predictor_5_face_landmarks.dat.bz2
Right now, to perform the entire recognition pipeline on a new photo, it requires:
Or around 73MB total, making it a bit unwieldy for most public-facing applications.
The dlib 5-point landmark net is only 5MB, and just as useful for alignment as the 68-point landmark net. This would bring the total from 73MB down to 49MB. In theory, it might also be possible to use quantization, as described in #11, bringing the overall size closer to 12MB.