pntt3011 / mediapipe_face_iris_cpp

Real-time Face and Iris Landmarks Detection using C++
GNU General Public License v3.0

About the ROI #18

Open liangzhenghao2000 opened 2 years ago

liangzhenghao2000 commented 2 years ago

The landmarks detected in this project are sometimes inaccurate and jittery. After reading the code, I think this is because the ROI used here is inconsistent with the ROI cropped by MediaPipe's original processing pipeline. In addition to finding the face area, I think it is necessary to apply an affine transformation to that area before feeding it to the landmark model. Also, MediaPipe uses a landmark refinement model; is it possible to implement it in this project?
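
To make the affine-transform idea concrete, here is a minimal sketch of one way to build such a transform: rotate the ROI so that the line between the two eye keypoints becomes horizontal, then scale it to the landmark model's input size. The names `Point2f`, `eyeLineAngle`, `roiToCropMatrix` and `applyAffine` are hypothetical, and this is not taken from MediaPipe's source or from this project; the eye ordering (`eye_a` on the image-left, `eye_b` on the image-right) and the square ROI are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical 2D point type used only in this sketch.
struct Point2f { float x, y; };

// Angle (radians) of the line eye_a -> eye_b relative to the image x-axis.
// Assumption: eye_a is the eye on the image-left, eye_b on the image-right.
float eyeLineAngle(Point2f eye_a, Point2f eye_b) {
    return std::atan2(eye_b.y - eye_a.y, eye_b.x - eye_a.x);
}

// Row-major 2x3 matrix M mapping a source pixel (x, y) to a crop pixel (u, v):
//   [u, v]^T = M * [x, y, 1]^T
// A square ROI of side roi_size, centred at `center` and rotated by `angle`,
// is mapped to an upright out_size x out_size crop.
std::array<float, 6> roiToCropMatrix(Point2f center, float roi_size,
                                     float angle, int out_size) {
    assert(roi_size > 0.0f && out_size > 0);
    const float s = static_cast<float>(out_size) / roi_size;  // ROI -> crop scale
    const float a = s * std::cos(angle);
    const float b = s * std::sin(angle);
    const float half = 0.5f * static_cast<float>(out_size);
    // Rotate by -angle about the ROI centre, scale, then move the centre
    // to the middle of the crop.
    return std::array<float, 6>{{
         a, b, half - a * center.x - b * center.y,
        -b, a, half + b * center.x - a * center.y}};
}

// Apply the 2x3 matrix to a point.
Point2f applyAffine(const std::array<float, 6>& m, Point2f p) {
    return {m[0] * p.x + m[1] * p.y + m[2],
            m[3] * p.x + m[4] * p.y + m[5]};
}
```

With a matrix in this source-to-crop form, the ROI centre lands in the middle of the crop and both eyes end up on the same row, which is the property a tilted face needs before it reaches the landmark model. Landmarks predicted in crop coordinates would then have to be mapped back with the inverse of the same matrix.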

pntt3011 commented 2 years ago

Hi @liangzhenghao2000,

I appreciate your opinion about the cropped ROI. According to the face detection graph, I think the affine transform at the end projects the detections from [0, 1] back to the original image size. We should dig into their implementation to find out what they really do.
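
As a rough illustration of that projection, the sketch below converts a box in normalized [0, 1] coordinates to pixel coordinates of the original image. `NormBox`, `PixelBox` and `toPixels` are hypothetical names; the box layout (xmin, ymin, width, height) and the clamping to the image bounds are my assumptions, not something confirmed from their graph.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical box types used only in this sketch.
struct NormBox  { float xmin, ymin, width, height; };  // all values in [0, 1]
struct PixelBox { int x, y, w, h; };                   // pixels, original image

// Scale a normalized box to the original image size and clamp it to the image.
PixelBox toPixels(const NormBox& b, int img_w, int img_h) {
    assert(img_w > 0 && img_h > 0);
    auto clampTo = [](long v, int hi) {
        return static_cast<int>(std::max(0L, std::min(v, static_cast<long>(hi))));
    };
    const int x0 = clampTo(std::lround(b.xmin * img_w), img_w);
    const int y0 = clampTo(std::lround(b.ymin * img_h), img_h);
    const int x1 = clampTo(std::lround((b.xmin + b.width) * img_w), img_w);
    const int y1 = clampTo(std::lround((b.ymin + b.height) * img_h), img_h);
    return {x0, y0, x1 - x0, y1 - y0};
}
```

This is only valid as-is when the model input was produced by stretching the whole image; if the input was padded, the padding has to be removed first.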

Another thing I just found is that they resize the input images while keeping the aspect ratio and pad the shorter side with zeros, whereas I resize them directly.
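
For reference, an aspect-preserving resize with zero padding could be computed roughly as follows. This is a sketch under my own assumptions: `Letterbox`, `Vec2`, `computeLetterbox` and `inputToSource` are hypothetical names, and I assume the padding is split evenly so the resized image sits in the centre of the model input.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical description of an aspect-preserving resize with padding.
struct Letterbox {
    float scale;   // factor applied to the source image
    int new_w;     // width of the resized image inside the model input
    int new_h;     // height of the resized image inside the model input
    int pad_left;  // zero-filled columns to the left of the resized image
    int pad_top;   // zero-filled rows above the resized image
};

struct Vec2 { float x, y; };

// Fit a src_w x src_h image into dst_w x dst_h without distortion.
// Assumption: the resized image is centred and the remainder is padding.
Letterbox computeLetterbox(int src_w, int src_h, int dst_w, int dst_h) {
    assert(src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);
    const float scale = std::min(static_cast<float>(dst_w) / src_w,
                                 static_cast<float>(dst_h) / src_h);
    const int new_w = std::min(dst_w, static_cast<int>(std::lround(src_w * scale)));
    const int new_h = std::min(dst_h, static_cast<int>(std::lround(src_h * scale)));
    return {scale, new_w, new_h, (dst_w - new_w) / 2, (dst_h - new_h) / 2};
}

// Map a point from model-input pixels back to source-image pixels
// by removing the padding offset and undoing the scale.
Vec2 inputToSource(const Letterbox& lb, float x, float y) {
    return {(x - lb.pad_left) / lb.scale, (y - lb.pad_top) / lb.scale};
}
```

For a 640x480 frame and a 128x128 model input, this gives a 128x96 resized image with 16 rows of padding above and below, so detections must have that offset removed before being scaled back to the frame.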

About the landmark attention model, it is also mentioned in #12 that we must customize the TensorFlow Lite library and recompile it.

However, I did this project last year as a hobby and didn't expect it to get this much attention. I'm sorry that I cannot help you further.

liangzhenghao2000 commented 2 years ago

Thank you for your reply! If I have time I will go through their papers to confirm these.