zhouyuchong / face-recognition-deepstream

DeepStream app that uses RetinaFace and ArcFace for face recognition.
MIT License

How to do alignment if we use retinaface on SGIE #18

Open daothanh2011 opened 1 year ago

daothanh2011 commented 1 year ago

Thank you for sharing your code. I would like to use a different detector as the PGIE. The PGIE detects faces and passes them to an SGIE that runs the RetinaFace network. However, when I draw the RetinaFace output, the bounding box and landmarks are in the wrong place in the image. Do you know how to solve this? My pipeline is PGIE (face, human, car, ...) -> SGIE (only face objects are passed to the RetinaFace network) -> alignment -> recognition. Thank you in advance!
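One possible cause, offered as an assumption rather than a diagnosis: when a detector runs as an SGIE, its outputs are expressed in the coordinate system of the cropped, resized object rather than the full frame, so they need to be mapped back through the parent object's bounding box before drawing. The sketch below shows that mapping. The `Point` and `Rect` types and both function names are hypothetical, and it assumes the crop was stretched to the network size without aspect-ratio padding.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper types for illustration; not DeepStream or repo types.
struct Point { float x, y; };
struct Rect  { float left, top, width, height; };

// Map one point from SGIE network-input coordinates (netW x netH) to
// full-frame coordinates, given the parent object's bbox in the frame.
// Assumes the object crop was stretched to netW x netH with no padding.
Point toFrameCoords(Point p, const Rect& obj, float netW, float netH) {
    assert(netW > 0.f && netH > 0.f);
    return { obj.left + p.x * obj.width  / netW,
             obj.top  + p.y * obj.height / netH };
}

// Apply the same mapping to the five facial landmarks.
std::array<Point, 5> landmarksToFrame(const std::array<Point, 5>& lm,
                                      const Rect& obj,
                                      float netW, float netH) {
    std::array<Point, 5> out{};
    for (std::size_t i = 0; i < lm.size(); ++i) {
        out[i] = toFrameCoords(lm[i], obj, netW, netH);
    }
    return out;
}
```

If the SGIE is configured to preserve aspect ratio, the scale and offset would also have to account for the padding, which this sketch does not do.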

zhouyuchong commented 1 year ago

@daothanh2011 I haven't run into that problem; the bboxes were fine for me. However, I did hit other problems when running on an SGIE what I normally run on the PGIE, including getting wrong landmarks for the same object.

I've been busy, but I'll try to fix these in the next few days and will update the code once that's done.

daothanh2011 commented 1 year ago

Thank you for your response. I got that part working. However, I am struggling with the alignment step. All the facial landmarks are stored in user_meta. Can I use this repo for alignment: https://github.com/zhouyuchong/gst-nvinfer-custom ? Thank you!

zhouyuchong commented 1 year ago

@daothanh2011 Yes, you can give it a try. But the custom nvinfer you mentioned only works on a PGIE. The problem is that landmarks generated by an SGIE are stored in object user-meta, which is troublesome to retrieve. When I tried it, I sometimes got wrong data from the object user-meta, whereas the user-meta generated by the PGIE, which is stored in frame user-meta, was always right. I'll try to fix this soon.

daothanh2011 commented 1 year ago

Thank you for your help. After working with your code, may I ask two questions? 1: In tensor_extractor.cpp, within the function void Extractor::Impl::facelmks, you call the following functions to decode the RetinaFace output:

        float *output = (float*)(outputLayersInfo[0].buffer);

        std::vector<FaceInfo> temp;
        decode_bbox_retina_face(temp, output, CONF_THRESH, FACE_NETWIDTH, FACE_NETHEIGHT);
        nms_and_adapt(temp, res, NMS_THRESH, FACE_NETWIDTH, FACE_NETHEIGHT);

However, *output points to the output of the ArcFace net, since facelmks is called when alignment = 1 and that option is set in config_arcface.txt. Could you please explain how you handle the RetinaFace output here? 2: Can we use nvds_add_user_meta_to_frame to solve the object user-meta problem? Thank you in advance!

zhouyuchong commented 1 year ago

I'm terribly sorry that I missed your message for so long :cry:. I hope you have already solved these problems. For anyone reading this issue: For Q1, this code preprocesses the input to ArcFace; it does not handle ArcFace's output. The 'gst-nvinfer' code handles preprocessing and the 'nvdsinfer' code handles postprocessing. For Q2, of course you can do that. I'm not doing it because I ran into problems adding content to the metadata from Python: I couldn't retrieve the same data downstream. The C++ side, however, can modify the data correctly. I haven't dug deeper because I've been busy with work.
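For readers who want to see what the alignment step amounts to, here is a minimal sketch, not the code from this repo: it fits a least-squares similarity transform (rotation, uniform scale, translation) that maps the five detected landmarks onto a reference template. The type and function names are hypothetical. The template values are the 112x112 reference points commonly used with ArcFace; whether this repo uses the same values is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration; not taken from the repo.
struct Point2f { float x, y; };

// Similarity transform: x' = a*x - b*y + tx,  y' = b*x + a*y + ty
struct Similarity { float a, b, tx, ty; };

// Commonly used 112x112 ArcFace reference landmarks (assumed here):
// left eye, right eye, nose tip, left mouth corner, right mouth corner.
static const std::array<Point2f, 5> kArcFaceTemplate = {{
    {38.2946f, 51.6963f}, {73.5318f, 51.5014f}, {56.0252f, 71.7366f},
    {41.5493f, 92.3655f}, {70.7299f, 92.2041f}
}};

// Closed-form least-squares fit of a similarity transform mapping src to dst.
Similarity estimateSimilarity(const std::array<Point2f, 5>& src,
                              const std::array<Point2f, 5>& dst) {
    const double n = static_cast<double>(src.size());
    double msx = 0, msy = 0, mdx = 0, mdy = 0;
    for (std::size_t i = 0; i < src.size(); ++i) {
        msx += src[i].x; msy += src[i].y;
        mdx += dst[i].x; mdy += dst[i].y;
    }
    msx /= n; msy /= n; mdx /= n; mdy /= n;

    double numA = 0, numB = 0, den = 0;
    for (std::size_t i = 0; i < src.size(); ++i) {
        const double sx = src[i].x - msx, sy = src[i].y - msy;
        const double dx = dst[i].x - mdx, dy = dst[i].y - mdy;
        numA += sx * dx + sy * dy;   // drives the scaled cosine term
        numB += sx * dy - sy * dx;   // drives the scaled sine term
        den  += sx * sx + sy * sy;
    }
    assert(den > 0.0);  // landmarks must not all coincide

    const double a = numA / den, b = numB / den;
    return { static_cast<float>(a), static_cast<float>(b),
             static_cast<float>(mdx - (a * msx - b * msy)),
             static_cast<float>(mdy - (b * msx + a * msy)) };
}

// Apply the transform to a single point.
Point2f applySimilarity(const Similarity& t, Point2f p) {
    return { t.a * p.x - t.b * p.y + t.tx,
             t.b * p.x + t.a * p.y + t.ty };
}
```

The resulting 2x3 matrix `[a, -b, tx; b, a, ty]` can then be handed to an image-warping routine (for example OpenCV's `cv::warpAffine` with a 112x112 output size) to produce the aligned face crop.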