Very weird result when inference using NCNN

ShiqiYu / libfacedetection

An open source library for face detection in images. The face detection speed can reach 1000FPS.

Very weird result when inference using NCNN #335

Closed ithmz closed 2 years ago

ithmz commented 2 years ago

I successfully converted the model to NCNN and wrote the inference code. However, the conf outputs for class 0 (non-face?) and class 1 (face?) are very weird. The class score I'm using is on this line. The result is that the class0 score is almost 0.99 and the class1 score is 0.01, even when there is a face in the video. The model I'm using is face_detection_yunet_2022mar.onnx.

fengyuentau commented 2 years ago

@Wwupup Please take a look at this issue.

fengyuentau commented 2 years ago

However, the conf outputs for class 0 (non-face?) and class 1 (face?) are very weird.

Yes, class 0 is non-face and class 1 is face. What is the weird thing?

The class score I'm using is on this line

To get the final score, you need to do some more calculation as follows: https://github.com/ShiqiYu/libfacedetection/blob/2d02619c44d1877879f6699ca565c1737c17b3f6/src/facedetectcnn.cpp#L766-L776

The result is that the class0 score is almost 0.99 and the class1 score is 0.01, even when there is a face in the video.

That means the face is not detected. Can you share your example?

The model I'm using is face_detection_yunet_2022mar.onnx

This model appears to be from OpenCV Zoo, which hosts an older version than the one used in this repo. We are planning to update the model in OpenCV Zoo.

ithmz commented 2 years ago

That means the face is not detected. Can you share your example?

Do you mean the code or the sample picture? (I got the picture from Google; it contains at least one face.) The weird thing is not the

However, the conf outputs for class 0 (non-face?) and class 1 (face?) are very weird.

I mean that, first, the class1 score must be high enough, right? Then we multiply it by iou_score to get the conf score. But the class1_score I obtain is always 0.1. I mostly adapted the code from the OpenCV GitHub for:

fengyuentau commented 2 years ago

I mean that, first, the class1 score must be high enough, right? Then we multiply it by iou_score to get the conf score.

No, the calculation should be done first, before you judge whether the bbox is a face. We set a threshold on the conf at line 776 to decide whether there is a face in the bbox, and the threshold should be adjusted for your scene.
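
To illustrate the order of operations described above (combine the scores first, then threshold), here is a minimal sketch. It assumes the final confidence is the geometric mean of the class-1 score and the IoU score, with both clamped to [0, 1]; the thread itself only says the two are multiplied, so check the linked lines in facedetectcnn.cpp for the exact formula. The function names and the default threshold of 0.6 are hypothetical.

```python
import math


def final_confidence(cls_score: float, iou_score: float) -> float:
    """Combine the raw face (class 1) score with the IoU score.

    Assumption: conf = sqrt(cls * iou) with both inputs clamped to [0, 1].
    Verify against the referenced source before relying on this.
    """
    cls = min(max(cls_score, 0.0), 1.0)
    iou = min(max(iou_score, 0.0), 1.0)
    return math.sqrt(cls * iou)


def is_face(cls_score: float, iou_score: float, threshold: float = 0.6) -> bool:
    """Threshold the combined confidence, not the raw class score.

    The default threshold is a placeholder; tune it for your scene.
    """
    return final_confidence(cls_score, iou_score) > threshold
```

The point of the sketch is that the decision is made on the combined value, so a raw class score should not be judged on its own before this step.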

ithmz commented 2 years ago

@fengyuentau may I ask what the preprocessing steps are before feeding the image to the network? I still get very low scores and conf. Below are my preprocessing steps:

Resize to 160x120 for the model input
??? Mean subtraction
??? Normalize

fengyuentau commented 2 years ago

There is no preprocessing for the network. If you are using OpenCV for inference, you can ignore the fixed input shape (OpenCV DNN allocates memory based on the actual input shape). Just make sure the input image has BGR channel order.
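
As a rough illustration of the point above, the sketch below swaps an interleaved 8-bit RGB buffer to BGR and leaves the pixel values untouched (no mean subtraction, no normalization). The helper name `rgb_to_bgr` is hypothetical, and the code assumes tightly packed 3-byte pixels; an inference library's own pixel-conversion utilities would normally do this for you.

```python
def rgb_to_bgr(pixels: bytes) -> bytes:
    """Swap the R and B channels of an interleaved 8-bit RGB buffer.

    Values stay in the 0..255 range: no mean subtraction or scaling is applied.
    """
    if len(pixels) % 3 != 0:
        raise ValueError("buffer length must be a multiple of 3")
    out = bytearray(pixels)
    out[0::3] = pixels[2::3]  # B goes into the first slot of each pixel
    out[2::3] = pixels[0::3]  # R goes into the last slot of each pixel
    return bytes(out)
```

For example, a two-pixel buffer `bytes([1, 2, 3, 4, 5, 6])` comes back with the first and third byte of each pixel exchanged.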

By the way, YuNet detects faces ranging from 10x10 to 300x300 pixels. Note that this is the size of the face, not of the whole image. YuNet is likely to fail on faces smaller than 10x10 or larger than 300x300.
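
One way to act on the size range above is to check the expected face size and shrink the image when faces would be too large. This is only a sketch: the 10 and 300 pixel bounds come from the comment above, while the helper names and the idea of rescaling by the longer face side are assumptions for illustration.

```python
MIN_FACE = 10   # smallest face side (pixels) mentioned in the thread
MAX_FACE = 300  # largest face side (pixels) mentioned in the thread


def face_in_range(face_w: int, face_h: int) -> bool:
    """Return True if the face's sides fall within the stated range."""
    return min(face_w, face_h) >= MIN_FACE and max(face_w, face_h) <= MAX_FACE


def downscale_factor(face_w: int, face_h: int) -> float:
    """Return a resize factor (<= 1.0) that brings the longer face side to at most MAX_FACE."""
    longest = max(face_w, face_h)
    if longest <= 0:
        raise ValueError("face size must be positive")
    return min(1.0, MAX_FACE / longest)
```

A face that is already within range gets a factor of 1.0, meaning no resize is needed. Faces below the lower bound are not handled here, since upscaling may not recover detail.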

ithmz commented 2 years ago

Just make sure the input image has BGR channel order.

This is really critical information, since most current networks expect RGB format LOL

Btw, I can now run inference successfully using the ncnn library.

Closed for now, thanks for your help!