Question about resolution settings causing inaccuracy

pntt3011 / mediapipe_face_iris_cpp

Real-time Face and Iris Landmarks Detection using C++

GNU General Public License v3.0


Question about resolution settings causing inaccuracy #8

Open Alvazz opened 2 years ago

Alvazz commented 2 years ago

I changed the capture resolution to 1280*720 and found that the detection was no longer accurate. Is this because of the model? If so, does it mean that I need to retrain the model?

In addition, the landmarks jitter even when I stay motionless in front of the camera. Is there a good solution for this? Many thanks!

pntt3011 commented 2 years ago

Can you show me the code you use to change the resolution?

Alvazz commented 2 years ago

    my::IrisLandmark irisLandmarker("./models");
    cv::VideoCapture cap(0);
    cap.set(cv::CAP_PROP_FRAME_WIDTH, 1920);
    cap.set(cv::CAP_PROP_FRAME_HEIGHT, 1080);

    bool success = cap.isOpened();
    if (success == false)
    {
        std::cerr << "Cannot open the camera." << std::endl;
        return 1;
    }

    #if SHOW_FPS
        float sum = 0;
        int count = 0;
    #endif
...

I just used basic OpenCV operations.

pntt3011 commented 2 years ago

I'm sorry that I cannot help you immediately because it's midnight in my country. I'll look into it tomorrow, but I can answer some of your questions right now.

You don't have to retrain the models. The input image is always resized during preprocessing, so its size does not matter. Maybe the problem lies in the postprocessing.
About the jitter, there is one part of my code that is not correct, although it is still sufficient to detect the landmarks: generateAnchors does not generate anchors at different scales and aspect ratios. You can see https://github.com/AdrianPeniak/mediapipe_face_iris_cpp/blob/main/src/DetectionPostProcess.cpp for more information (I'm not sure if this works).
I wonder whether OpenCV can really capture from your camera at a resolution higher than the camera supports. Do the captured frames actually have a size of 1920x1080? Can you try resizing the frames instead?
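
To illustrate the anchor point above, here is a minimal sketch of SSD-style anchor generation in which each grid cell receives more than one anchor. It assumes a BlazeFace-like configuration (128x128 input, four layers with strides 8, 16, 16, 16, two anchors per layer, fixed anchor size); these values and the names Anchor and generateAnchorsSketch are illustrative assumptions and are not taken from this repository or from the linked fork.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical anchor type: center and size, normalized to [0, 1].
struct Anchor {
    float xCenter, yCenter, width, height;
};

// Sketch of SSD-style anchor generation (assumed configuration, see text).
// Consecutive layers that share a stride are merged, and every merged layer
// contributes two anchors per grid cell.
std::vector<Anchor> generateAnchorsSketch(int inputSize,
                                          const std::vector<int>& strides) {
    std::vector<Anchor> anchors;
    std::size_t layer = 0;
    while (layer < strides.size()) {
        const int stride = strides[layer];
        int anchorsPerCell = 0;
        // Merge every consecutive layer that uses the same stride.
        while (layer < strides.size() && strides[layer] == stride) {
            anchorsPerCell += 2;
            ++layer;
        }
        const int gridSize = static_cast<int>(
            std::ceil(static_cast<float>(inputSize) / static_cast<float>(stride)));
        for (int y = 0; y < gridSize; ++y) {
            for (int x = 0; x < gridSize; ++x) {
                for (int a = 0; a < anchorsPerCell; ++a) {
                    // Fixed anchor size: width and height stay at 1.0.
                    anchors.push_back(Anchor{(x + 0.5f) / gridSize,
                                             (y + 0.5f) / gridSize,
                                             1.0f, 1.0f});
                }
            }
        }
    }
    return anchors;
}
```

Under these assumptions, calling generateAnchorsSketch(128, {8, 16, 16, 16}) yields a 16x16 grid with 2 anchors per cell followed by an 8x8 grid with 6 anchors per cell. The anchor count has to match the number of boxes the detection model outputs, so the configuration should be checked against the model actually in use.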

Alvazz commented 2 years ago

I'm sorry to disturb your rest, but I am very happy to see your answer. You are a kind-hearted person.

My camera supports 1920x1080, so OpenCV should be able to capture at that resolution correctly. The jitter seems to be a relatively big problem, which may be related to how the model was trained. However, I have tried running your exe program, and its CPU usage is not high. That is the one bright spot, and I would also like to know how you achieved it. Looking forward to your next reply.

GN.

pntt3011 commented 2 years ago

Hello,

I tried resizing the captured frame to 1920x1080 before passing it to the model, and it still worked as expected. Sorry, but I cannot reproduce your problem.
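
One way to see why the frame size should not matter: if the model reports landmarks normalized to the image it was given, the frame size only enters when the landmarks are scaled back to pixels. The sketch below works under that assumption; Point2f and toPixels are hypothetical names, not this repository's API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical 2D point type used only for this sketch.
struct Point2f {
    float x, y;
};

// Scale landmarks from normalized [0, 1] coordinates to pixel coordinates
// of the frame they will be drawn on. The frame size is a parameter, so
// nothing here depends on one particular resolution.
std::vector<Point2f> toPixels(const std::vector<Point2f>& normalized,
                              int frameWidth, int frameHeight) {
    std::vector<Point2f> pixels;
    pixels.reserve(normalized.size());
    for (const Point2f& p : normalized) {
        pixels.push_back(Point2f{p.x * static_cast<float>(frameWidth),
                                 p.y * static_cast<float>(frameHeight)});
    }
    return pixels;
}
```

A postprocessing step that hardcodes one frame size in place of frameWidth and frameHeight would look correct only at that size, which is one possible cause worth ruling out when accuracy changes with resolution.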
I replaced my generateAnchors with the forked version I mentioned, but the result did not improve. I searched for the MediaPipe Python API and found this video: https://www.youtube.com/watch?v=llh519i9qKU. Judging from it, the jitter seems to be the model's tradeoff between speed and accuracy. AFAIK, in object detection there is a technique named DeepSort that addresses this problem by tracking and refining the output bounding boxes. It is very complicated and written in Python (obviously), so porting it to C++ is not easy. Furthermore, it only adjusts the bounding boxes, not the individual landmarks.
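
Since DeepSort works on bounding boxes rather than landmarks, a much simpler option for landmark jitter is temporal smoothing. The sketch below uses a plain exponential moving average; it is not DeepSort and is not part of this repository. Landmark, LandmarkSmoother and alpha are hypothetical names, and the approach trades jitter for some lag when the face actually moves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical landmark type used only for this sketch.
struct Landmark {
    float x, y;
};

// Exponential moving average over per-frame landmarks.
// alpha close to 1 follows the raw detections (more jitter, less lag);
// alpha close to 0 smooths heavily (less jitter, more lag).
class LandmarkSmoother {
public:
    explicit LandmarkSmoother(float alpha) : alpha_(alpha) {}

    const std::vector<Landmark>& update(const std::vector<Landmark>& current) {
        // First frame, or the landmark count changed: restart from raw values.
        if (state_.size() != current.size()) {
            state_ = current;
            return state_;
        }
        for (std::size_t i = 0; i < current.size(); ++i) {
            state_[i].x = alpha_ * current[i].x + (1.0f - alpha_) * state_[i].x;
            state_[i].y = alpha_ * current[i].y + (1.0f - alpha_) * state_[i].y;
        }
        return state_;
    }

private:
    float alpha_;
    std::vector<Landmark> state_;
};
```

In a capture loop, one smoother would be kept alive across frames and fed the landmarks detected in each frame, with the returned values used for drawing. A suitable alpha has to be tuned by hand.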