google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

When using pose tracking, predicting on the same picture multiple times gives different pose landmarks #2359

Closed JunGenius closed 1 year ago

JunGenius commented 3 years ago

Thank you very much for the wonderful work! When I use pose tracking, I predict the same picture multiple times and get different pose landmarks.

    if (p_poller_landmarks->QueueSize() > 0) {
        if (p_poller_landmarks->Next(&packet_landmarks)) {
            auto& output_landmarks = packet_landmarks.Get<mediapipe::NormalizedLandmarkList>();
            for (int i = 0; i < output_landmarks.landmark_size(); ++i)
            {
                const mediapipe::NormalizedLandmark landmark = output_landmarks.landmark(i);
                std::cout << "x:" << landmark.x() << " y:" << landmark.y() << std::endl;
            }
        }
    }

Same picture, two different prediction results.

33 pose landmarks:

x:0.324666 y:0.347682 x:0.419835 y:0.283533 x:0.459635 y:0.291021 x:0.499592 y:0.30032 x:0.326453 y:0.266802 x:0.301516 y:0.262166 x:0.281618 y:0.257848 x:0.578031 y:0.366208 x:0.293469 y:0.29379 x:0.388569 y:0.446807 x:0.276522 y:0.41453 x:0.697199 y:0.840238 x:0.1561 y:0.768919 x:0.780646 y:1.37197 x:0.0953752 y:1.29545 x:0.625278 y:1.76305 x:-0.0248061 y:1.65511 x:0.637033 y:1.88142 x:-0.0890066 y:1.76292 x:0.568265 y:1.87002 x:-0.0822006 y:1.76251 x:0.553731 y:1.82062 x:-0.0421946 y:1.72347 x:0.528559 y:1.7534 x:0.0987969 y:1.72458 x:0.472638 y:2.51319 x:0.0575505 y:2.47445 x:0.45469 y:3.21837 x:0.0629282 y:3.17027 x:0.508538 y:3.35038 x:0.0672814 y:3.29799 x:0.269155 y:3.38339 x:-0.026809 y:3.32363

x:0.344031 y:0.350602 x:0.438186 y:0.283199 x:0.480703 y:0.289549 x:0.521106 y:0.298249 x:0.333024 y:0.266768 x:0.303104 y:0.262766 x:0.281168 y:0.259834 x:0.589903 y:0.361167 x:0.282967 y:0.30058 x:0.414488 y:0.449611 x:0.285201 y:0.418998 x:0.704223 y:0.854637 x:0.142287 y:0.787084 x:0.781208 y:1.40464 x:0.0557646 y:1.31328 x:0.667713 y:1.81893 x:-0.047729 y:1.66862 x:0.698233 y:1.9443 x:-0.110948 y:1.77626 x:0.619825 y:1.94222 x:-0.102178 y:1.78136 x:0.598549 y:1.89065 x:-0.0633313 y:1.74241 x:0.515913 y:1.79254 x:0.0761935 y:1.75831 x:0.455067 y:2.55794 x:0.0268911 y:2.51387 x:0.423811 y:3.25711 x:0.0214741 y:3.19766 x:0.474497 y:3.38978 x:0.0232446 y:3.32294 x:0.23447 y:3.42267 x:-0.0672187 y:3.3579

Any idea of what I could have done wrong? Thanks.
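To put a number on how far the two runs diverge, the landmark lists can be compared coordinate by coordinate. The following is only a minimal sketch: `Point2D`, `MaxAbsDifference` and `RunsMatch` are hypothetical names introduced for illustration and are not part of MediaPipe; the plain struct stands in for `mediapipe::NormalizedLandmark` so the snippet compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical plain struct standing in for mediapipe::NormalizedLandmark.
struct Point2D {
    double x;
    double y;
};

// Returns the largest absolute per-coordinate difference between two runs.
// Both runs must contain the same number of landmarks.
double MaxAbsDifference(const std::vector<Point2D>& a,
                        const std::vector<Point2D>& b) {
    assert(a.size() == b.size());
    double max_diff = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        max_diff = std::max(max_diff, std::abs(a[i].x - b[i].x));
        max_diff = std::max(max_diff, std::abs(a[i].y - b[i].y));
    }
    return max_diff;
}

// True when every coordinate of the two runs agrees within the tolerance.
bool RunsMatch(const std::vector<Point2D>& a,
               const std::vector<Point2D>& b,
               double tolerance) {
    return MaxAbsDifference(a, b) <= tolerance;
}
```

For the first landmark of the two runs above, the x values 0.324666 and 0.344031 differ by about 0.019, which is far larger than floating-point rounding noise.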

JunGenius commented 3 years ago

Hi @JunGenius, could you please provide the steps to reproduce the above output? It would help us investigate further if you could share your code changes. Thanks!

Hi @sgowroji,

        #include <cstdlib>
        #include <iostream>
        #include <memory>
        #include <string>
        #include "absl/flags/flag.h"
        #include "absl/flags/parse.h"
        #include "mediapipe/framework/calculator_framework.h"
        #include "mediapipe/framework/formats/image_frame.h"
        #include "mediapipe/framework/formats/image_frame_opencv.h"
        #include "mediapipe/framework/port/file_helpers.h"
        #include "mediapipe/framework/port/opencv_highgui_inc.h"
        #include "mediapipe/framework/port/opencv_imgproc_inc.h"
        #include "mediapipe/framework/port/opencv_video_inc.h"
        #include "mediapipe/framework/port/parse_text_proto.h"
        #include "mediapipe/framework/port/status.h"

        #include "mediapipe/framework/formats/detection.pb.h"
        #include "mediapipe/framework/formats/landmark.pb.h"
        #include "mediapipe/framework/formats/rect.pb.h"

        const char* kInputStream = "input_video";
        const char* kOutputStream = "output_video";
        const char* kWindowName = "MediaPipe";
        const char* kOutputLandmarks = "pose_landmarks";

        mediapipe::CalculatorGraph m_graph;
        std::unique_ptr<mediapipe::OutputStreamPoller> p_poller;
        std::unique_ptr<mediapipe::OutputStreamPoller> p_poller_landmarks;

        absl::Status InitGraph(const char* model_path) {
            std::string calculator_graph_config_contents;
            MP_RETURN_IF_ERROR(mediapipe::file::GetContents(model_path, &calculator_graph_config_contents));
            mediapipe::CalculatorGraphConfig config =
                mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig>(
                    calculator_graph_config_contents);

            MP_RETURN_IF_ERROR(m_graph.Initialize(config));

            // Return the error status instead of using assert(), which is compiled out in NDEBUG builds.
            auto sop = m_graph.AddOutputStreamPoller(kOutputStream);
            if (!sop.ok()) return sop.status();
            p_poller = std::make_unique<mediapipe::OutputStreamPoller>(std::move(sop.value()));

            mediapipe::StatusOrPoller sop_landmark = m_graph.AddOutputStreamPoller(kOutputLandmarks);
            if (!sop_landmark.ok()) return sop_landmark.status();
            p_poller_landmarks = std::make_unique<mediapipe::OutputStreamPoller>(std::move(sop_landmark.value()));

            MP_RETURN_IF_ERROR(m_graph.StartRun({}));

            return absl::OkStatus();
        }

        absl::Status RunMPPGraphImage(const char* image_path) {

            cv::Mat image = cv::imread(image_path);
            RET_CHECK(!image.empty());

            cv::Mat camera_frame;
            cv::cvtColor(image, camera_frame, cv::COLOR_BGR2RGB);
            cv::flip(camera_frame, camera_frame, /*flipcode=HORIZONTAL*/ 1);

            // Wrap Mat into an ImageFrame.
            auto input_frame = absl::make_unique<mediapipe::ImageFrame>(
                                   mediapipe::ImageFormat::SRGB, camera_frame.cols, camera_frame.rows,
                                   mediapipe::ImageFrame::kDefaultAlignmentBoundary);
            cv::Mat input_frame_mat = mediapipe::formats::MatView(input_frame.get());
            camera_frame.copyTo(input_frame_mat);

            // Send image packet into the graph.
            size_t frame_timestamp_us =
                (double)cv::getTickCount() / (double)cv::getTickFrequency() * 1e6;

            MP_RETURN_IF_ERROR(m_graph.AddPacketToInputStream(
                                   kInputStream, mediapipe::Adopt(input_frame.release())
                                   .At(mediapipe::Timestamp(frame_timestamp_us))));

            // Get the graph result packet, or stop if that fails.
            mediapipe::Packet packet;
            mediapipe::Packet packet_landmarks;

            RET_CHECK(p_poller->Next(&packet));
            if (p_poller_landmarks->QueueSize() > 0) {
                if (p_poller_landmarks->Next(&packet_landmarks)) {
                    auto& output_landmarks = packet_landmarks.Get<mediapipe::NormalizedLandmarkList>();
                    for (int i = 0; i < output_landmarks.landmark_size(); ++i)
                    {
                        const mediapipe::NormalizedLandmark landmark = output_landmarks.landmark(i);
                        std::cout << "x:" << landmark.x() << " y:" << landmark.y() << std::endl;
                    }
                }
            }
            return absl::OkStatus();
        }

        absl::Status ReleaseGraph() {
            // Only the input stream is closed here; pose_landmarks is an output stream.
            MP_RETURN_IF_ERROR(m_graph.CloseInputStream(kInputStream));
            return m_graph.WaitUntilDone();
        }

        int main(int argc, char** argv) {

            absl::Status run_status = InitGraph("model/pose_tracking_cpu.pbtxt");
            if (!run_status.ok()) {
                return EXIT_FAILURE;
            }

            for (int i = 0; i < 2; i++)
            {
                run_status = RunMPPGraphImage("image/1.jpg");
                if (!run_status.ok()) {
                    return EXIT_FAILURE;
                }
                std::cout << "======================" << std::endl;
            }

            ReleaseGraph();

            system("pause");

            return EXIT_SUCCESS;
        }

result:

  x:0.321211 y:0.343874
  x:0.422228 y:0.284329
  x:0.465305 y:0.293835
  x:0.50425 y:0.304102
  x:0.322694 y:0.261955
  x:0.293819 y:0.25417
  x:0.267306 y:0.248506
  x:0.577594 y:0.367131
  x:0.283517 y:0.29394
  x:0.389845 y:0.449747
  x:0.266147 y:0.416704
  x:0.698397 y:0.855428
  x:0.152487 y:0.797838
  x:0.79136 y:1.41509
  x:0.0763017 y:1.32081
  x:0.654148 y:1.8378
  x:-0.0566305 y:1.69091
  x:0.704054 y:1.99552
  x:-0.12685 y:1.80657
  x:0.619222 y:1.98658
  x:-0.122195 y:1.81059
  x:0.598778 y:1.93065
  x:-0.0817877 y:1.76993
  x:0.526345 y:1.84237
  x:0.0891398 y:1.81455
  x:0.483619 y:2.63554
  x:0.0423068 y:2.59472
  x:0.46509 y:3.36711
  x:0.050975 y:3.31028
  x:0.519211 y:3.50202
  x:0.0569551 y:3.44012
  x:0.272162 y:3.53662
  x:-0.0510269 y:3.46393
  ======================

  x:0.3328 y:0.347131
  x:0.427847 y:0.284678
  x:0.469708 y:0.293718
  x:0.5098 y:0.303687
  x:0.327688 y:0.265334
  x:0.298491 y:0.259655
  x:0.275265 y:0.254973
  x:0.580848 y:0.366766
  x:0.285497 y:0.294201
  x:0.39902 y:0.449754
  x:0.275687 y:0.417091
  x:0.698965 y:0.842751
  x:0.157377 y:0.781545
  x:0.782382 y:1.39585
  x:0.104984 y:1.31849
  x:0.626078 y:1.80938
  x:-0.0471152 y:1.68696
  x:0.643024 y:1.93399
  x:-0.118849 y:1.79853
  x:0.570998 y:1.92297
  x:-0.118887 y:1.79943
  x:0.555329 y:1.87215
  x:-0.0768779 y:1.76014
  x:0.527347 y:1.79286
  x:0.0939123 y:1.76423
  x:0.482345 y:2.57987
  x:0.0493458 y:2.53717
  x:0.465152 y:3.30831
  x:0.0638444 y:3.25924
  x:0.519758 y:3.44113
  x:0.0691193 y:3.38951
  x:0.277347 y:3.4799
  x:-0.025865 y:3.41448
  ======================

Thanks.
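One detail of the code above: each call derives the packet timestamp from `cv::getTickCount()`, so the timestamps fed to the graph differ from one program run to the next. MediaPipe requires timestamps on an input stream to be strictly increasing, and a counter with a fixed step satisfies that while being reproducible. The sketch below is a hypothetical helper, not part of MediaPipe, and the 33333 µs step used in the usage note is an assumed frame interval. Whether this alone removes the variation is not established in this thread, since the pose tracking graph may also carry state from one frame to the next.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of MediaPipe): yields strictly increasing
// timestamps in microseconds, spaced by a fixed frame interval.
class FixedStepTimestamps {
 public:
    explicit FixedStepTimestamps(std::int64_t step_us) : step_us_(step_us) {
        // A non-positive step would break the strictly increasing guarantee.
        assert(step_us_ > 0);
    }

    // Returns the next timestamp; the first call returns 0.
    std::int64_t Next() {
        const std::int64_t current = next_us_;
        next_us_ += step_us_;
        return current;
    }

 private:
    std::int64_t step_us_;
    std::int64_t next_us_ = 0;
};
```

With a single `FixedStepTimestamps timestamps(33333);` kept alongside `m_graph`, the value from `timestamps.Next()` could be passed to `mediapipe::Timestamp(...)` in place of `frame_timestamp_us`.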

google-ml-butler[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler[bot] commented 3 years ago

Closing as stale. Please reopen if you'd like to work on this further.

kuaashish commented 1 year ago

Hello @JunGenius, we are upgrading the MediaPipe Legacy Solutions to the new MediaPipe solutions. However, the libraries, documentation, and source code for all the MediaPipe Legacy Solutions will continue to be available in our GitHub repository and through library distribution services, such as Maven and NPM.

You can continue to use those legacy solutions in your applications if you choose. However, we would ask you to check out the new MediaPipe solutions, which can help you build and customize ML solutions for your applications more easily. These new solutions will provide a superset of the capabilities available in the legacy solutions.

github-actions[bot] commented 1 year ago

This issue has been marked stale because it has had no activity for 7 days. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 1 year ago

This issue was closed due to lack of activity after being marked stale for the past 7 days.
