google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

Trying to get landmarks from face_mesh on desktop, but the program stops working #5658

Open SergeyGalaxyOrsik opened 2 hours ago

SergeyGalaxyOrsik commented 2 hours ago

OS Platform and Distribution

MacOS Sonoma 14.3.1

Compiler version

Apple clang version 15.0.0 (clang-1500.3.9.4)

Programming Language and version

C++

Installed using virtualenv? pip? Conda?(if python)

No response

MediaPipe version

No response

Bazel version

No response

XCode and Tulsi versions (if iOS)

No response

Android SDK and NDK versions (if android)

No response

Android AAR (if android)

None

OpenCV version (if running on desktop)

3.4.20

Describe the problem

I am trying to get landmarks from face_mesh via the multi_face_landmarks stream, but after starting, the program hangs at this line: if (!poller_detection.Next(&detection_packet))

Complete Logs

I0000 00:00:1727634617.180144  365189 demo_run_graph_main_blendshape.cc:60] Initialize the calculator graph.
I0000 00:00:1727634617.196735  365189 demo_run_graph_main_blendshape.cc:64] Initialize the camera or load the video.
I0000 00:00:1727634618.909277  365189 demo_run_graph_main_blendshape.cc:89] Start running the calculator graph.
I0000 00:00:1727634618.910793  365189 demo_run_graph_main_blendshape.cc:96] Start grabbing and processing frames.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
VERBOSE: XNNPack weight cache not enabled.
INFO: Initialized TensorFlow Lite runtime.
VERBOSE: Replacing 164 out of 164 node(s) with delegate (TfLiteXNNPackDelegate) node, yielding 1 partitions for the whole graph.
W0000 00:00:1727634618.924008  365219 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
VERBOSE: XNNPack weight cache not enabled.
VERBOSE: Replacing 700 out of 712 node(s) with delegate (TfLiteXNNPackDelegate) node, yielding 5 partitions for the whole graph.
W0000 00:00:1727634619.065230  365211 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
W0000 00:00:1727634619.067072  365189 demo_run_graph_main_blendshape.cc:141] Waiting....
W0000 00:00:1727634619.067088  365189 demo_run_graph_main_blendshape.cc:143] detection_packet....

My source code:

// Copyright 2019 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// An example of sending OpenCV webcam frames into a MediaPipe graph.
#include <cstdlib>

#include "absl/flags/flag.h"
#include "absl/flags/parse.h"
#include "absl/log/absl_log.h"
#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/formats/image_frame.h"
#include "mediapipe/framework/formats/image_frame_opencv.h"
#include "mediapipe/framework/port/file_helpers.h"
#include "mediapipe/framework/port/opencv_highgui_inc.h"
#include "mediapipe/framework/port/opencv_imgproc_inc.h"
#include "mediapipe/framework/port/opencv_video_inc.h"
#include "mediapipe/framework/port/parse_text_proto.h"
#include "mediapipe/framework/port/status.h"
#include "mediapipe/util/resource_util.h"
#include "mediapipe/calculators/util/landmarks_to_render_data_calculator.pb.h"
#include "mediapipe/framework/formats/landmark.pb.h"

constexpr char kInputStream[] = "input_video";
constexpr char kOutputStream[] = "output_video";
constexpr char kWindowName[] = "MediaPipe";
constexpr char kDetectionsStream[] = "multi_face_landmarks";

ABSL_FLAG(std::string, calculator_graph_config_file, "",
          "Name of file containing text format CalculatorGraphConfig proto.");
ABSL_FLAG(std::string, input_video_path, "",
          "Full path of video to load. "
          "If not provided, attempt to use a webcam.");
ABSL_FLAG(std::string, output_video_path, "",
          "Full path of where to save result (.mp4 only). "
          "If not provided, show result in a window.");

absl::Status RunMPPGraph()
{
    std::string calculator_graph_config_contents;
    MP_RETURN_IF_ERROR(mediapipe::file::GetContents(
        absl::GetFlag(FLAGS_calculator_graph_config_file),
        &calculator_graph_config_contents));
    ABSL_LOG(INFO) << "Get calculator graph config contents: "
                   << calculator_graph_config_contents;
    mediapipe::CalculatorGraphConfig config =
        mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig>(
            calculator_graph_config_contents);

    ABSL_LOG(INFO) << "Initialize the calculator graph.";
    mediapipe::CalculatorGraph graph;
    MP_RETURN_IF_ERROR(graph.Initialize(config));

    ABSL_LOG(INFO) << "Initialize the camera or load the video.";
    cv::VideoCapture capture;
    const bool load_video = !absl::GetFlag(FLAGS_input_video_path).empty();
    if (load_video)
    {
        capture.open(absl::GetFlag(FLAGS_input_video_path));
    }
    else
    {
        capture.open(0);
    }
    RET_CHECK(capture.isOpened());

    cv::VideoWriter writer;
    const bool save_video = !absl::GetFlag(FLAGS_output_video_path).empty();
    if (!save_video)
    {
        cv::namedWindow(kWindowName, /*flags=WINDOW_AUTOSIZE*/ 1);
#if (CV_MAJOR_VERSION >= 3) && (CV_MINOR_VERSION >= 2)
        capture.set(cv::CAP_PROP_FRAME_WIDTH, 640);
        capture.set(cv::CAP_PROP_FRAME_HEIGHT, 480);
        capture.set(cv::CAP_PROP_FPS, 30);
#endif
    }

    ABSL_LOG(INFO) << "Start running the calculator graph.";
    MP_ASSIGN_OR_RETURN(mediapipe::OutputStreamPoller poller,
                        graph.AddOutputStreamPoller(kOutputStream));
    MP_ASSIGN_OR_RETURN(mediapipe::OutputStreamPoller poller_detection,
                        graph.AddOutputStreamPoller(kDetectionsStream));
    MP_RETURN_IF_ERROR(graph.StartRun({}));

    ABSL_LOG(INFO) << "Start grabbing and processing frames.";
    bool grab_frames = true;
    while (grab_frames)
    {
        // Capture opencv camera or video frame.
        cv::Mat camera_frame_raw;
        capture >> camera_frame_raw;
        if (camera_frame_raw.empty())
        {
            if (!load_video)
            {
                ABSL_LOG(INFO) << "Ignore empty frames from camera.";
                continue;
            }
            ABSL_LOG(INFO) << "Empty frame, end of video reached.";
            break;
        }
        cv::Mat camera_frame;
        cv::cvtColor(camera_frame_raw, camera_frame, cv::COLOR_BGR2RGB);
        if (!load_video)
        {
            cv::flip(camera_frame, camera_frame, /*flipcode=HORIZONTAL*/ 1);
        }

        // Wrap Mat into an ImageFrame.
        auto input_frame = absl::make_unique<mediapipe::ImageFrame>(
            mediapipe::ImageFormat::SRGB, camera_frame.cols, camera_frame.rows,
            mediapipe::ImageFrame::kDefaultAlignmentBoundary);
        cv::Mat input_frame_mat = mediapipe::formats::MatView(input_frame.get());
        camera_frame.copyTo(input_frame_mat);

        // Send image packet into the graph.
        size_t frame_timestamp_us =
            (double)cv::getTickCount() / (double)cv::getTickFrequency() * 1e6;
        MP_RETURN_IF_ERROR(graph.AddPacketToInputStream(
            kInputStream, mediapipe::Adopt(input_frame.release())
                              .At(mediapipe::Timestamp(frame_timestamp_us))));

        // Get the graph result packet, or stop if that fails.
        mediapipe::Packet packet;
        if (!poller.Next(&packet))
            break;
        auto &output_frame = packet.Get<mediapipe::ImageFrame>();

        // MY CODE MADE BY ORSIK SERGEY
        ABSL_LOG(WARNING) << "Waiting....";
        mediapipe::Packet detection_packet;
        ABSL_LOG(WARNING) << "detection_packet....";
        if (!poller_detection.Next(&detection_packet)) {
            ABSL_LOG(WARNING) << "No face landmarks in the packet.";
            continue;
        }
        ABSL_LOG(WARNING) << "output_landmarks....";
        auto &output_landmarks = detection_packet.Get<std::vector<::mediapipe::NormalizedLandmarkList>>();
        for (const ::mediapipe::NormalizedLandmarkList &normalizedlandmarkList : output_landmarks)
        {
            std::cout << "FaceLandmarks:";
            std::cout << normalizedlandmarkList.DebugString();
        }
        //-----------------------------

        // Convert back to opencv for display or saving.
        cv::Mat output_frame_mat = mediapipe::formats::MatView(&output_frame);
        cv::cvtColor(output_frame_mat, output_frame_mat, cv::COLOR_RGB2BGR);
        if (save_video)
        {
            if (!writer.isOpened())
            {
                ABSL_LOG(INFO) << "Prepare video writer.";
                writer.open(absl::GetFlag(FLAGS_output_video_path),
                            mediapipe::fourcc('a', 'v', 'c', '1'), // .mp4
                            capture.get(cv::CAP_PROP_FPS), output_frame_mat.size());
                RET_CHECK(writer.isOpened());
            }
            writer.write(output_frame_mat);
        }
        else
        {
            cv::imshow(kWindowName, output_frame_mat);
            // Press any key to exit.
            const int pressed_key = cv::waitKey(5);
            if (pressed_key >= 0 && pressed_key != 255)
                grab_frames = false;
        }
    }

    ABSL_LOG(INFO) << "Shutting down.";
    if (writer.isOpened())
        writer.release();
    MP_RETURN_IF_ERROR(graph.CloseInputStream(kInputStream));
    return graph.WaitUntilDone();
}

int main(int argc, char **argv)
{
    google::InitGoogleLogging(argv[0]);
    absl::ParseCommandLine(argc, argv);
    absl::Status run_status = RunMPPGraph();
    if (!run_status.ok())
    {
        ABSL_LOG(ERROR) << "Failed to run the graph: " << run_status.message();
        return EXIT_FAILURE;
    }
    else
    {
        ABSL_LOG(INFO) << "Success!";
    }
    return EXIT_SUCCESS;
}
SergeyGalaxyOrsik commented 2 hours ago

This is my .pbtxt file:

# MediaPipe graph that performs face mesh with TensorFlow Lite on CPU.

# Input image. (ImageFrame)
input_stream: "input_video"

# Output image with rendered results. (ImageFrame)
output_stream: "output_video"
# Collection of detected/processed faces, each represented as a list of
# landmarks. (std::vector<NormalizedLandmarkList>)
output_stream: "multi_face_landmarks"

# Throttles the images flowing downstream for flow control. It passes through
# the very first incoming image unaltered, and waits for downstream nodes
# (calculators and subgraphs) in the graph to finish their tasks before it
# passes through another image. All images that come in while waiting are
# dropped, limiting the number of in-flight images in most part of the graph to
# 1. This prevents the downstream nodes from queuing up incoming images and data
# excessively, which leads to increased latency and memory usage, unwanted in
# real-time mobile applications. It also eliminates unnecessary computation,
# e.g., the output produced by a node may get dropped downstream if the
# subsequent nodes are still busy processing previous inputs.
node {
  calculator: "FlowLimiterCalculator"
  input_stream: "input_video"
  input_stream: "FINISHED:output_video"
  input_stream_info: {
    tag_index: "FINISHED"
    back_edge: true
  }
  output_stream: "throttled_input_video"
}

# Defines side packets for further use in the graph.
node {
  calculator: "ConstantSidePacketCalculator"
  output_side_packet: "PACKET:0:num_faces"
  output_side_packet: "PACKET:1:with_attention"
  node_options: {
    [type.googleapis.com/mediapipe.ConstantSidePacketCalculatorOptions]: {
      packet { int_value: 1 }
      packet { bool_value: true }
    }
  }
}

# Subgraph that detects faces and corresponding landmarks.
node {
  calculator: "FaceLandmarkFrontCpu"
  input_stream: "IMAGE:throttled_input_video"
  input_side_packet: "NUM_FACES:num_faces"
  input_side_packet: "WITH_ATTENTION:with_attention"
  output_stream: "LANDMARKS:multi_face_landmarks"
  output_stream: "ROIS_FROM_LANDMARKS:face_rects_from_landmarks"
  output_stream: "DETECTIONS:face_detections"
  output_stream: "ROIS_FROM_DETECTIONS:face_rects_from_detections"
}

# Subgraph that renders face-landmark annotation onto the input image.
node {
  calculator: "FaceRendererCpu"
  input_stream: "IMAGE:throttled_input_video"
  input_stream: "LANDMARKS:multi_face_landmarks"
  input_stream: "NORM_RECTS:face_rects_from_landmarks"
  input_stream: "DETECTIONS:face_detections"
  output_stream: "IMAGE:output_video"
}
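A related sketch at the graph level: because multi_face_landmarks is not guaranteed to carry a packet at every timestamp, some graphs expose a per-frame presence signal instead of having the client poll the landmark stream directly. A hypothetical fragment, assuming PacketPresenceCalculator is available in your checkout (the landmark_presence stream name is made up for illustration):

```
# Emits a bool packet at each timestamp, indicating whether a packet
# arrived on multi_face_landmarks at that timestamp.
node {
  calculator: "PacketPresenceCalculator"
  input_stream: "PACKET:multi_face_landmarks"
  output_stream: "PRESENCE:landmark_presence"
}
```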

My /Users/sglx/Desktop/mediapipe/mediapipe/examples/desktop/BUILD:

cc_library(
    name = "demo_run_graph_main",
    srcs = ["demo_run_graph_main.cc"],
    deps = [
        "//mediapipe/framework:calculator_framework",
        "//mediapipe/framework/formats:image_frame",
        "//mediapipe/framework/formats:image_frame_opencv",
        "//mediapipe/framework/port:file_helpers",
        "//mediapipe/framework/port:opencv_highgui",
        "//mediapipe/framework/port:opencv_imgproc",
        "//mediapipe/framework/port:opencv_video",
        "//mediapipe/framework/port:parse_text_proto",
        "//mediapipe/framework/port:status",
        "//mediapipe/util:resource_util",
        "@com_google_absl//absl/flags:flag",
        "@com_google_absl//absl/flags:parse",
        "@com_google_absl//absl/log:absl_log",
        "//mediapipe/calculators/util:landmarks_to_render_data_calculator",
        "//mediapipe/framework/formats:landmark_cc_proto",
    ],
)

My /Users/sglx/Desktop/mediapipe/mediapipe/examples/desktop/face_mesh/BUILD:


cc_binary(
    name = "face_mesh_cpu",
    data = ["//mediapipe/modules/face_landmark:face_landmark_with_attention.tflite"],
    deps = [
        "//mediapipe/examples/desktop:demo_run_graph_main",
        "//mediapipe/graphs/face_mesh:desktop_live_calculators",
    ],
)
SergeyGalaxyOrsik commented 2 hours ago

@ayushgdev Do you know why this happens?