google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

Accessing hand landmarks and other information in new releases #2366

Closed (izakharkin closed this issue 3 years ago)

izakharkin commented 3 years ago

Please make sure that this is a solution issue.

System information (Please provide as much relevant information as possible)

  • Have I written custom code (as opposed to using a stock example script provided in Mediapipe): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04, Android 11, iOS 14.4): Mac OS X 11.4
  • MediaPipe version: v0.8.6
  • Bazel version: 3.7.2
  • Solution (e.g. FaceMesh, Pose, Holistic): Hands
  • Programming Language and version ( e.g. C++, Python, Java): C++

Describe the expected behavior:

I am trying to access real-time information about hand landmarks, scaled hand landmarks, handedness, and hand rects. An earlier issue (https://github.com/google/mediapipe/issues/200) contained a complete guide on how to get these. That approach worked well until newer releases (I am currently on v0.8.6, though the change actually came in v0.8.0 or v0.8.1) introduced new mechanisms for how the graphs and modules interact.

The point is that the approach I used before (modifying the output streams of the hand tracking graphs, creating output stream pollers, and retrieving data with the packets' .Get() method in the main() loop) no longer works: when there are no hands in the input frame, the main loop waits endlessly and sends no new output frames (it becomes idle).

I suppose the issue is connected to the use of the poller.Next() method and could somehow be resolved with GateCalculators and PreviousLoopbackCalculators. But honestly, this new runtime logic turned out to be quite tricky for me to handle, so I am asking for help and advice on the most correct and convenient way to solve this.

Standalone code you may have used to try to get what you need :

If there is a problem, provide a reproducible test case that is the bare minimum necessary to generate the problem. If possible, please share a link to Colab/repo link /any notebook:

My fork with all the changes needed to reproduce the issue: https://github.com/izakharkin/mediapipe. The main file is mediapipe/examples/desktop/hand_tracking/hand_tracking_cpu_main.cc.

Build:

bazel build -c opt --define MEDIAPIPE_DISABLE_GPU=1 mediapipe/examples/desktop/hand_tracking:hand_tracking_cpu

Run:

GLOG_logtostderr=1 bazel-bin/mediapipe/examples/desktop/hand_tracking/hand_tracking_cpu

Other info / Complete Logs : Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached:


Thank you very much!

sgowroji commented 3 years ago

Hi @izakharkin, did you face any error while doing this with the latest version? Could you please share the logs so we can understand the issue in more detail? Thanks!

izakharkin commented 3 years ago

Hi @sgowroji, thanks for your reply. Yes, I face this problem with the latest version (currently v0.8.6). Actually, no logs are produced. I am just running the Hands desktop example from my fork (building and running it with the commands I provided above), and everything works fine until I move my hands outside of the input frame. Then the loop waits forever on the line if (!handedness_poller.Next(&handedness_packet)) break;, so I suppose .Next() is waiting for a packet that never arrives. The question is how to handle these situations. Thank you!

wongfei commented 3 years ago

Before reading the landmark packet, check its presence; see: https://github.com/google/mediapipe/issues/850#issuecomment-683268033

or use ObserveOutputStream instead of AddOutputStreamPoller
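To illustrate the presence-check idea, here is a minimal sketch that uses only the C++ standard library. It is a toy model, not MediaPipe code: ToyPoller, TryNext, and ProcessFrames are hypothetical names, and TryNext returns false where the real blocking Next() would wait. The assumption is that a presence stream delivers one boolean per frame, while the landmarks stream delivers a packet only when hands are detected.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an output stream poller. The real
// OutputStreamPoller::Next() blocks on an empty queue; this toy version
// returns false instead so the behavior can be shown single-threaded.
template <typename T>
class ToyPoller {
 public:
  void Push(T value) { queue_.push_back(std::move(value)); }

  bool TryNext(T* out) {
    assert(out != nullptr);
    if (queue_.empty()) return false;  // The real Next() would wait here.
    *out = std::move(queue_.front());
    queue_.pop_front();
    return true;
  }

 private:
  std::deque<T> queue_;
};

// hands_per_frame[i] is the number of hands detected in frame i.
// Returns one log line per frame.
std::vector<std::string> ProcessFrames(const std::vector<int>& hands_per_frame) {
  ToyPoller<bool> presence;  // One packet per frame, hands or not.
  ToyPoller<int> landmarks;  // A packet only when hands were detected.
  for (int n : hands_per_frame) {
    presence.Push(n > 0);
    if (n > 0) landmarks.Push(n);
  }

  std::vector<std::string> log;
  bool has_hands = false;
  while (presence.TryNext(&has_hands)) {
    if (!has_hands) {
      log.push_back("no hands");  // Skip the landmarks poller entirely.
      continue;
    }
    int hand_count = 0;
    bool ok = landmarks.TryNext(&hand_count);
    assert(ok);  // Presence said a landmarks packet exists for this frame.
    (void)ok;
    log.push_back("hands: " + std::to_string(hand_count));
  }
  return log;
}
```

The loop only ever waits on the stream that is guaranteed to produce a packet for every frame, so frames without hands no longer stall it.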

alpc72 commented 3 years ago

I encountered the same problem. In addition, what should I pay attention to when retrieving hand landmarks in the GPU version? Thank you very much!

izakharkin commented 3 years ago

@wongfei thanks a lot, the PacketPresenceCalculator did indeed help. I have updated my fork accordingly. Now when I move my hands out of the input frame, the graph no longer stops processing and outputting frames.

However, another problem appeared: when I move my hands back into the frame, the information (handedness, hand rects, landmarks) is printed with a delay, so I have to wait some time for it to catch up with the current real-time state of the hands. Do you have any idea why this happens? Could something be wrong with the timestamps?

jiuqiant commented 3 years ago

Here are two solutions you can try:

Solution 1. Try tool::AddMultiStreamCallback with observe_timestamp_bounds set to true. Here is a usage example: https://github.com/google/mediapipe/blob/a9b643e0f5978425d948050a2228398929733d05/mediapipe/framework/tool/sink_test.cc#L188 With this setting, the callback calculator sends out empty packets when no hands are detected.
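A rough standard-library-only sketch of what this means for the callback side follows. It is a toy model and does not use the MediaPipe API: ToyPacket, Dispatch, and CountFramesWithHands are hypothetical names. The point it tries to show is that, once timestamp bounds are observed, the callback fires for every frame and has to check for empty packets before reading them.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical, simplified packet: `empty` marks a packet that carries no data.
struct ToyPacket {
  bool empty;
  int value;
};

using MultiStreamCallback = std::function<void(const std::vector<ToyPacket>&)>;

// Invokes `callback` for the frames it would observe and returns the number
// of invocations. hands_per_frame[i] == 0 means no hands in frame i.
int Dispatch(const std::vector<int>& hands_per_frame,
             bool observe_timestamp_bounds,
             const MultiStreamCallback& callback) {
  int calls = 0;
  for (int n : hands_per_frame) {
    // Without bound observation, a frame with no hands triggers no callback.
    if (n == 0 && !observe_timestamp_bounds) continue;
    // Two toy streams (say, landmarks and handedness), both empty when n == 0.
    std::vector<ToyPacket> packets(2, ToyPacket{n == 0, n});
    callback(packets);
    ++calls;
  }
  return calls;
}

// Example handler: counts frames that actually carry hand data.
int CountFramesWithHands(const std::vector<int>& hands_per_frame) {
  int frames_with_hands = 0;
  Dispatch(hands_per_frame, /*observe_timestamp_bounds=*/true,
           [&frames_with_hands](const std::vector<ToyPacket>& packets) {
             if (packets[0].empty) return;  // Guard before reading the payload.
             ++frames_with_hands;
           });
  return frames_with_hands;
}
```

In real code the equivalent guard would be a check that the packet is not empty before calling Get() on it.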

Solution 2. Patch the following to allow OutputStreamPoller to observe timestamp bounds.

diff --git a/mediapipe/framework/calculator_graph.cc b/mediapipe/framework/calculator_graph.cc
index d95d6d3..6cff78d 100644
--- a/mediapipe/framework/calculator_graph.cc
+++ b/mediapipe/framework/calculator_graph.cc
@@ -465,7 +465,7 @@
 }

 absl::StatusOr<OutputStreamPoller> CalculatorGraph::AddOutputStreamPoller(
-    const std::string& stream_name) {
+    const std::string& stream_name, bool observe_timestamp_bounds) {
   RET_CHECK(initialized_).SetNoLogging()
       << "CalculatorGraph is not initialized.";
   int output_stream_index = validated_graph_->OutputStreamIndex(stream_name);
@@ -479,7 +479,7 @@
       stream_name, &any_packet_type_,
       std::bind(&CalculatorGraph::UpdateThrottledNodes, this,
                 std::placeholders::_1, std::placeholders::_2),
-      &output_stream_managers_[output_stream_index]));
+      &output_stream_managers_[output_stream_index], observe_timestamp_bounds));
   OutputStreamPoller poller(internal_poller);
   graph_output_streams_.push_back(std::move(internal_poller));
   return std::move(poller);
diff --git a/mediapipe/framework/calculator_graph.h b/mediapipe/framework/calculator_graph.h
index fb0bb69..0e6d53b 100644
--- a/mediapipe/framework/calculator_graph.h
+++ b/mediapipe/framework/calculator_graph.h
@@ -164,7 +164,8 @@
   // polling API for accessing a stream's output. Should only be called before
   // Run() or StartRun(). For asynchronous output, use ObserveOutputStream. See
   // also the helpers in tool/sink.h.
-  StatusOrPoller AddOutputStreamPoller(const std::string& stream_name);
+  StatusOrPoller AddOutputStreamPoller(const std::string& stream_name,
+                                       bool observe_timestamp_bounds = false);

   // Gets output side packet by name after the graph is done. However, base
   // packets (generated by PacketGenerators) can be retrieved before
diff --git a/mediapipe/framework/graph_output_stream.cc b/mediapipe/framework/graph_output_stream.cc
index 6639bb8..8235d3c 100644
--- a/mediapipe/framework/graph_output_stream.cc
+++ b/mediapipe/framework/graph_output_stream.cc
@@ -125,9 +125,10 @@
 absl::Status OutputStreamPollerImpl::Initialize(
     const std::string& stream_name, const PacketType* packet_type,
     std::function<void(InputStreamManager*, bool*)> queue_size_callback,
-    OutputStreamManager* output_stream_manager) {
+    OutputStreamManager* output_stream_manager, bool observe_timestamp_bounds) {
   MP_RETURN_IF_ERROR(GraphOutputStream::Initialize(stream_name, packet_type,
-                                                   output_stream_manager));
+                                                   output_stream_manager,
+                                                   observe_timestamp_bounds));
   input_stream_handler_->SetQueueSizeCallbacks(queue_size_callback,
                                                queue_size_callback);
   return absl::OkStatus();
@@ -176,12 +177,16 @@
 bool OutputStreamPollerImpl::Next(Packet* packet) {
   CHECK(packet);
   bool empty_queue = true;
+  bool observed_timestamp_bound_change = false;
   Timestamp min_timestamp = Timestamp::Unset();
   mutex_.Lock();
   while (true) {
     min_timestamp = input_stream_->MinTimestampOrBound(&empty_queue);
+    observed_timestamp_bound_change =
+        input_stream_handler_->ProcessTimestampBounds() &&
+        prev_output_ts_ < min_timestamp.PreviousAllowedInStream();
     if (graph_has_error_ || !empty_queue ||
-        min_timestamp == Timestamp::Done()) {
+        min_timestamp == Timestamp::Done() || observed_timestamp_bound_change) {
       break;
     } else {
       handler_condvar_.Wait(&mutex_);
@@ -191,17 +196,26 @@
     mutex_.Unlock();
     return false;
   }
+  if (empty_queue) {
+    prev_output_ts_ = min_timestamp.PreviousAllowedInStream();
+  } else {
+    prev_output_ts_ = min_timestamp;
+  }
   mutex_.Unlock();
   if (min_timestamp == Timestamp::Done()) {
     return false;
   }
-  int num_packets_dropped = 0;
-  bool stream_is_done = false;
-  *packet = input_stream_->PopPacketAtTimestamp(
-      min_timestamp, &num_packets_dropped, &stream_is_done);
-  CHECK_EQ(num_packets_dropped, 0)
-      << absl::Substitute("Dropped $0 packet(s) on input stream \"$1\".",
-                          num_packets_dropped, input_stream_->Name());
+  if (!empty_queue) {
+    int num_packets_dropped = 0;
+    bool stream_is_done = false;
+    *packet = input_stream_->PopPacketAtTimestamp(
+        min_timestamp, &num_packets_dropped, &stream_is_done);
+    CHECK_EQ(num_packets_dropped, 0)
+        << absl::Substitute("Dropped $0 packet(s) on input stream \"$1\".",
+                            num_packets_dropped, input_stream_->Name());
+  } else if (observed_timestamp_bound_change) {
+    *packet = Packet().At(Timestamp(min_timestamp.PreviousAllowedInStream()));
+  }
   return true;
 }

diff --git a/mediapipe/framework/graph_output_stream.h b/mediapipe/framework/graph_output_stream.h
index 393407a..1110a10 100644
--- a/mediapipe/framework/graph_output_stream.h
+++ b/mediapipe/framework/graph_output_stream.h
@@ -143,7 +143,8 @@
   absl::Status Initialize(
       const std::string& stream_name, const PacketType* packet_type,
       std::function<void(InputStreamManager*, bool*)> queue_size_callback,
-      OutputStreamManager* output_stream_manager);
+      OutputStreamManager* output_stream_manager,
+      bool observe_timestamp_bounds = false);

   void PrepareForRun(std::function<void()> notification_callback,
                      std::function<void(absl::Status)> error_callback) override;
@@ -170,6 +171,7 @@
   absl::Mutex mutex_;
   absl::CondVar handler_condvar_ ABSL_GUARDED_BY(mutex_);
   bool graph_has_error_ ABSL_GUARDED_BY(mutex_);
+  Timestamp prev_output_ts_ ABSL_GUARDED_BY(mutex_) = Timestamp::Min();
 };

 }  // namespace internal

Remember to modify your hand_tracking_cpu_main.cc to have:

auto status_or_poller =
    graph.AddOutputStreamPoller("foo", /*observe_timestamp_bounds=*/true);

izakharkin commented 3 years ago

@jiuqiant thanks for your advice. I tried the second solution and it worked (even without the PacketPresenceCalculator), now everything is okay and without any delay. I have updated my fork, so anyone with the same issue is welcome to check it out.
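For anyone reading the patch, its core logic can be approximated with the single-threaded toy model below. This is only a sketch under simplifying assumptions, not the MediaPipe implementation: ToyPacket, ToyBoundPoller, AddPacket, AdvanceBound, and TryNext are hypothetical names, timestamps are plain integers, and TryNext returns false where the real Next() would keep waiting on the condition variable.

```cpp
#include <cassert>
#include <deque>

// Hypothetical, simplified packet: timestamps are plain integers here.
struct ToyPacket {
  bool empty;
  long timestamp;
  int payload;
};

class ToyBoundPoller {
 public:
  explicit ToyBoundPoller(bool observe_timestamp_bounds)
      : observe_timestamp_bounds_(observe_timestamp_bounds) {}

  // A packet at `timestamp` also advances the bound past it.
  void AddPacket(long timestamp, int payload) {
    queue_.push_back(ToyPacket{false, timestamp, payload});
    bound_ = timestamp + 1;
  }

  // Upstream promises that no packet earlier than `next_allowed` will arrive.
  void AdvanceBound(long next_allowed) { bound_ = next_allowed; }

  // Returns false where the real, blocking Next() would keep waiting.
  bool TryNext(ToyPacket* packet) {
    assert(packet != nullptr);
    if (!queue_.empty()) {
      *packet = queue_.front();
      queue_.pop_front();
      prev_output_ts_ = packet->timestamp;
      return true;
    }
    // Stand-in for min_timestamp.PreviousAllowedInStream() in the patch.
    const long settled_ts = bound_ - 1;
    if (observe_timestamp_bounds_ && prev_output_ts_ < settled_ts) {
      prev_output_ts_ = settled_ts;
      *packet = ToyPacket{true, settled_ts, 0};  // Empty packet at the bound.
      return true;
    }
    return false;
  }

 private:
  bool observe_timestamp_bounds_;
  std::deque<ToyPacket> queue_;
  long bound_ = 0;
  long prev_output_ts_ = -1;  // Stand-in for Timestamp::Min().
};
```

With observation enabled, a bound advance alone is enough to wake the poller once with an empty packet, so a main loop calling Next() keeps moving through frames without hands and should check whether the returned packet is empty before reading it.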
