Open lucker26 opened 10 months ago
Hi @lucker26,
Kindly provide more details regarding your inquiry. Additionally, we suggest filling out the template to ensure we have the necessary information for a more effective response to your support needs.
Thank you
The following is my custom code. It only outputs the coordinates of the key points. How can I modify it, or which other classes do I need to call, to output a confidence value along with the coordinates of each key point?

import cv2
import mediapipe as mp
import csv
import h5py
from datetime import datetime

mp_hands = mp.solutions.hands
hands = mp_hands.Hands(static_image_mode=False,
                       max_num_hands=4,
                       min_detection_confidence=0.5,
                       min_tracking_confidence=0.5)
mpDraw = mp.solutions.drawing_utils

def process_frame(img, csv_writer):
    img = cv2.flip(img, 1)
    img_RGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    results = hands.process(img_RGB)
    if results.multi_hand_landmarks:
        for hand_idx in range(len(results.multi_hand_landmarks)):
            hand_21 = results.multi_hand_landmarks[hand_idx]
            mpDraw.draw_landmarks(img, hand_21, mp_hands.HAND_CONNECTIONS,
                                  mp.solutions.drawing_styles.get_default_hand_landmarks_style(),
                                  mp.solutions.drawing_styles.get_default_hand_connections_style())
            if csv_writer is not None:
                row = [datetime.now().strftime("%Y-%m-%d %H:%M:%S.%f")]
                for lm in hand_21.landmark:
                    confidence = lm.presence
                    row += [lm.x * img.shape[1], lm.y * img.shape[0]]
                    print(confidence)
                csv_writer.writerow(row)
    return img

csv_file = open('hand_poses.csv', mode='w', newline='')
csv_writer = csv.writer(csv_file)
headers = ['timestamp']
for i in range(21):
    headers += [f'x{i}', f'y{i}']
csv_writer.writerow(headers)

cv2.VideoCapture(-1).release()
cap = cv2.VideoCapture(0)
cap.open(0)

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        print('Error')
        break
    frame = process_frame(frame, csv_writer)
    cv2.imshow('my_window', frame)
    if cv2.waitKey(1) in [ord('q'), 27]:
        break

csv_file.close()
cap.release()
cv2.destroyAllWindows()
For the Hands solution, there is a concept of "presence" and "visibility" for landmarks. According to the MediaPipe Hands documentation, "presence" indicates the likelihood of the landmark being present in the image, while "visibility" indicates the likelihood of the landmark being visible (not occluded) in the image. These can act as confidence scores for individual landmarks.
I'm not sure it will work, though, since this issue is still open: https://github.com/google/mediapipe/issues/3159
Anyway, you can modify your code to include these confidence values for each landmark as follows:
1. Add headers for the presence and visibility confidence scores for each landmark to your CSV.
2. Extract the presence and visibility values for each landmark and include them in the row you write to the CSV.
Here's how you can modify your process_frame function and the headers for the CSV:
import cv2
import mediapipe as mp
import csv
from datetime import datetime

mp_hands = mp.solutions.hands
hands = mp_hands.Hands(static_image_mode=False,
                       max_num_hands=4,
                       min_detection_confidence=0.5,
                       min_tracking_confidence=0.5)
mpDraw = mp.solutions.drawing_utils

def process_frame(img, csv_writer):
    img = cv2.flip(img, 1)
    img_RGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    results = hands.process(img_RGB)
    if results.multi_hand_landmarks:
        for hand_idx, hand_landmarks in enumerate(results.multi_hand_landmarks):
            mpDraw.draw_landmarks(img, hand_landmarks, mp_hands.HAND_CONNECTIONS,
                                  mp.solutions.drawing_styles.get_default_hand_landmarks_style(),
                                  mp.solutions.drawing_styles.get_default_hand_connections_style())
            if csv_writer is not None:
                # One CSV row per detected hand
                row = [datetime.now().strftime("%Y-%m-%d %H:%M:%S.%f")]
                for lm_idx, lm in enumerate(hand_landmarks.landmark):
                    # Include 'presence' and 'visibility' for each landmark.
                    # z is scaled by the image width (z uses roughly the same scale as x);
                    # img.shape[2] is the channel count, not a spatial size.
                    row.extend([lm.x * img.shape[1], lm.y * img.shape[0], lm.z * img.shape[1],
                                lm.presence, lm.visibility])
                csv_writer.writerow(row)
    return img

# Open CSV file for writing
csv_file = open('hand_poses.csv', mode='w', newline='')
csv_writer = csv.writer(csv_file)

# Write headers for CSV file
headers = ['timestamp']
for i in range(21):
    headers.extend([f'x_{i}', f'y_{i}', f'z_{i}', f'presence_{i}', f'visibility_{i}'])
csv_writer.writerow(headers)

# Initialize video capture
cap = cv2.VideoCapture(0)

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        print('Error')
        break
    frame = process_frame(frame, csv_writer)
    cv2.imshow('my_window', frame)
    if cv2.waitKey(1) in [ord('q'), 27]:  # if 'q' or ESC is pressed, exit
        break

# Cleanup
csv_file.close()
cap.release()
cv2.destroyAllWindows()
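The row-building step in the code above can also be tried without a camera. The following is only a minimal sketch: `FakeLandmark`, `build_headers`, and `landmarks_to_row` are hypothetical names, not part of MediaPipe, and the stand-in landmark carries just the attributes the code above reads. Scaling z by the image width is an assumption made here for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a MediaPipe landmark (not a MediaPipe type).
FakeLandmark = namedtuple("FakeLandmark", "x y z presence visibility")


def build_headers(num_landmarks=21):
    """Build the CSV header: timestamp plus five columns per landmark."""
    headers = ["timestamp"]
    for i in range(num_landmarks):
        headers.extend([f"x_{i}", f"y_{i}", f"z_{i}",
                        f"presence_{i}", f"visibility_{i}"])
    return headers


def landmarks_to_row(timestamp, landmarks, width, height):
    """Flatten one hand's landmarks into a single CSV row."""
    row = [timestamp]
    for lm in landmarks:
        # x and y are normalized, so scale them to pixels.
        # Assumption: z is scaled by the image width as well.
        row.extend([lm.x * width, lm.y * height, lm.z * width,
                    lm.presence, lm.visibility])
    return row


# 21 identical fake landmarks with presence and visibility set to 0.0
landmarks = [FakeLandmark(0.5, 0.25, -0.1, 0.0, 0.0)] * 21
row = landmarks_to_row("t0", landmarks, width=640, height=480)
print(len(build_headers()), len(row))
```

Checking that the header and the row have the same length is a quick way to catch a column mismatch before recording a long session.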
Thank you for your suggestion. However, I understood presence to mean the probability that a hand exists, not the probability that each key point exists. Also, the actual output of presence and visibility is 0. Why?
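For a per-hand score, the results object returned by `hands.process` in the legacy `mp.solutions.hands` API also has a `multi_handedness` field, whose entries carry a label ("Left" or "Right") and a score. The sketch below reads that score. Note that it is the handedness classification score, not a per-key-point confidence, so it may not be the value being asked about. `hand_scores` and the stub objects are hypothetical and exist only so the snippet runs without a camera.

```python
from types import SimpleNamespace


def hand_scores(results):
    """Return a (label, score) pair for each detected hand, or [] if none."""
    handedness = getattr(results, "multi_handedness", None)
    if not handedness:
        return []
    return [(h.classification[0].label, h.classification[0].score)
            for h in handedness]


def _fake_hand(label, score):
    # Hypothetical stub mimicking the shape of one multi_handedness entry.
    return SimpleNamespace(
        classification=[SimpleNamespace(label=label, score=score)])


fake_results = SimpleNamespace(
    multi_handedness=[_fake_hand("Left", 0.97), _fake_hand("Right", 0.88)])
print(hand_scores(fake_results))
```

In a real loop, `hand_scores(results)` would be called on the object returned by `hands.process(img_RGB)`, and the entries line up by index with `results.multi_hand_landmarks`.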