open-mmlab / mmpose

OpenMMLab Pose Estimation Toolbox and Benchmark.
https://mmpose.readthedocs.io/en/latest/
Apache License 2.0
5.59k stars 1.22k forks

webcam demo inference speed #2179

Closed moonsh closed 1 year ago

moonsh commented 1 year ago

I tried the webcam demo with RTMPose, but I couldn't get good inference speed. The camera runs at 30 FPS, but the detector and pose estimator run at around 10 FPS. Is there something I should fix?

These two models are used: RTMDet-nano and RTMPose-t.

Tau-J commented 1 year ago

@Ben-Louis Could you give some suggestions?

Ben-Louis commented 1 year ago

Hi @moonsh, thanks for using MMPose. To expedite the inference process of RTMPose in the Webcam demo, you may consider:

  1. Eliminating the effect nodes found in the webcam config.
  2. Assigning flip_test=False in the RTMPose-t config.
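For the second point, the change would look roughly like this (a sketch only; it assumes the standard mmpose 1.x config layout, where test-time behaviour lives under `model.test_cfg`):

```python
# in the RTMPose-t config: disable test-time flip augmentation,
# which roughly halves the per-frame cost of the pose model
model = dict(
    test_cfg=dict(
        flip_test=False,
    ),
)
```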

By the way, could you please share details about the device you're using for pose estimation?

moonsh commented 1 year ago

Thank you, @Ben-Louis. I used this command: python demo/webcam_api_demo.py. After the changes, the detector ran at 12 FPS and the pose estimator reached around 20 FPS.

Intel Core i9-9960X, RTX 3090 Ti, Ubuntu 20.04

Maybe it's better to use the SDK to get maximum speed?

Tau-J commented 1 year ago

Yes, the demo script is not efficient, and we will improve its performance soon. You can try the SDK to get much better speed.

moonsh commented 1 year ago

Thank you, @Tau-J. I tried the SDK with the Python API, following the tutorial. Webcam frames are fed to the detector, but the script below runs at just 2 FPS.

Is this the correct approach to visualizing a webcam stream in real time?


```python
from mmdeploy_python import Detector
import cv2
import time

# create an object to capture video from the webcam
cap = cv2.VideoCapture(0)
detector = Detector(model_path='../mmdeploy_model/faster-rcnn', device_name='cpu')
frame_count = 0
start_time = time.time()

# check if the webcam is opened correctly
if not cap.isOpened():
    raise IOError("Cannot open webcam")

# loop over frames from the video stream
while True:
    # read a frame from the video stream
    ret, frame = cap.read()
    if not ret:
        break
    frame_count += 1

    # run the detector and draw every detection above the score threshold
    bboxes, labels, _ = detector(frame)
    for bbox, label_id in zip(bboxes, labels):
        [left, top, right, bottom], score = bbox[0:4].astype(int), bbox[4]
        if score < 0.3:
            continue
        cv2.rectangle(frame, (left, top), (right, bottom), (0, 255, 0))

    # overlay the frame counter and the cumulative FPS
    cv2.putText(frame, f"Frame: {frame_count}", (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
    elapsed_time = time.time() - start_time
    fps = frame_count / elapsed_time
    cv2.putText(frame, f"FPS: {fps:.2f}", (10, 70), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)

    cv2.imshow('Webcam Stream', frame)

    # exit if the user presses the 'q' key
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# release the resources when finished
cap.release()
cv2.destroyAllWindows()
```
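As a side note on the script above: its FPS counter averages over the entire run, so startup cost drags the number down for a long time. A rolling window over the last N frames gives a more responsive reading; a minimal sketch (pure Python, independent of the detector):

```python
import time
from collections import deque

class RollingFPS:
    """FPS measured over a sliding window of the most recent frame timestamps."""
    def __init__(self, window=30):
        self.timestamps = deque(maxlen=window)

    def tick(self):
        """Record one frame; return FPS over the current window."""
        self.timestamps.append(time.perf_counter())
        if len(self.timestamps) < 2:
            return 0.0
        span = self.timestamps[-1] - self.timestamps[0]
        return (len(self.timestamps) - 1) / span

# usage: call meter.tick() once per loop iteration instead of
# dividing frame_count by the total elapsed time
meter = RollingFPS(window=30)
for _ in range(10):
    time.sleep(0.005)   # stand-in for per-frame inference work
    fps = meter.tick()
```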

Tau-J commented 1 year ago

Running Faster R-CNN on CPU is slow; I think you can try RTMDet instead. Or you could try running it on your GPU.

moonsh commented 1 year ago

Thank you! It works very well.