openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
https://docs.openvino.ai
Apache License 2.0

Second openvino call always laggy on Myriad (NCS stick 2) #2068

Closed ballchuuu closed 4 years ago

ballchuuu commented 4 years ago

Hi,

I have the code snippet below. The first time I run it from the command line (i.e. python3 file.py), inference takes about 30s, but the second run takes about 400s. Why is that? Do I have to clear the memory on my NCS stick first?

Also, is it normal for inference to take longer with the NCS stick than on my RPI 3B+ alone (without the NCS stick)?

from __future__ import division
import cv2
import time
import numpy as np
import math

binFile = "model/f.bin"
xmlFile = "model/f.xml"

print("file1")

nPoints = 22
POSE_PAIRS = [ [0,1],[1,2],[2,3],[3,4],[0,5],[5,6],[6,7],[7,8],[0,9],[9,10],[10,11],[11,12],[0,13],[13,14],[14,15],[15,16],[0,17],[17,18],[18,19],[19,20] ]

print("just before cv2 read Net")
net = cv2.dnn.readNet(xmlFile, binFile)

net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD)

print("set all targets")

frame = cv2.imread("wew.jpg")
print("read image")
frameCopy = np.copy(frame)
frameWidth = frame.shape[1]
frameHeight = frame.shape[0]
aspect_ratio = frameWidth/frameHeight

threshold = 0.1

t = time.time()
# input image dimensions for the network
inHeight = 368
inWidth = int(((aspect_ratio*inHeight)*8)//8)
inpBlob = cv2.dnn.blobFromImage(frame, 1.0 / 255, (inWidth, inHeight), (0, 0, 0), swapRB=False, crop=False)

net.setInput(inpBlob)

output = net.forward()
print("time taken by network : {:.3f}".format(time.time() - t))

# Empty list to store the indices of the detected keypoints.
# It gets its own name so it does not overwrite the start time stored in `t`,
# which is still needed for the "Total time taken" print at the end.
detected = []

for i in range(nPoints):
    # confidence map of corresponding body's part.
    probMap = output[0, i, :, :]
    probMap = cv2.resize(probMap, (frameWidth, frameHeight))

    # Find global maxima of the probMap.
    minVal, prob, minLoc, point = cv2.minMaxLoc(probMap)

    if prob > threshold :
        cv2.circle(frameCopy, (int(point[0]), int(point[1])), 8, (0, 255, 255), thickness=-1, lineType=cv2.FILLED)
        cv2.putText(frameCopy, "{}".format(i), (int(point[0]), int(point[1])), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2, lineType=cv2.LINE_AA)
        detected.append(i)

print(detected)

cv2.imwrite('Output-Keypoints.jpg', frameCopy)
cv2.imwrite('Output-Skeleton.jpg', frame)

print("Total time taken : {:.3f}".format(time.time() - t))
cv2.destroyAllWindows()

Thank you!!!
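
The timer in the script above starts before `cv2.dnn.blobFromImage` and stops after `net.forward()`, so the printed figure mixes preprocessing with inference. A minimal sketch of timing each stage on its own follows. `timed` is a hypothetical helper, not part of OpenCV, and the commented usage assumes the `net`, `frame`, `inWidth`, and `inHeight` names from the script. That the first `forward()` call may include one-time device setup is an assumption, not something confirmed in this thread, which is why the usage times a second call as well.

```python
import time

def timed(label, fn, *args, **kwargs):
    """Run fn once, print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print("{}: {:.3f} s".format(label, elapsed))
    return result, elapsed

# Stand-in workload so the sketch runs without OpenCV or an NCS 2 attached.
def busy_sum(n):
    return sum(range(n))

total, seconds = timed("stand-in workload", busy_sum, 100000)

# Hypothetical usage with the names from the script above:
# inpBlob, _ = timed("blobFromImage", cv2.dnn.blobFromImage, frame, 1.0 / 255,
#                    (inWidth, inHeight), (0, 0, 0), swapRB=False, crop=False)
# net.setInput(inpBlob)
# _, first = timed("first forward", net.forward)
# net.setInput(inpBlob)
# _, second = timed("second forward", net.forward)
```

Comparing the first and second `forward()` timings within one process separates any one-time cost from steady-state inference.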

dkurt commented 4 years ago

Please follow issue creation guidelines and provide all necessary details: OpenVINO/OpenCV version, HW spec (is that observed only on RPI3 ?). Please report bugs properly.

ballchuuu commented 4 years ago

> Please follow issue creation guidelines and provide all necessary details: OpenVINO/OpenCV version, HW spec (is that observed only on RPI3 ?). Please report bugs properly.

OpenCV version: "4.4.0-openvino"; installed OpenVINO version: 2020.4.287

HW Spec: Rpi 3B+, Raspbian Stretch

Sorry and thanks!

dkurt commented 4 years ago

400s

ballchuuu commented 4 years ago
> Please localize time estimation - measure only inference part.

> 400s

> 400 seconds?

Yes the inference took 400 seconds with the IR Model and NCS stick 2.

> Please provide output for lsusb 3 times in the following order:
> lsusb
> python3 file.py
> lsusb
> python3 file.py
> lsusb

lsusb
Bus 001 Device 044: ID 03e7:f63b Intel
Bus 001 Device 005: ID 0424:7800 Standard Microsystems Corp.
Bus 001 Device 003: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
Bus 001 Device 002: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

python3 file.py
time taken by network: 427.131 (seconds)

lsusb
Bus 001 Device 055: ID 03e7:2485 Intel Movidius MyriadX
Bus 001 Device 005: ID 0424:7800 Standard Microsystems Corp.
Bus 001 Device 003: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
Bus 001 Device 002: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

python3 file.py
time taken by network: 430.017(seconds)

lsusb
Bus 001 Device 057: ID 03e7:2485 Intel Movidius MyriadX
Bus 001 Device 005: ID 0424:7800 Standard Microsystems Corp.
Bus 001 Device 003: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
Bus 001 Device 002: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
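
To compare the three `lsusb` snapshots above, it can help to pull out only the Intel device (vendor ID `03e7`) from each one. The sketch below uses a hypothetical helper, `find_devices`, which is not an OpenVINO tool and only parses the text format shown in this thread. It shows the product ID reading `f63b` before the first run and `2485` after each run, with the device number changing as well. What that change means for the slowdown is not established in this thread.

```python
import re

# One lsusb line looks like: "Bus 001 Device 044: ID 03e7:f63b Intel"
LSUSB_LINE = re.compile(
    r"Bus (\d+) Device (\d+): ID ([0-9a-f]{4}):([0-9a-f]{4})\s*(.*)"
)

def find_devices(lsusb_output, vendor_id="03e7"):
    """Return (device_number, product_id, description) for one vendor."""
    found = []
    for line in lsusb_output.splitlines():
        match = LSUSB_LINE.match(line.strip())
        if match and match.group(3) == vendor_id:
            found.append((int(match.group(2)), match.group(4), match.group(5)))
    return found

# Snapshots copied from the output above (other devices omitted).
snapshots = [
    "Bus 001 Device 044: ID 03e7:f63b Intel",
    "Bus 001 Device 055: ID 03e7:2485 Intel Movidius MyriadX",
    "Bus 001 Device 057: ID 03e7:2485 Intel Movidius MyriadX",
]

product_ids = [find_devices(text)[0][1] for text in snapshots]
print(product_ids)
```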

dkurt commented 4 years ago

Which input resolution is used?

I'd recommend taking a look at https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/human-pose-estimation-0001. 400 seconds is something you're probably not going to work with :)

ballchuuu commented 4 years ago

> Which input resolution is used?

The resolution is the frame size, rounded up from 720x480 to 736x480 (width x height).
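
The rounding mentioned above is consistent with padding each dimension up to the next multiple of 32: 720 becomes 736, while 480 is already a multiple of 32. The multiple of 32 is an assumption inferred from those two numbers, not something stated in this thread, and `round_up` is a hypothetical helper:

```python
def round_up(value, multiple=32):
    """Round value up to the nearest multiple (assumed to be 32 here)."""
    return ((value + multiple - 1) // multiple) * multiple

width, height = round_up(720), round_up(480)
print(width, height)  # 736 480
```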

> I'd recommend taking a look at https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/human-pose-estimation-0001. 400 seconds is something you're probably not going to work with :)

I need hand keypoint pose estimation (i.e. including fingers and palm), but the pretrained model only covers the body! Is there, by any chance, one for hand keypoint detection?

I also find it odd, since I get about 30 seconds when I run the model using the native API (i.e. IECore).

Thank you so much for your help!
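
For a like-for-like comparison with the `cv2.dnn` script, the native path mentioned above can be timed in two parts: loading the network onto the device, and running inference. The sketch below assumes the OpenVINO 2020.x Python API (`IECore.read_network`, `IECore.load_network`, `ExecutableNetwork.infer`, and the `input_info` property). `time_native_inference` is a hypothetical helper, and the blob is assumed to already match the network's input shape. The `IECore` instance is passed in as an argument so the function can be exercised without the hardware.

```python
import time

def time_native_inference(ie, xml_path, bin_path, blob, device="MYRIAD"):
    """Return (outputs, load_seconds, infer_seconds) for one inference.

    `ie` is expected to be an openvino.inference_engine.IECore instance.
    """
    net = ie.read_network(model=xml_path, weights=bin_path)
    input_name = next(iter(net.input_info))  # first (assumed only) input

    # Time loading the network onto the target device.
    start = time.perf_counter()
    exec_net = ie.load_network(network=net, device_name=device)
    load_seconds = time.perf_counter() - start

    # Time a single synchronous inference.
    start = time.perf_counter()
    outputs = exec_net.infer(inputs={input_name: blob})
    infer_seconds = time.perf_counter() - start
    return outputs, load_seconds, infer_seconds

# Hypothetical usage with the file names from the original script:
# from openvino.inference_engine import IECore
# outputs, load_s, infer_s = time_native_inference(
#     IECore(), "model/f.xml", "model/f.bin", inpBlob)
# print("load: {:.3f} s, infer: {:.3f} s".format(load_s, infer_s))
```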

maxlytkin commented 4 years ago

@ballchuuu The closest match to your request would be the ASL Recognition model, which I think is pre-trained on hand gestures. Do you see the same slow second call when you run your model and input with Benchmark_app? And do you see it with other models or apps, such as the ASL model mentioned above and the ASL Recognition Demo?

maxlytkin commented 4 years ago

I'm closing this case. Please feel free to re-open if additional assistance is needed.