Bounding Box Incorrect Scaling/Positioning after 97e92fa

Luxonis-Brandon commented 5 years ago

So first off, thanks again for the tuned performance. I verified 12 FPS peak on the Raspberry Pi with the RealSense D435.

The thing I noticed (see videos below) is that the scaling/positioning math behind the bounding boxes after 97e92fa seems to be off.

So prior to this commit, they follow me left/right in the live video feed correctly (my center, and the center of the bounding box match well), and the scale matches well too (i.e. my width/height, and the bounding box width/height match well).

After the commit, I notice the following:

The positioning is off: when the object is on the left side, the bounding box is too far to the left. When the object is on the right, the center of the bounding box is too far to the right, etc.
The left/right size is also off. The box is too large, at least in the left/right dimension (and probably also up/down, it seems, but that is less tested).

Here's a before/correct video (from checkout 80a564cfb02afec5ee3a4ddfa9493c2e3e7d2cff): https://photos.app.goo.gl/H5ta1X7KnqTdsz7j7

Notice that the bounding box tracks my center well and also my size well.

And here's the after/incorrect video (from checkout 055636afe67f3fb66e4aff46a16899719571038e): https://photos.app.goo.gl/6zfMgGKaYjbrZRQZ9

Notice in this case the box is too wide (and also seemingly too tall), and goes too far left/right when the object (me) is located left/right of center in the video feed.

Thanks again! I'll also try to hunt to see if I can find where this scale/position error was introduced.

Luxonis-Brandon commented 5 years ago

So my first guess (which has a high probability of being wrong because I don't know the codebase yet) is that the image preprocessing changed, and that it now applies this logic to the image when the camera is a RealSense:

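def image_preprocessing(self, color_image):

    if self.camera_mode == 0:
        # Resize to 532x400 (width, height), then cut out a 300x300 window:
        # rows 100..399 and columns 116..415 of the resized image.
        prepimg = cv2.resize(color_image,(532,400))
        prepimg = prepimg[100:100+300,116:116+300]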

Instead of (if I'm reading it right), the following logic:

def preprocess_image(src):

    try:
        img = cv2.resize(src, (300, 300))
        img = img - 127.5
        img = img * 0.007843
        img = img[np.newaxis, :, :, :]     # Batch size axis add
        img = img.transpose((0, 3, 1, 2))  # NHWC to NCHW
        return img

And that's because self.image_preprocessing(self.frameBuffer.get()) is called instead of the (removed in the commit) preprocess_image(color_image).

Again, just my guess so far...
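To make the guess concrete, here is a minimal sketch (not code from the repository) of why a crop in preprocessing would produce exactly these symptoms. It assumes a 640x480 source frame, and it assumes the detector returns box corners normalized to the 300x300 network input, which are then drawn by multiplying by the full frame size. The function names `naive_box_to_frame` and `crop_box_to_frame` are hypothetical.

```python
# Hypothetical helpers; the frame size and draw logic are assumptions.

def naive_box_to_frame(box, frame_w=640, frame_h=480):
    """Scale a normalized (xmin, ymin, xmax, ymax) box by the full frame size.

    This is only correct when the network saw the whole frame.
    """
    xmin, ymin, xmax, ymax = box
    return (xmin * frame_w, ymin * frame_h, xmax * frame_w, ymax * frame_h)


def crop_box_to_frame(box, frame_w=640, frame_h=480,
                      resized_w=532, resized_h=400,
                      crop_x=116, crop_y=100, crop_size=300):
    """Map a box normalized to the 300x300 crop back to frame pixels.

    Undoes the crop offset first, then undoes the resize to 532x400.
    """
    xmin, ymin, xmax, ymax = box
    sx = frame_w / resized_w  # resized-image pixels -> frame pixels (x)
    sy = frame_h / resized_h  # resized-image pixels -> frame pixels (y)
    return (
        (crop_x + xmin * crop_size) * sx,
        (crop_y + ymin * crop_size) * sy,
        (crop_x + xmax * crop_size) * sx,
        (crop_y + ymax * crop_size) * sy,
    )


# A detection that fills the left half of the crop.
box = (0.0, 0.0, 0.5, 1.0)
naive = naive_box_to_frame(box)
mapped = crop_box_to_frame(box)
naive_width = naive[2] - naive[0]
mapped_width = mapped[2] - mapped[0]
print("naive :", naive)
print("mapped:", mapped)
print("width ratio:", naive_width / mapped_width)
```

Under these assumptions the naively scaled box is about 532/300 times too wide, and anything left or right of center is pushed further outward, while an object at the exact horizontal center still lines up. That matches what the "after" video shows.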

Luxonis-Brandon commented 5 years ago

I tried replacing that logic and AFAICT (as far as I can tell) that was indeed the problem.

Below is a video with the following logic for image_preprocessing:

def image_preprocessing(self, color_image):
        prepimg = cv2.resize(color_image, (300, 300))
        prepimg = prepimg - 127.5
        prepimg = prepimg * 0.007843
        prepimg = prepimg[np.newaxis, :, :, :]     # Batch size axis add
        prepimg = prepimg.transpose((0, 3, 1, 2))  # NHWC to NCHW
        return prepimg

https://photos.app.goo.gl/SVqpCogZ7J234Kkp9
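For anyone checking the same change, the normalization and layout steps can be sanity-checked with NumPy alone. This is a rough sketch, not the repository's code: `normalize_to_nchw` is a hypothetical name, the input is a dummy array standing in for an already-resized 300x300 frame, and the cast to float32 is my own choice.

```python
import numpy as np


def normalize_to_nchw(img_hwc):
    """Scale uint8 pixels to roughly [-1, 1] and reorder HWC -> NCHW."""
    img = img_hwc.astype(np.float32)    # assumption: float32 input blob
    img = (img - 127.5) * 0.007843      # 0.007843 is approximately 1 / 127.5
    img = img[np.newaxis, :, :, :]      # add the batch axis: (1, H, W, C)
    return img.transpose((0, 3, 1, 2))  # NHWC -> NCHW: (1, C, H, W)


# Dummy frame: first channel all 255, the other two channels all 0.
dummy = np.zeros((300, 300, 3), dtype=np.uint8)
dummy[..., 0] = 255

blob = normalize_to_nchw(dummy)
print(blob.shape)
print(float(blob.min()), float(blob.max()))
```

The resulting blob has shape (1, 3, 300, 300), with values close to -1 for the zero pixels and close to +1 for the 255 pixels.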

I did notice on this that the lag/latency is much higher, at least compared to what I remember for the NCS1/NCSDK version, and I'm not sure if it's my change above that's causing that latency.

Best, Brandon

Luxonis-Brandon commented 5 years ago

Watching the videos above, it seems the latency is just as present before my change as after it. I'm going to upload a video of the NCS1 version for latency comparison in a new GitHub issue, and then @PINTO0309 let me know if you want me to close this issue (the scale/position error), since the cause seems to be just what I found.

Thanks! -Brandon

PINTO0309 commented 5 years ago

@Luxonis-Brandon

Thank you for always contributing! It seems that I introduced a bug, as you suggested. Can you try the following command?

$ python3 MultiStickSSDwithRealSense_OpenVINO_NCS2.py -mod 0 -numncs 1

However, this fix probably does not improve the latency problem. I also have both an NCS1 and an NCS2, so I will test it a little.

By the way, resizing the image from (640, 480) to (300, 300) is a very heavy load.

PINTO0309 / MobileNet-SSD-RealSense

Bounding Box Incorrect Scaling/Positioning after 97e92fa #13