IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

Realtime Object Recognition on RGB-D images !! #3458

Closed Rahul-Venugopal closed 5 years ago

Rahul-Venugopal commented 5 years ago
Required Info
Camera Model: { ZR300 / D400 }
Firmware Version: (Open RealSense Viewer --> Click info)
Operating System & Version: Linux (Ubuntu 14/16/17)
Kernel Version (Linux Only): e.g. 4.14.13
Platform: PC
SDK Version: { legacy / 2.<?>.<?> }
Language: python / openCV
Segment: PC

Issue Description

I am trying to implement YOLOv3 (for real-time recognition using a RealSense ZR300) and add the depth data to the output.

As of now, I am able to detect objects in RGB images and run the detector on each frame using the following code.

test.txt

Recently I have seen the notebook from this repository.

In that repository the depth is calculated from a recorded RGB-D image, but I would like to run this in real time.

Is it possible to simultaneously detect objects in the RGB image and calculate the distance to the centre point of the bounding box using the depth image from an Intel RealSense ZR300?

If possible, any help would be much appreciated.

Thanks Rahul
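For reference, the real-time pipeline being asked about can be sketched with the SDK 2.x `pyrealsense2` wrapper: align depth onto the colour stream, run the detector on the colour image, then read the distance at each box centre. This is a hedged sketch, not a definitive implementation: `bbox_centre`, `run_detector`, and `run_live_demo` are illustrative names of my own, the detector step is a stub, and ZR300 support should be verified for the SDK version in use.

```python
# Sketch: real-time detection + depth lookup with pyrealsense2 (SDK 2.x).
# run_detector() is a hypothetical stand-in for the YOLOv3 inference step.

def bbox_centre(x, y, w, h):
    """Centre pixel of an (x, y, w, h) bounding box."""
    return x + w // 2, y + h // 2

def run_detector(color_image):
    """Placeholder for YOLOv3; should return boxes as [(x, y, w, h), ...]."""
    return []

def run_live_demo():
    """Needs a camera attached; streams frames and prints centre distances."""
    import numpy as np
    import pyrealsense2 as rs

    pipeline = rs.pipeline()
    config = rs.config()
    config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
    config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
    pipeline.start(config)
    align = rs.align(rs.stream.color)  # map depth pixels onto the colour image
    try:
        while True:
            frames = pipeline.wait_for_frames()
            aligned = align.process(frames)
            depth_frame = aligned.get_depth_frame()
            color_frame = aligned.get_color_frame()
            if not depth_frame or not color_frame:
                continue
            color_image = np.asanyarray(color_frame.get_data())
            for box in run_detector(color_image):
                cx, cy = bbox_centre(*box)
                # get_distance() returns metres at that pixel (0 if invalid)
                print("object centre at %.2f m" % depth_frame.get_distance(cx, cy))
    finally:
        pipeline.stop()
```

Calling `run_live_demo()` with a camera attached would print one distance per detected box per frame.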

pedrombmachado commented 5 years ago

Hi @Rahul-Venugopal, these scripts should help you... you will need to install OpenCV 4.

Rahul-Venugopal commented 5 years ago

Hi @pedrombmachado ,

Thanks for such a quick reply. I will try this out.

I guess those scripts are for the D435... will it be the same for the ZR300 as well?

pedrombmachado commented 5 years ago

The scripts work with regular cameras, the D435 and the D435i. It will be the same for the ZR300.

Rahul-Venugopal commented 5 years ago

Actually, I am looking for something similar to the following functions

from the following script

darknet_zed.txt

def main(argv):

    thresh = 0.25
    darknet_path="../libdarknet/"
    configPath = darknet_path + "cfg/yolov3-tiny.cfg"
    weightPath = "yolov3-tiny.weights"
    metaPath = "coco.data"
    svoPath = None

    help_str = 'darknet_zed.py -c <config> -w <weight> -m <meta> -t <threshold> -s <svo_file>'
    try:
        opts, args = getopt.getopt(
            argv, "hc:w:m:t:s:", ["config=", "weight=", "meta=", "threshold=", "svo_file="])
    except getopt.GetoptError:
        print (help_str)
        sys.exit(2)
    for opt, arg in opts:
        if opt == '-h':
            print (help_str)
            sys.exit()
        elif opt in ("-c", "--config"):
            configPath = arg
        elif opt in ("-w", "--weight"):
            weightPath = arg
        elif opt in ("-m", "--meta"):
            metaPath = arg
        elif opt in ("-t", "--threshold"):
            thresh = float(arg)
        elif opt in ("-s", "--svo_file"):
            svoPath = arg

    init = sl.InitParameters()
    init.coordinate_units = sl.UNIT.UNIT_METER
    if svoPath is not None:
        init.svo_input_filename = svoPath

    cam = sl.Camera()
    if not cam.is_opened():
        print("Opening ZED Camera...")
    status = cam.open(init)
    if status != sl.ERROR_CODE.SUCCESS:
        print(repr(status))
        exit()

    runtime = sl.RuntimeParameters()
    # Use STANDARD sensing mode
    runtime.sensing_mode = sl.SENSING_MODE.SENSING_MODE_STANDARD
    mat = sl.Mat()
    point_cloud_mat = sl.Mat()

def getObjectDepth(depth, bounds):
    area_div = 2

    x_vect = []
    y_vect = []
    z_vect = []

    for j in range(int(bounds[0] - area_div), int(bounds[0] + area_div)):
        for i in range(int(bounds[1] - area_div), int(bounds[1] + area_div)):
            z = depth[i, j, 2]
            if not np.isnan(z) and not np.isinf(z):
                x_vect.append(depth[i, j, 0])
                y_vect.append(depth[i, j, 1])
                z_vect.append(z)
    try:
        x = statistics.median(x_vect)
        y = statistics.median(y_vect)
        z = statistics.median(z_vect)
    except Exception:
        # No valid depth samples in the window
        x = -1
        y = -1
        z = -1

    return x, y, z

I am not able to find anything similar in the scripts you shared, @pedrombmachado. Please correct me if I am wrong.
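For reference, the median-of-a-window logic in getObjectDepth above is straightforward to reproduce on a RealSense depth image. A hedged sketch: `object_depth` and its arguments are my own names, and the depth image is assumed to be a NumPy array already converted to metres (raw frame data times the sensor's depth scale), with 0 marking invalid pixels.

```python
import statistics
import numpy as np

def object_depth(depth_m, cx, cy, half=2):
    """Median depth (metres) in a small window around pixel (cx, cy).

    depth_m: 2-D numpy array of depths in metres (0 marks invalid pixels).
    Returns -1 when the window holds no valid sample, like the ZED script.
    """
    rows, cols = depth_m.shape
    samples = []
    for j in range(max(cx - half, 0), min(cx + half, cols)):
        for i in range(max(cy - half, 0), min(cy + half, rows)):
            z = depth_m[i, j]
            if z > 0 and np.isfinite(z):
                samples.append(float(z))
    return statistics.median(samples) if samples else -1
```

With pyrealsense2, the array would come from `np.asanyarray(depth_frame.get_data())` scaled by the depth sensor's `get_depth_scale()`, and `rs.rs2_deproject_pixel_to_point()` can turn (cx, cy, z) into a full 3-D point if the x/y coordinates are needed as well.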

pedrombmachado commented 5 years ago

I got mine from here.

if not cam.is_opened():
    print("Opening ZED Camera...")
status = cam.open(init)

This is for the ZED camera and it is using a naive capture function. My function uses a multi-threaded approach on regular cameras. That source code is meant to capture the images from the camera, as opposed to taking advantage of the pyrealsense library. Also, be careful with this part of the source code:

x = statistics.median(x_vect)
y = statistics.median(y_vect)
z = statistics.median(z_vect)

The best option is to use the built-in functions:

                # Wait for a coherent pair of frames: depth and color
                frames = pipeline.wait_for_frames()

                # Align the depth frame to the color frame
                aligned_frames = align.process(frames)

                # Get aligned frames
                aligned_depth_frame = aligned_frames.get_depth_frame()
                color_aligned_frame = aligned_frames.get_color_frame()

                # Distance in metres for every pixel of the w x h box at (x, y)
                coverage = np.zeros((w, h))
                for xx in range(x, x + w):
                    for yy in range(y, y + h):
                        coverage[xx - x, yy - y] = aligned_depth_frame.get_distance(xx, yy)
                print(coverage)
                depth_matrix = coverage.mean()

The intentions are good, but the median may be 0 depending on the clip_offset.
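A related caveat applies to the mean above: `get_distance()` returns 0 for invalid pixels, so averaging the raw coverage array is biased towards 0. One way to guard against that is masking out the zeros first. A sketch, with `roi_mean_depth` as my own name and the ROI assumed to lie inside the frame:

```python
import numpy as np

def roi_mean_depth(depth_m, x, y, w, h):
    """Mean of the valid (non-zero) depths inside a w x h ROI at (x, y).

    depth_m is a 2-D array of metres; returns -1.0 if the ROI has no
    valid pixel, rather than letting zeros drag the mean towards 0.
    """
    roi = depth_m[y:y + h, x:x + w]
    valid = roi[roi > 0]
    return float(valid.mean()) if valid.size else -1.0
```

Building `depth_m` once per frame as `np.asanyarray(aligned_depth_frame.get_data())` times the depth scale is also much faster than calling `get_distance()` per pixel in a Python loop.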