IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

Determining X,Y,Z-coordinates from color image #6291

Closed BarRae closed 4 years ago

BarRae commented 4 years ago

Hello all,

I am currently working on a vision system to detect blue marbles. I am able to detect one marble per picture, and I now want to know the X, Y, Z coordinates of that marble. Here is the image I use for detection: (image)

Here is the output with the detected marble: (image)

And here is my console output: (image)

In my code I get the X,Y of the centre of the marble in the image and the distance to the marble.

What I need now is the X, Y, Z coordinate of the centre of the marble. I tried to get this using rs2_deproject_pixel_to_point(berrycord, &depth_intrin, dpt_tgt_pixel, distance);, but with this it seems like my Z is equal to the depth to the object, and that is not what I need. See the illustration below for the X, Y and Z values I would like to get: (image)

Would anyone be able to help me with this? Last week I thought I had it, but now I'm seriously doubting the correctness of my Z value.

This is the code I currently use for this:

#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>
#include <librealsense2/rsutil.h>
#include <librealsense2/rs.hpp>

using namespace std;
using namespace cv;

Mat image;
Mat imgHSV;
Mat OutputImage;

int iLowH = 99;
int iHighH = 120;
int iLowS = 167;
int iHighS = 255;
int iLowV = 0;
int iHighV = 255;

int acc = 1;
int rows = 10;
int para1 = 100;
int para2 = 7;
int minRad = 3;
int maxRad = 20;

int x;
int y;

int width = 640;
int height = 480;

int input;

float berrycord[3];

bool circcheck;

static void HSVthreshold(int, int, int, int, int, int, void*)
{
    inRange(imgHSV, Scalar(iLowH, iLowS, iLowV), Scalar(iHighH, iHighS, iHighV), OutputImage);
}

static void Circle_detector(int, int, int, int, int, void*)
{
    vector<Vec3f> circles;
    HoughCircles(OutputImage, circles, HOUGH_GRADIENT, 1,
        OutputImage.rows / rows,       // change to detect circles that are closer to each other
        para1, para2, minRad, maxRad); // change the last two parameters to detect larger or smaller circles

    if (circles.empty())
    {
        circcheck = false;
    }
    else
    {
        circcheck = true;
        Vec3i c = circles.at(0);
        Point center = Point(c[0], c[1]);
        // circle center
        circle(imgHSV, center, 1, Scalar(0, 255, 0), 3, LINE_AA);
        // circle outline
        int radius = c[2];
        circle(imgHSV, center, radius, Scalar(255, 0, 0), 3, LINE_AA);

        cout << "The center of the detection is located at pixel: " << center << endl;

        x = c[0];
        y = c[1];
    }

}

int main()
{
    // Constructing pipelines and other objects to receive data from the RealSense camera.

    //Construct a pipeline which abstracts the device
    rs2::pipeline pipe;     //for color

    //Create a configuration for configuring the pipeline with a non default profile
    rs2::config cfg;

    //Add desired streams to configuration
    cfg.enable_stream(RS2_STREAM_COLOR, width, height, RS2_FORMAT_BGR8, 30);
    cfg.enable_stream(RS2_STREAM_DEPTH, width, height, RS2_FORMAT_Z16, 30);

    //Instruct pipeline to start streaming with the requested configuration
    rs2::pipeline_profile selection = pipe.start(cfg);    //for color

    // Camera warmup - drop the first several frames to let auto-exposure stabilize
    rs2::frameset frames;

    for (int i = 0; i < 30; i++)
    {
        //Wait for all configured streams to produce a frame
        frames = pipe.wait_for_frames();
    }

    while (waitKey(1) < 0)
    {

        frames = pipe.wait_for_frames();
        auto depth_frame = frames.get_depth_frame();
        auto color_frame = frames.get_color_frame();
        // Make sure that both depth and color frames are present for the calculations
        if (!depth_frame || !color_frame)
            continue;


        // Creating OpenCV Matrix from a color image
        Mat color(Size(width, height), CV_8UC3, (void*)color_frame.get_data(), Mat::AUTO_STEP);

        // checking that an image was read
        if (color.empty())
        {
            cerr << "image was not generated !" << endl;
            return 1;
        }

        namedWindow("Display Image", WINDOW_AUTOSIZE);
        imshow("Display Image", color);

        //convert BGR to HSV
        cvtColor(color, imgHSV, COLOR_BGR2HSV);

        //Create windows
        namedWindow("image", WINDOW_AUTOSIZE);  //window for the HSV image
        namedWindow("Output", WINDOW_AUTOSIZE); //window for the output mask

        //show HSV image
        imshow("image", imgHSV);

        //applying color filter to the HSV image
        HSVthreshold(iLowH, iHighH, iLowS, iHighS, iLowV, iHighV, 0);

        //Optional filter --> does not work properly at the moment; needs more tuning <--

        //morphological opening (remove small objects from the foreground)
        erode(OutputImage, OutputImage, getStructuringElement(MORPH_ELLIPSE, Size(2, 2)));
        dilate(OutputImage, OutputImage, getStructuringElement(MORPH_ELLIPSE, Size(2, 2)));

        //morphological closing (fill small holes in the foreground)
        dilate(OutputImage, OutputImage, getStructuringElement(MORPH_ELLIPSE, Size(2, 2)));
        erode(OutputImage, OutputImage, getStructuringElement(MORPH_ELLIPSE, Size(2, 2)));

        imshow("Output", OutputImage);

        Circle_detector(rows, para1, para2, minRad, maxRad, 0);
        imshow("image", imgHSV);

        if (circcheck == true)
        {

            auto depth_profile = depth_frame.get_profile().as<rs2::video_stream_profile>();
            auto color_profile = color_frame.get_profile().as<rs2::video_stream_profile>();

            auto depth_intrin = depth_profile.get_intrinsics();
            auto color_intrin = color_profile.get_intrinsics();
            auto depth2color_extrin = depth_profile.get_extrinsics_to(color_profile);
            auto color2depth_extrin = color_profile.get_extrinsics_to(depth_profile);

            float rgb_src_pixel[2] = { x,y }; // The RGB coordinate for the center of the marble
            float dpt_tgt_pixel[2] = { 0 }; // The depth pixel that has the best match for that RGB coordinate

            auto sensor = selection.get_device().first<rs2::depth_sensor>();
            auto scale = sensor.get_depth_scale();

            // Search along the projected beam from 0.1 to 2 meters. This range can be narrowed for the concrete scenario, e.g. if you know the data is bounded within a [min..max] range
            rs2_project_color_pixel_to_depth_pixel(dpt_tgt_pixel, reinterpret_cast<const uint16_t*>(depth_frame.get_data()), scale, 0.1f, 2.f,
                &depth_intrin, &color_intrin,
                &color2depth_extrin, &depth2color_extrin, rgb_src_pixel);

            // Verify that the depth correspondence is valid, i.e. within the frame boundaries
            if (dpt_tgt_pixel[0] < 0 || dpt_tgt_pixel[0] >= width ||
                dpt_tgt_pixel[1] < 0 || dpt_tgt_pixel[1] >= height)
                continue;

            // Get the depth value for the pixel found
            auto distance = depth_frame.get_distance(dpt_tgt_pixel[0], dpt_tgt_pixel[1]);

            cout << "The distance to the object is: " << distance << endl;

            rs2_deproject_pixel_to_point(berrycord, &depth_intrin, dpt_tgt_pixel, distance);
            cout << "X = " << berrycord[0] << ", Y = " << berrycord[1] << ", Z = " << berrycord[2] << endl << endl;
        }

        else
        {
            berrycord[0] = 0;
            berrycord[1] = 0;
            berrycord[2] = 0;
            cout << "No object detected, setting all X, Y and Z values to zero: " << endl;
            cout << "X = " << berrycord[0] << ", Y = " << berrycord[1] << ", Z = " << berrycord[2] << endl << endl;
        }
        // reset the coordinates for the next frame
        berrycord[0] = 0;
        berrycord[1] = 0;
        berrycord[2] = 0;
        waitKey(); // blocks until a key is pressed before processing the next frame
    }
    return 0;
}
dorodnic commented 4 years ago

I'm not sure I fully understood: "but using this it seems like my Z is equal to the depth to the object and that is not what I need". Looking at the code, it seems correct. If it is computationally feasible, I would recommend using rs2::align to bring the two frames into a single viewport. This should simplify the problem significantly (we have two examples on alignment: align and align-advanced).

BarRae commented 4 years ago

What I meant is this: I am going to send the X, Y, Z coordinates to a robotic arm so it can pick up the marble. For this I need X, Y, Z as in the sketch I drew.

I tried to get this using rs2_deproject_pixel_to_point(berrycord, &depth_intrin, dpt_tgt_pixel, distance);, but this gives me a Z which is equal to the depth value.

Or am I misunderstanding what get_distance() gives me, is that value already the Z value?

dorodnic commented 4 years ago

Hi @BarRae get_distance provides distance from the camera plane, not the camera origin, so I do believe it is the Z value you are looking for. That said, I think the application can be simplified using the following pseudocode:

pipe.start();
rs2::align align_to(RS2_STREAM_COLOR);
while (auto fs = pipe.wait_for_frames())
{
    fs = align_to.process(fs);
    auto color_frame = fs.get_color_frame();
    auto depth_frame = fs.get_depth_frame();
    auto dpt_tgt_pixel = find_center_using_opencv(color_frame);
    auto distance = depth_frame.get_distance(dpt_tgt_pixel);
    rs2_deproject_pixel_to_point(berrycord, &depth_intrin, dpt_tgt_pixel, distance);
}

berrycord[2] will hold distance from camera plane. berrycord[0] and berrycord[1] will hold X and Y in relation to the color sensor. If you initialize align with RS2_STREAM_DEPTH they will hold X and Y offsets in relation to the depth sensor (left infrared camera).
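For completeness, the pseudocode could be expanded into a compilable sketch along these lines (it assumes a connected camera; a fixed pixel stands in for the OpenCV detection, and note that after aligning to color you should deproject with the aligned depth frame's own intrinsics, which match the color stream):

```cpp
#include <librealsense2/rs.hpp>
#include <librealsense2/rsutil.h>

int main()
{
    rs2::pipeline pipe;
    pipe.start();
    rs2::align align_to(RS2_STREAM_COLOR); // align depth into the color viewport

    while (true)
    {
        auto fs = pipe.wait_for_frames();
        fs = align_to.process(fs);
        auto color_frame = fs.get_color_frame();
        auto depth_frame = fs.get_depth_frame();
        if (!color_frame || !depth_frame)
            continue;

        // After align, the depth frame carries the color stream's intrinsics,
        // so take them from the aligned frame's own profile.
        auto intrin = depth_frame.get_profile()
                          .as<rs2::video_stream_profile>()
                          .get_intrinsics();

        // Placeholder for find_center_using_opencv(color_frame)
        float pixel[2] = { 320.f, 240.f };
        float point[3];
        auto distance = depth_frame.get_distance((int)pixel[0], (int)pixel[1]);
        rs2_deproject_pixel_to_point(point, &intrin, pixel, distance);
        // point[0..2] now hold X, Y, Z in meters relative to the color sensor
    }
    return 0;
}
```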

BarRae commented 4 years ago

Thank you @dorodnic

This was the information I was looking for. I will probably do some testing with the camera today to check if the coordinates that the program gives me are correct enough. I will also consider your code advice.

Again, thank you.