IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

Can't merge depth frame and color frame into one frameset #6522

Closed. OldAccountFJShen closed this issue 4 years ago.

OldAccountFJShen commented 4 years ago
Required Info
Camera Model: D435
Firmware Version: 05.12.03.00
Operating System & Version: Linux (Ubuntu 16)
Kernel Version (Linux Only): Linux 4.15.0-101-generic x86_64
Platform: VMWare and Jetson TX2
SDK Version: 2.34.0-0~realsense0.2250
Language: C++
Segment: others

Issue Description

I was trying to generate a frameset consisting of a depth frame and a color frame (RGB), based on the code from the software-device example (https://github.com/IntelRealSense/librealsense/tree/master/examples/software-device).

The way I did it is as follows:

  1. I declared a software_device
  2. I added two software_sensors to the software_device
  3. I added a depth stream and a color stream separately to the two software_sensors
  4. I opened the streams and started a syncer
  5. I added the extrinsics to the depth stream
  6. I used cv::imread to read an image encoded in BGR8 and injected it into the color sensor
  7. I used cv::imread to read an image encoded in MONO16 and injected it into the depth sensor
  8. I called wait_for_frames on the syncer.

Now, the problem is that although I injected both a depth image and a color image into the device, the frameset I received from the syncer always had a size of one. It turned out that the frameset never included a color frame!

My source code is included at the end of this post for anyone interested.

What I want to do is align the two images. As a matter of fact, I previously collected depth and RGB images from the camera (they were not aligned) and stored them as JPG/PNG files in the filesystem, named by their Unix-time timestamps (e.g. 1591200967.895594238.jpg). Now I want to align the depth images' viewport to that of the color images, so I am trying to use the software_device to re-merge these depth-color image pairs into framesets, use the align processing block to align them, and then save the aligned images.

In my code, I set the timestamps of the frames to [idx]/60 and the frame numbers to [idx], where idx = 0..119. Given that the timestamps of these depth-color frame pairs are perfectly synced (identical, to be precise), the syncer should recognize this and group them into a single frameset. But I never see a color frame in my frameset! This is perplexing; can anyone explain it?

#include <iostream>
#include <opencv2/opencv.hpp>
#include <librealsense2/rs.hpp>
#include <librealsense2/hpp/rs_internal.hpp>

int W = 640;
int H = 480;

int main() {

    rs2::software_device dev;

    auto depth_sensor = dev.add_sensor("Depth");
    auto color_sensor = dev.add_sensor("Color");

    rs2_intrinsics depth_intrinsics{W, H, W / 2, H / 2, 383.859, 383.859,
                                    RS2_DISTORTION_BROWN_CONRADY,
                                    {0, 0, 0, 0, 0}};
    rs2_intrinsics color_intrinsics{W, H, W / 2, H / 2, 618.212, 618.463,
                                    RS2_DISTORTION_INVERSE_BROWN_CONRADY,
                                    {0, 0, 0, 0, 0}};

    auto depth_stream = depth_sensor.add_video_stream(
            {RS2_STREAM_DEPTH, 0, 0, W, H, 60, 2, RS2_FORMAT_Z16,
             depth_intrinsics});
    auto color_stream = color_sensor.add_video_stream(
            {RS2_STREAM_COLOR, 0, 1, W, H, 60, 3, RS2_FORMAT_BGR8,
             color_intrinsics}, true);

    dev.create_matcher(RS2_MATCHER_DEFAULT);
    rs2::syncer sync;

    depth_sensor.open(depth_stream);
    color_sensor.open(color_stream);

    depth_sensor.start(sync);
    color_sensor.start(sync);

    depth_stream.register_extrinsics_to(color_stream,
                                        {{1, 0, 0, 0, 1, 0, 0, 0, 1},
                                         {0, 0, 0}});

    rs2::frameset fs;

    cv::Mat color_image;
    cv::Mat depth_image;

    //inject the images to the sensors
    for (int idx = 0; idx < 120; ++idx) {
        //Purpose of this for loop: somebody said you start getting the
        // color frames once you have received an ample number of frames
        // (e.g., 30 frames), when the auto-exposure has converged.
        //But in my case, I have run wait_for_frames 120 times,
        // which at 60 FPS is equivalent to 2 seconds.
        //Yet I am still not receiving any color frames in the frameset.
        color_image = cv::imread("/home/nvidia/Desktop/images/color.jpg",
                                 cv::IMREAD_COLOR);
        depth_image = cv::imread("/home/nvidia/Desktop/images/depth.png",
                                 cv::IMREAD_UNCHANGED);

        color_sensor.on_video_frame({(void*) color_image.data, // Frame pixels from capture API
                                     [](void*) {}, // Custom deleter (if required)
                                     3 * 640, 3, // Stride and Bytes-per-pixel
                                     (double) (idx) / 60, RS2_TIMESTAMP_DOMAIN_COUNT,
                                     idx, // Timestamp, Frame# for potential sync services
                                     color_stream});
        depth_sensor.on_video_frame({(void*) depth_image.data, // Frame pixels from capture API
                                     [](void*) {}, // Custom deleter (if required)
                                     2 * 640, 2, // Stride and Bytes-per-pixel
                                     (double) (idx) / 60, RS2_TIMESTAMP_DOMAIN_COUNT,
                                     idx, // Timestamp, Frame# for potential sync services
                                     depth_stream});

        fs = sync.wait_for_frames();
    }

    rs2::frame depth_frame = fs.get_depth_frame();
    rs2::frame color_frame = fs.get_color_frame();

    return 0;
}
OldAccountFJShen commented 4 years ago

As a side note, I have looked up this issue on RealSense's support community, and one guy (MartyG) claimed that the reason why color frames are absent from the frameset is that the device has sacrificed them for the purpose of setting up auto-exposure (https://support.intelrealsense.com/hc/en-us/community/posts/360037090133).

I don't buy this explanation, because if you call rs2::options::supports(rs2_option) on either of the software_sensors, you see that RS2_OPTION_ENABLE_AUTO_EXPOSURE is not supported. In fact, if you run the following code on the two software_sensors, you get the output below:

std::cout << "color_sensor:\n";
for (int idx = 0; idx < 73; ++idx) {
    if (color_sensor.supports(rs2_option(idx)))
        std::cout << "Device option " << rs2_option(idx) << " is supported\n";
}

std::cout << "\n\ndepth_sensor:\n";
for (int idx = 0; idx < 73; ++idx) {
    if (depth_sensor.supports(rs2_option(idx)))
        std::cout << "Device option " << rs2_option(idx) << " is supported\n";
}

color_sensor:
Device option Frames Queue Size is supported

depth_sensor:
Device option Frames Queue Size is supported

No other options are supported!

MartyG-RealSense commented 4 years ago

Hi @FJShen. The link you provided refers to one of my previous answers, which recommended dropping the first several frames whilst auto-exposure is stabilising.

https://github.com/IntelRealSense/librealsense/issues/2269#issuecomment-414241301

Dropping the first several frames is not something that happens automatically; it has to be programmed in deliberately, and the user has the option not to do it. So it is a recommendation, not a requirement.

However, the auto-exposure will still take several frames to stabilise even if those frames are not dropped. You have the option to use manual exposure if you want full control over the exposure value from the first frame.
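For completeness, here is a minimal, untested sketch of what taking manual control of exposure could look like on a live camera; the exposure value is only an example:

#include <librealsense2/rs.hpp>

int main() {
    // Untested sketch: disable auto-exposure and set a fixed exposure on every
    // sensor that supports it. 8500 microseconds is an arbitrary example value.
    rs2::pipeline pipe;
    auto profile = pipe.start();
    for (auto&& s : profile.get_device().query_sensors()) {
        if (s.supports(RS2_OPTION_ENABLE_AUTO_EXPOSURE))
            s.set_option(RS2_OPTION_ENABLE_AUTO_EXPOSURE, 0.f); // turn AE off
        if (s.supports(RS2_OPTION_EXPOSURE))
            s.set_option(RS2_OPTION_EXPOSURE, 8500.f); // fixed exposure (microseconds)
    }
    return 0;
}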

kafan1986 commented 4 years ago

@FJShen I have not created a software device yet, so I can't say anything about that. But if you want to align images at a later stage and are using OpenCV, then you can skip that part altogether.

Step 1: Store the RGB and depth cameras' intrinsics and extrinsics.
Step 2: Read the saved images (RGB and depth) as OpenCV Mats and align them yourself.

I can post some code if you want.

ev-mp commented 4 years ago

@FJShen, you did almost everything by the book, but

(double) (idx) / 60, RS2_TIMESTAMP_DOMAIN_COUNT,

A) The timestamps are in msec, so instead of / 60 you should probably use *16 if you want to mimic 60 FPS.
B) You used the invalid enum value RS2_TIMESTAMP_DOMAIN_COUNT (check the docs) - use RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME or similar.

This should be enough for you to start polling RGB and Depth frames in the loop.

Additionally:
C) You sample the data only after the main loop of 120 iterations - what is the purpose of that? In a post-mortem playback scenario the syncer will provide data immediately.

RGB-depth alignment checks:
D) The intrinsic parameters look suspicious - are they real or made up?
E) To align RGB to depth you'll also have to define the depth units (the raw depth data is unit-less):

depth_sensor.add_read_only_option(RS2_OPTION_DEPTH_UNITS,  <0.001 ?>);
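Put together, the corrected injection might look roughly like this (a sketch only: 1/60 s is about 16.7 ms, rounded to 16 here, 0.001f assumes 1 mm depth units, and the depth sensor gets the same timestamp change):

// Sketch of the corrected call (same idea for depth_sensor.on_video_frame):
depth_sensor.add_read_only_option(RS2_OPTION_DEPTH_UNITS, 0.001f); // E) 1 mm units

color_sensor.on_video_frame({(void*) color_image.data,
                             [](void*) {},
                             3 * 640, 3,
                             double(idx * 16),                 // A) msec, ~60 FPS
                             RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME, // B) valid domain
                             idx,
                             color_stream});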
OldAccountFJShen commented 4 years ago

> @FJShen, you did almost everything by the book, but
>
> (double) (idx) / 60, RS2_TIMESTAMP_DOMAIN_COUNT,
>
> A) The timestamps are in msec, so instead of / 60 you should probably use *16 if you want to mimic 60 FPS.
> B) You used the invalid enum value RS2_TIMESTAMP_DOMAIN_COUNT (check the docs) - use RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME or similar.
>
> This should be enough for you to start polling RGB and Depth frames in the loop.
>
> Additionally:
> C) You sample the data only after the main loop of 120 iterations - what is the purpose of that? In a post-mortem playback scenario the syncer will provide data immediately.
>
> RGB-depth alignment checks:
> D) The intrinsic parameters look suspicious - are they real or made up?
> E) To align RGB to depth you'll also have to define the depth units (the raw depth data is unit-less):
>
> depth_sensor.add_read_only_option(RS2_OPTION_DEPTH_UNITS,  <0.001 ?>);

@ev-mp Thanks for your comments, I will try that out.
C) --> I think I was misguided by irrelevant advice about dropping the first few frames before the auto-exposure stabilizes, which is why I ran the loop 120 times before actually collecting the frameset (and no, the color frames still were not present in the frameset). I am going to change my code according to your advice and see how things go.
D) --> I was in a rush when declaring the intrinsics, so I did not put all the actual values (specifically, ppx and ppy) into my code, but I do know the correct values. I just think the intrinsics' values should not affect the syncer's behavior, which is what concerns me at the moment.

OldAccountFJShen commented 4 years ago

@ev-mp I have modified my code in light of your suggestions A, B, and E; furthermore, I used the "real" intrinsics values that I obtained from get_intrinsics() (https://github.com/IntelRealSense/librealsense/issues/1706#issuecomment-389455329). But it did not work...

I set up a breakpoint right after the lines

rs2::frame depth_frame = fs.get_depth_frame();
rs2::frame color_frame = fs.get_color_frame();

Here is what I see in the debugger (screenshot omitted): the color frame is simply not present in the frameset.

According to the definition of rs2::frameset::get_color_frame() (attached below), color_frame will be NULL if neither the color stream nor the IR stream can provide a color frame. Since I did not add an IR stream to my software sensors, this implies that the syncer somehow just did not obtain the injected RGB image from the color stream.

/**
        * Retrieve the first color frame, if no frame is found, search for the color frame from IR stream. If one still can't be found, return an empty frame instance.
        * \return video_frame - first found color frame.
        */
        video_frame get_color_frame() const
        {
            auto f = first_or_default(RS2_STREAM_COLOR);

            if (!f)
            {
                auto ir = first_or_default(RS2_STREAM_INFRARED);
                if (ir && ir.get_profile().format() == RS2_FORMAT_RGB8)
                    f = ir;
            }
            return f;
        }

@kafan1986 I'd like to try a pure OpenCV method if there is a feasible solution, but I am not very familiar with the library. Would you please share some more details on how alignment is done in OpenCV?

kafan1986 commented 4 years ago

> @kafan1986 I'd like to try a pure OpenCV method if there is a feasible solution, but I am not very familiar with the library. Would you please share some more details on how alignment is done in OpenCV?

@FJShen The code below is C++ from my Android application, which I call via an NDK wrapper.

The function below takes the depthFrame and colourFrame and stores the RGB and depth cameras' intrinsics and extrinsics for future use.

bool setDepthCamVals(long depthFrameAddress, long colourFrameAddress) {
        if (depthCamValAlreadySet){
            return true;
        }

        rs2::frame depthFrame = reinterpret_cast<rs2_frame *>(depthFrameAddress);
        rs2::frame colourFrame = reinterpret_cast<rs2_frame *>(colourFrameAddress);

        auto depth_profile = depthFrame.get_profile().as<rs2::video_stream_profile>();
        auto colour_profile = colourFrame.get_profile().as<rs2::video_stream_profile>();

        _depth_intrin = depth_profile.get_intrinsics();
        _depth_to_color_extrin = depth_profile.get_extrinsics_to(colour_profile);
        _color_intrin = colour_profile.get_intrinsics();
        _colour_to_depth_extrin = colour_profile.get_extrinsics_to(depth_profile);

        camera_matrix = (cv::Mat_<double>(3,3) << (double)_color_intrin.fx, 0.0, (double)_color_intrin.ppx,
                0.0, (double)_color_intrin.fy, (double)_color_intrin.ppy,
                0.0, 0.0, 1.0);

        dist_coeffs = (Mat1d(1, 5) << _color_intrin.coeffs[0], _color_intrin.coeffs[1], _color_intrin.coeffs[2], _color_intrin.coeffs[3], _color_intrin.coeffs[4]);

        if (finalModeLogging) {

            char matRow1[1000] = "";
            for (int i = 0; i < 9; i++) {
                sprintf(matRow1 + strlen(matRow1), "%f ", camera_matrix.at<double>(i)); // append each of the 9 matrix elements
            }

            __android_log_print(ANDROID_LOG_VERBOSE, APPNAME, "camera_matrix: %s", matRow1);

            char matRow2[1000] = "";
            for (int i = 0; i < 5; i++) {
                sprintf(matRow2 + strlen(matRow2), "%f ", dist_coeffs.at<double>(i)); // append each of the 5 distortion coefficients
            }

            __android_log_print(ANDROID_LOG_VERBOSE, APPNAME, "dist_coeffs: %s", matRow2);

            __android_log_print(ANDROID_LOG_VERBOSE, APPNAME, "_depth_intrin: width %d , height %d , ppx %f , ppy %f , fx %f , fy %f , c0 %f , c1 %f , c2 %f , c3 %f , c4 %f",
                                _depth_intrin.width,_depth_intrin.height, _depth_intrin.ppx,_depth_intrin.ppy,_depth_intrin.fx,_depth_intrin.fy,_depth_intrin.coeffs[0],_depth_intrin.coeffs[1],_depth_intrin.coeffs[2],_depth_intrin.coeffs[3],_depth_intrin.coeffs[4]);

            __android_log_print(ANDROID_LOG_VERBOSE, APPNAME, "_color_intrin: width %d , height %d , ppx %f , ppy %f , fx %f , fy %f , c0 %f , c1 %f , c2 %f , c3 %f , c4 %f",
                                _color_intrin.width,_color_intrin.height, _color_intrin.ppx,_color_intrin.ppy,_color_intrin.fx,_color_intrin.fy,_color_intrin.coeffs[0],_color_intrin.coeffs[1],_color_intrin.coeffs[2],_color_intrin.coeffs[3],_color_intrin.coeffs[4]);

            __android_log_print(ANDROID_LOG_VERBOSE, APPNAME, "_depth_to_color_extrin Translation Vector: [ %f , %f , %f ]",
                                _depth_to_color_extrin.translation[0],_depth_to_color_extrin.translation[1],_depth_to_color_extrin.translation[2]);
            __android_log_print(ANDROID_LOG_VERBOSE, APPNAME, "_depth_to_color_extrin Rotation Matrix: [ %f , %f , %f , %f , %f , %f , %f , %f , %f ]",
                                _depth_to_color_extrin.rotation[0],_depth_to_color_extrin.rotation[1],_depth_to_color_extrin.rotation[2],_depth_to_color_extrin.rotation[3],_depth_to_color_extrin.rotation[4],_depth_to_color_extrin.rotation[5],_depth_to_color_extrin.rotation[6],_depth_to_color_extrin.rotation[7],_depth_to_color_extrin.rotation[8]);

            __android_log_print(ANDROID_LOG_VERBOSE, APPNAME, "_colour_to_depth_extrin Translation Vector: [ %f , %f , %f ]",
                                _colour_to_depth_extrin.translation[0],_colour_to_depth_extrin.translation[1],_colour_to_depth_extrin.translation[2]);
            __android_log_print(ANDROID_LOG_VERBOSE, APPNAME, "_colour_to_depth_extrin Rotation Matrix: [ %f , %f , %f , %f , %f , %f , %f , %f , %f ]",
                                _colour_to_depth_extrin.rotation[0],_colour_to_depth_extrin.rotation[1],_colour_to_depth_extrin.rotation[2],_colour_to_depth_extrin.rotation[3],_colour_to_depth_extrin.rotation[4],_colour_to_depth_extrin.rotation[5],_colour_to_depth_extrin.rotation[6],_colour_to_depth_extrin.rotation[7],_colour_to_depth_extrin.rotation[8]);
        }
        depthCamValAlreadySet = true;
        return true;
    }

The function below operates on a single pixel at a time; if you want to process the entire image, you will need to loop through all pixels (see the sketch after the function).

Use the function below if you want to align depth to the color image. xVal and yVal are the depth pixel location, and zVal is the actual depth in metres.

PixelOwn mapDepthToColourPixel(double xVal, double yVal, double zVal){

        if(zVal < 0.1) {//zVal can never be less than 10 cm. Returning same coordinates as received after simple scaling
            PixelOwn tmpPixelOwn;
            tmpPixelOwn.x = (int)(xVal*_color_intrin.width/_depth_intrin.width);
            tmpPixelOwn.y = (int)(yVal*_color_intrin.height/_depth_intrin.height);
            return tmpPixelOwn;
        }

        //https://github.com/IntelRealSense/librealsense/issues/1890
        float depth_pixel[2] = {(float)(xVal-0.5), (float)(yVal-0.5)};
        float depth = (float)zVal;

        float depth_point[3], colour_point[3], colour_pixel[2];
        rs2_deproject_pixel_to_point(depth_point, &_depth_intrin, depth_pixel, depth);
        rs2_transform_point_to_point(colour_point, &_depth_to_color_extrin, depth_point);
        rs2_project_point_to_pixel(colour_pixel, &_color_intrin, colour_point);
        PixelOwn tmpPixelOwn;
        tmpPixelOwn.x = min(max(0,static_cast<int>(colour_pixel[0] + 0.5f)),_colourCamWidth-1);
        //tmpPixelOwn.x = min(max(0,static_cast<int>(colour_pixel[0])),_colourCamWidth);
        tmpPixelOwn.y = min(max(0,static_cast<int>(colour_pixel[1] + 0.5f)),_colourCamHeight-1);
        //tmpPixelOwn.y = min(max(0,static_cast<int>(colour_pixel[1])),_colourCamHeight);
        return tmpPixelOwn;
    }
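To process the whole image, the loop could look roughly like the untested sketch below. It assumes the member variables set in setDepthCamVals, a raw Z16 depth Mat, and a depth scale of 0.001 (1 mm per unit), and it does a nearest-pixel splat, so holes will remain in the output where no depth pixel lands:

#include <cmath>
#include <opencv2/opencv.hpp>
#include <librealsense2/rsutil.h>

// Untested sketch: produce a depth image aligned to the colour camera by
// projecting every valid depth pixel into the colour image plane.
cv::Mat alignDepthToColour(const cv::Mat& depthMat, float depthScale) {
    cv::Mat aligned = cv::Mat::zeros(_color_intrin.height, _color_intrin.width, CV_16UC1);
    for (int y = 0; y < depthMat.rows; ++y) {
        for (int x = 0; x < depthMat.cols; ++x) {
            uint16_t raw = depthMat.at<uint16_t>(y, x);
            if (raw == 0) continue; // no depth data at this pixel
            float depth = raw * depthScale; // metres
            float depth_pixel[2] = {(float) x, (float) y};
            float depth_point[3], colour_point[3], colour_pixel[2];
            rs2_deproject_pixel_to_point(depth_point, &_depth_intrin, depth_pixel, depth);
            rs2_transform_point_to_point(colour_point, &_depth_to_color_extrin, depth_point);
            rs2_project_point_to_pixel(colour_pixel, &_color_intrin, colour_point);
            int cx = (int) std::lround(colour_pixel[0]);
            int cy = (int) std::lround(colour_pixel[1]);
            if (cx < 0 || cy < 0 || cx >= aligned.cols || cy >= aligned.rows) continue;
            aligned.at<uint16_t>(cy, cx) = raw; // nearest-pixel splat; holes remain
        }
    }
    return aligned;
}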

Use the function below if you want to align color to the depth image. xVal and yVal are the colour pixel location, depthMat is the entire depth OpenCV Mat, and depthMinMeters and depthMaxMeters give the depth range in which the algorithm will search for the best match. Usually depthMinMeters = 0.1 and depthMaxMeters = 10 will do fine; if your use case involves smaller distances, change these values accordingly. An example call is shown after the function.

PixelOwn MapColourToDepthPixelMat(double xVal, double yVal, Mat &depthMat, double depthMinMeters, double depthMaxMeters){

        float colour_pixel[2] = {(float)xVal, (float)yVal};
        float depth_pixel[2];
        rs2_project_color_pixel_to_depth_pixel(depth_pixel, reinterpret_cast<const uint16_t*>(depthMat.data), _depthScale, float(depthMinMeters), (float)depthMaxMeters, &_depth_intrin, &_color_intrin, &_colour_to_depth_extrin, &_depth_to_color_extrin, colour_pixel);
        PixelOwn tmpPixelOwn;
        tmpPixelOwn.x = min(max(0,static_cast<int>(depth_pixel[0])),_depthCamWidth-1);
        tmpPixelOwn.y = min(max(0,static_cast<int>(depth_pixel[1])),_depthCamHeight-1);
        return tmpPixelOwn;
    }
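A hypothetical single call, with arbitrary coordinates, would be:

// Illustrative only: find the depth pixel matching colour pixel (400, 300),
// searching depths between 0.1 m and 10 m.
PixelOwn p = MapColourToDepthPixelMat(400, 300, depthMat, 0.1, 10.0);
uint16_t rawDepth = depthMat.at<uint16_t>(p.y, p.x);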

Let me know if this resolves your doubts.

OldAccountFJShen commented 4 years ago

@kafan1986 It seems to me that the heart of your code lies in these three lines:

rs2_deproject_pixel_to_point(depth_point, &_depth_intrin, depth_pixel, depth);
rs2_transform_point_to_point(colour_point, &_depth_to_color_extrin, depth_point);
rs2_project_point_to_pixel(colour_pixel, &_color_intrin, colour_point);

I tried it, and it did work. The only drawback is that the resulting pixels have non-integer coordinates (for example, a pixel can land at the 3.459th column and 19.802th row), which makes saving images a little hard. OpenCV does not provide a straightforward solution for this either: its remap method requires the mapping from the color perspective to the depth perspective, but in our case we only know the mapping from the depth perspective to the color perspective.

I ended up storing the points in a text file, reading the file in MATLAB, using griddata to interpolate the depth values at integer "grid points", and finally storing the aligned image. It is not very fast, but it works.
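In hindsight, the missing color-to-depth mapping could probably be built directly with rs2_project_color_pixel_to_depth_pixel from rsutil.h (the same call kafan1986 uses above) and fed to cv::remap, avoiding the MATLAB step. Below is an untested sketch, assuming the intrinsics declared earlier, rs2_extrinsics structs in both directions (the extrinsics variable names here are placeholders), a raw Z16 depth Mat named depthMat, and 1 mm depth units:

#include <opencv2/opencv.hpp>
#include <librealsense2/rsutil.h>

// Untested sketch: for each colour pixel, find the corresponding depth pixel,
// store the coordinates in remap maps, then let cv::remap resample the depth
// image onto the colour grid in one call.
cv::Mat map_x(color_intrinsics.height, color_intrinsics.width, CV_32FC1);
cv::Mat map_y(color_intrinsics.height, color_intrinsics.width, CV_32FC1);
for (int y = 0; y < color_intrinsics.height; ++y) {
    for (int x = 0; x < color_intrinsics.width; ++x) {
        float color_pixel[2] = {(float) x, (float) y};
        float depth_pixel[2];
        rs2_project_color_pixel_to_depth_pixel(
                depth_pixel, reinterpret_cast<const uint16_t*>(depthMat.data),
                0.001f /*depth units*/, 0.1f, 10.f /*search range, metres*/,
                &depth_intrinsics, &color_intrinsics,
                &color_to_depth_extrinsics, &depth_to_color_extrinsics, color_pixel);
        map_x.at<float>(y, x) = depth_pixel[0];
        map_y.at<float>(y, x) = depth_pixel[1];
    }
}
cv::Mat aligned; // depth resampled onto the colour image grid
cv::remap(depthMat, aligned, map_x, map_y, cv::INTER_NEAREST);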

MartyG-RealSense commented 4 years ago

@kafan1986 Thanks so much for your great contributions of help to the RealSense community!

@FJShen I'm very glad that you achieved a solution.

ev-mp commented 4 years ago

> I have modified my code in light of your suggestions A, B, and E; furthermore, I used the "real" intrinsics values that I obtained from get_intrinsics() (#1706 (comment)). But it did not work...

@FJShen, please take a look at this version of your code, where I changed the timestamps according to A and B. The only other change is that I replaced the actual image files you pull the data from with preallocated buffers.

#include <iostream>
//#include <opencv2/opencv.hpp>
#include <librealsense2/rs.hpp>
#include <librealsense2/hpp/rs_internal.hpp>

int W = 640;
int H = 480;

int main() {

    rs2::software_device dev;

    auto depth_sensor = dev.add_sensor("Depth");
    auto color_sensor = dev.add_sensor("Color");

    rs2_intrinsics depth_intrinsics{ W, H, W / 2, H / 2, 383.859, 383.859,
                                    RS2_DISTORTION_BROWN_CONRADY,
                                    {0, 0, 0, 0, 0} };
    rs2_intrinsics color_intrinsics{ W, H, W / 2, H / 2, 618.212, 618.463,
                                    RS2_DISTORTION_INVERSE_BROWN_CONRADY,
                                    {0, 0, 0, 0, 0} };

    auto depth_stream = depth_sensor.add_video_stream(
        { RS2_STREAM_DEPTH, 0, 0, W, H, 60, 2, RS2_FORMAT_Z16,
         depth_intrinsics });
    auto color_stream = color_sensor.add_video_stream(
        { RS2_STREAM_COLOR, 0, 1, W, H, 60, 3, RS2_FORMAT_BGR8,
         color_intrinsics }, true);

    dev.create_matcher(RS2_MATCHER_DEFAULT);
    rs2::syncer sync;

    depth_sensor.open(depth_stream);
    color_sensor.open(color_stream);

    depth_sensor.start(sync);
    color_sensor.start(sync);

    depth_stream.register_extrinsics_to(color_stream,
        { {1, 0, 0, 0, 1, 0, 0, 0, 1},
         {0, 0, 0} });

    rs2::frameset fs;

    std::vector<uint8_t> color_image(W * H * 3); // was: cv::imread("/home/nvidia/Desktop/images/color.jpg", cv::IMREAD_COLOR);
    std::vector<uint8_t> depth_image(W * H * 2); // was: cv::imread("/home/nvidia/Desktop/images/depth.png", cv::IMREAD_UNCHANGED);

    //inject the images to the sensors
    for (int idx = 0; idx < 120; ++idx) {
        //Purpose of this for loop: somebody said you start getting the
        // color frames once you have received an ample number of frames
        // (e.g., 30 frames), when the auto-exposure has converged.
        //But in my case, I have run wait_for_frames 120 times,
        // which at 60 FPS is equivalent to 2 seconds.
        //Yet I am still not receiving any color frames in the frameset.

        color_sensor.on_video_frame({ (void*)color_image.data(), // Frame pixels from capture API
                                     [](void*) {}, // Custom deleter (if required)
                                     3 * 640, 3, // Stride and Bytes-per-pixel
                                     (double)(idx) * 60, RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME, // RS2_TIMESTAMP_DOMAIN_COUNT
                                     idx, // Timestamp, Frame# for potential sync services
                                     color_stream });
        depth_sensor.on_video_frame({ (void*)depth_image.data(), // Frame pixels from capture API
                                     [](void*) {}, // Custom deleter (if required)
                                     2 * 640, 2, // Stride and Bytes-per-pixel
                                     (double)(idx) * 60, RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME, // RS2_TIMESTAMP_DOMAIN_COUNT
                                     idx, // Timestamp, Frame# for potential sync services
                                     depth_stream });

        fs = sync.wait_for_frames();
        rs2::frame depth_frame = fs.get_depth_frame();
        rs2::frame color_frame = fs.get_color_frame();
        if (depth_frame)
            std::cout << "fn depth : " << depth_frame.get_frame_number() << " ts: " << depth_frame.get_timestamp();
        if (color_frame)
            std::cout << ", fn color : " << color_frame.get_frame_number() << " ts: " << color_frame.get_timestamp();
        std::cout << std::endl;
    }

    return 0;
}

It produces both Depth and RGB:

fn depth : 0 ts: 0
fn depth : 1 ts: 60, fn color : 1 ts: 60
fn depth : 2 ts: 120, fn color : 2 ts: 120
fn depth : 3 ts: 180, fn color : 3 ts: 180
fn depth : 4 ts: 240, fn color : 4 ts: 240
fn depth : 5 ts: 300, fn color : 5 ts: 300
fn depth : 6 ts: 360, fn color : 6 ts: 360
fn depth : 7 ts: 420, fn color : 7 ts: 420
fn depth : 8 ts: 480, fn color : 8 ts: 480
fn depth : 9 ts: 540, fn color : 9 ts: 540
fn depth : 10 ts: 600, fn color : 10 ts: 600
fn depth : 11 ts: 660, fn color : 11 ts: 660
fn depth : 12 ts: 720, fn color : 12 ts: 720
fn depth : 13 ts: 780, fn color : 13 ts: 780
fn depth : 14 ts: 840, fn color : 14 ts: 840
fn depth : 15 ts: 900, fn color : 15 ts: 900
fn depth : 16 ts: 960, fn color : 16 ts: 960
fn depth : 17 ts: 1020, fn color : 17 ts: 1020
fn depth : 18 ts: 1080, fn color : 18 ts: 1080
fn depth : 19 ts: 1140, fn color : 19 ts: 1140
fn depth : 20 ts: 1200, fn color : 20 ts: 1200
fn depth : 21 ts: 1260, fn color : 21 ts: 1260
fn depth : 22 ts: 1320, fn color : 22 ts: 1320
OldAccountFJShen commented 4 years ago

@ev-mp Thanks for your reply. I tried it, and yes, when I "run" the code without any breakpoints the output looks just like yours; but once I place a breakpoint before this line: std::cout << std::endl; ... and run it in debug mode, the output looks like this:

fn depth : 0 ts: 0
fn depth : 1 ts: 16
fn depth : 2 ts: 32

And if you print fs.size(), you will discover that the output is different: with debug mode on, fs.size() is always 1; whereas when you simply 'run' the program, fs.size() is 2.

Hmmm...


Regardless of this strange behavior, I decided to assume that the frameset is all right, proceed to do the alignment, and save the aligned image:

rs2::align align(RS2_STREAM_COLOR);
//...
for(int idx=0; idx<120; ++idx){
        //inject frames to sensors
        color_sensor.on_video_frame({(void*) color_image.data, // Frame pixels from capture API
                                     [](void*) {}, // Custom deleter (if required)
                                     3 * 640, 3, // Stride and Bytes-per-pixel
                                     double(idx * 16), RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME,
                                     idx, // Timestamp, Frame# for potential sync services
                                     color_stream});
        depth_sensor.on_video_frame({(void*) depth_image.data, // Frame pixels from capture API
                                     [](void*) {}, // Custom deleter (if required)
                                     2 * 640, 2, // Stride and Bytes-per-pixel
                                     double(idx * 16), RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME,
                                     idx, // Timestamp, Frame# for potential sync services
                                     depth_stream});

        fs = sync.wait_for_frames();
        if (fs.size() == 2) {
            fs = align.process(fs);
            rs2::frame depth_frame = fs.get_depth_frame();

            //write aligned image to file
            cv::Mat aligned_image(480, 640, CV_16UC1, (void*) (depth_frame.get_data()), 2 * 640);
            cv::imwrite("aligned" + std::to_string(idx) + ".png", aligned_image);
        }
}

What I see after loading the PNG file into MATLAB (screenshot omitted) is that all elements except [0,0] are zero (I checked), which is obviously not what you get when using rs2::align in real time.

I think either I am not configuring everything properly in my program, or there are some subtle details of the SDK that everyone has overlooked. For now I am just going to use the low-level librealsense2/rsutil.h method proposed by kafan1986 and then do the grid interpolation in MATLAB. I am still open to discussion on why rs2::align is failing.

jb455 commented 4 years ago

@FJShen - I had a similar issue to yours a while ago - see #4523. The solution at the time was to add depth_sensor.add_read_only_option(RS2_OPTION_STEREO_BASELINE, depthUnits); so that may be worth a try. But it was nearly a year ago that I first saw the problem, so maybe that bug has been fixed and there is a new issue now.
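That is, something like this on the software depth sensor before open()/start() (the 0.001f value assumes 1 mm depth units):

// Workaround sketch: register both pseudo-options on the software depth sensor.
depth_sensor.add_read_only_option(RS2_OPTION_DEPTH_UNITS, 0.001f);
depth_sensor.add_read_only_option(RS2_OPTION_STEREO_BASELINE, 0.001f);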

OldAccountFJShen commented 4 years ago

@jb455 Thank you!! Your suggestion turned out to be the magic ingredient that worked, though it works in a way that seems very peculiar to me... I checked rs_option.h and here's the documentation for this specific option:

RS2_OPTION_STEREO_BASELINE /**< The distance in mm between the first and the second imagers in stereo-based depth cameras*/

In my case the depth unit is 0.001 meters, so we are effectively declaring that "the distance in mm between the first and the second imagers" is 0.001, which clearly is not the real baseline. Is this really not a bug in the code? @MartyG-RealSense

OldAccountFJShen commented 4 years ago

Here's the code that worked for me in the end, in case anybody is interested in referring to it (and thanks to everybody who contributed to this issue):

#include <iostream>
#include <opencv2/opencv.hpp>
#include <librealsense2/rs.hpp>
#include <librealsense2/hpp/rs_internal.hpp>

int W = 640;
int H = 480;

int main() {

    rs2::software_device dev;

    auto depth_sensor = dev.add_sensor("Depth");
    auto color_sensor = dev.add_sensor("Color");

    rs2_intrinsics depth_intrinsics{W, H, 315.505, 239.526, 383.859, 383.859,
                                    RS2_DISTORTION_BROWN_CONRADY,
                                    {0, 0, 0, 0, 0}};
    rs2_intrinsics color_intrinsics{W, H, 320.243, 237.202, 618.212, 618.463,
                                    RS2_DISTORTION_INVERSE_BROWN_CONRADY,
                                    {0, 0, 0, 0, 0}};

    auto depth_stream = depth_sensor.add_video_stream(
            {RS2_STREAM_DEPTH, 0, 0, W, H, 60, 2, RS2_FORMAT_Z16,
             depth_intrinsics});
    auto color_stream = color_sensor.add_video_stream(
            {RS2_STREAM_COLOR, 0, 1, W, H, 60, 3, RS2_FORMAT_BGR8,
             color_intrinsics});

    depth_sensor.add_read_only_option(RS2_OPTION_DEPTH_UNITS, 0.001f);
    depth_sensor.add_read_only_option(RS2_OPTION_STEREO_BASELINE, 0.001f);

    depth_stream.register_extrinsics_to(color_stream,
                                        {{0.999924,  -0.0103129,   -0.00680374,
                                                 0.0103063, 0.999946, -0.00100439,
                                                 0.00681374, 0.000934194, 0.999976},
                                         {0.0146439, -8.40931e-05, 0.000575048}});

    dev.create_matcher(RS2_MATCHER_DEFAULT);
    rs2::syncer sync;

    depth_sensor.open(depth_stream);
    color_sensor.open(color_stream);

    depth_sensor.start(sync);
    color_sensor.start(sync);

    rs2::align align(RS2_STREAM_COLOR);

    rs2::frameset fs;
    rs2::frame depth_frame;
    rs2::frame color_frame;

    cv::Mat color_image;
    cv::Mat depth_image;

    int idx = 0;

    //inject the images
    for (idx = 0; idx < 3; ++idx) {
        color_image = cv::imread("/home/nvidia/Desktop/images/color.jpg",
                                 cv::IMREAD_COLOR);
        depth_image = cv::imread("/home/nvidia/Desktop/images/depth.png",
                                 cv::IMREAD_UNCHANGED);

        color_sensor.on_video_frame({(void*) color_image.data, // Frame pixels from capture API
                                     [](void*) {}, // Custom deleter (if required)
                                     3 * 640, 3, // Stride and Bytes-per-pixel
                                     double(idx * 16), RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME,
                                     idx, // Timestamp, Frame# for potential sync services
                                     color_stream});
        depth_sensor.on_video_frame({(void*) depth_image.data, // Frame pixels from capture API
                                     [](void*) {}, // Custom deleter (if required)
                                     2 * 640, 2, // Stride and Bytes-per-pixel
                                     double(idx * 16), RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME,
                                     idx, // Timestamp, Frame# for potential sync services
                                     depth_stream});

        fs = sync.wait_for_frames();
        if (fs.size() == 2) {
            fs = align.process(fs);
            rs2::frame depth_frame = fs.get_depth_frame();
            rs2::frame color_frame = fs.get_color_frame();

            cv::Mat aligned_image(480, 640, CV_16UC1, (void*) (depth_frame.get_data()), 2 * 640);
            cv::imwrite("aligned" + std::to_string(idx) + ".png", aligned_image);
        }

    }

    return 0;
}
MartyG-RealSense commented 4 years ago

@FJShen Great news that you found a solution, and thanks for sharing your code with the community!

jb455 commented 4 years ago

> @jb455 Thank you!! Your suggestion turned out to be the magic ingredient that worked, though it works in a way that seems very peculiar to me... I checked rs_option.h and here's the documentation for this specific option:
>
> RS2_OPTION_STEREO_BASELINE /**< The distance in mm between the first and the second imagers in stereo-based depth cameras*/
>
> In my case the depth unit is 0.001 meters, so we are effectively declaring that "the distance in mm between the first and the second imagers" is 0.001, which clearly is not the real baseline. Is this really not a bug in the code? @MartyG-RealSense

Glad to see it worked for you, though it's a long time since I raised that issue, so I had hoped it would've been fixed by now... I've no idea why it works either; it took a lot of stepping through the API, and trial and error, to find the workaround.

MartyG-RealSense commented 4 years ago

I will now close this case, as a solution was achieved. Thanks everybody!

ilyak93 commented 2 years ago

I am working with SDK 2.49.

The code in the last post by @OldAccountFJShen didn't work for me; I got only one frame, the depth one. Then I fixed the timestamp argument to scale with the FPS (for me it is 30) in this piece of code:

double(idx * that_should_be_fps)

color_sensor.on_video_frame({(void*) color_image.data, // Frame pixels from capture API
                                     [](void*) {}, // Custom deleter (if required)
                                     3 * 1280, 3, // Stride and Bytes-per-pixel
                                     double(idx * 30), RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME,
                                     idx, // Timestamp, Frame# for potential sync services
                                     color_stream});
depth_sensor.on_video_frame({(void*) depth_image.data, // Frame pixels from capture API
                             [](void*) {}, // Custom deleter (if required)
                             2 * 1280, 2, // Stride and Bytes-per-pixel
                             double(idx * 30), RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME,
                             idx, // Timestamp, Frame# for potential sync services
                             depth_stream});

With that I got 2 frames, but the align operation gave me black depth images as a result, similar to what @OldAccountFJShen showed in his MATLAB screenshot.

The line depth_sensor.add_read_only_option(RS2_OPTION_STEREO_BASELINE, depthUnits); already appears in the code, but I wonder whether it is correct, given that I also needed to fix the timestamp parameter above.

@OldAccountFJShen, can you please check whether it is? I hoped that this idea of using a software_device for alignment would work for me too. In addition, could you post the MATLAB code that does the matching of depth to color, so that there is another solution if this does not work? Although I hope you just posted the wrong code or something. Thank you in advance.

ilyak93 commented 2 years ago

@RealSenseSupport, can you please check why this isn't working anymore? It worked in SDK 2.34 - I just checked it again on that SDK and it works great there. This way of doing alignment is a valuable asset, and I hope you will make it work again in your newest SDK versions. Best regards.

@jb455, maybe you can spot the code changes that caused this code to stop working? At the time, you were sharp enough to come up with this exact solution of setting the stereo baseline.

ilyak93 commented 2 years ago

@MartyG-RealSense, another interesting question: can I use a RealSense software_device as in this example, but with frames from other cameras (not RealSense at all) - say, a RealSense paired with a thermal camera - where the resolutions are different but I know the intrinsics and extrinsics from calibration?

Will it work for frames with different resolutions, for example the RealSense at 1280x720 and the other camera at 640x512?

MartyG-RealSense commented 2 years ago

My understanding is that you could use RGB color data from non-RealSense cameras in software_device if you feed the frames into OpenCV first, as mentioned in point 3 of the advice provided by a RealSense team member at https://github.com/IntelRealSense/librealsense/issues/2249#issuecomment-413492867

There was also a discussion - unrelated to software_device - at https://github.com/IntelRealSense/librealsense/issues/6801 about performing hardware sync between a RealSense camera and a Flir camera.

https://github.com/IntelRealSense/librealsense/issues/4202 and https://github.com/IntelRealSense/librealsense/issues/10067 discuss performing alignment between data from RealSense and non-RealSense cameras with software_device and how the differences between the sources are adjusted for.

In https://github.com/IntelRealSense/librealsense/issues/4202 a D410 RealSense camera (which would have a maximum depth resolution of 1280x720 and has no RGB sensor) was being aligned with a 4K RGB resolution non-RealSense camera.
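To illustrate the principle: a software_device does not care where the pixels come from, and the two streams do not need matching resolutions as long as each stream's intrinsics describe its own image. Below is a rough, untested sketch in which every calibration number is a placeholder, and the thermal image is assumed to have been converted to 3-channel BGR beforehand (e.g. with OpenCV):

#include <librealsense2/rs.hpp>
#include <librealsense2/hpp/rs_internal.hpp>

// Untested sketch: pair a 1280x720 depth stream with a 640x512 thermal image
// injected as the "color" stream. All calibration values are placeholders.
rs2::software_device dev;
auto depth_sensor = dev.add_sensor("Depth");
auto thermal_sensor = dev.add_sensor("Color");

rs2_intrinsics depth_intrin{1280, 720, 640.f, 360.f, 645.f, 645.f,
                            RS2_DISTORTION_BROWN_CONRADY, {0, 0, 0, 0, 0}};
rs2_intrinsics thermal_intrin{640, 512, 320.f, 256.f, 500.f, 500.f,
                              RS2_DISTORTION_BROWN_CONRADY, {0, 0, 0, 0, 0}};

auto depth_stream = depth_sensor.add_video_stream(
        {RS2_STREAM_DEPTH, 0, 0, 1280, 720, 30, 2, RS2_FORMAT_Z16, depth_intrin});
auto thermal_stream = thermal_sensor.add_video_stream(
        {RS2_STREAM_COLOR, 0, 1, 640, 512, 30, 3, RS2_FORMAT_BGR8, thermal_intrin}, true);

depth_sensor.add_read_only_option(RS2_OPTION_DEPTH_UNITS, 0.001f);
depth_sensor.add_read_only_option(RS2_OPTION_STEREO_BASELINE, 0.001f);
depth_stream.register_extrinsics_to(thermal_stream,
        {{1, 0, 0, 0, 1, 0, 0, 0, 1}, {0, 0, 0}}); // replace with calibrated extrinsics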

ilyak93 commented 2 years ago

@MartyG-RealSense, thank you very much, I'll investigate the issue.

ndaley7 commented 5 months ago

Regarding the rs_option.h workaround: https://github.com/IntelRealSense/librealsense/issues/6522#issuecomment-641027578

As mentioned here: https://github.com/IntelRealSense/librealsense/issues/6522#issuecomment-1052598247

it does not seem like this solution works as of 2.49. I'm going to try the software-device example and see if that is functioning correctly.