SpectacularAI / sdk-examples

Spectacular AI SDK examples
Apache License 2.0

Return point cloud and keyframes from c++? #105

Closed antithing closed 9 months ago

antithing commented 9 months ago

As above: is it possible to return the point cloud and keyframes when running in C++ with a RealSense 455?

Thanks!

Bercon commented 9 months ago

Yes. The example here: https://github.com/SpectacularAI/sdk-examples/blob/ebb814784b902f89da46376dd2c8423cd63ed094/cpp/realsense/vio_mapper.cpp#L180 shows how to save keyframes and images. You could easily extend the TODO comment to also save the point clouds.

antithing commented 9 months ago

Thank you! One more question: can I return the RealSense intrinsics from the VIO pipeline? (It seems the RS pipe is started in the DLL, so I can't access it in the usual way...)

Also, can I grab the rgb image from every frame? The mapper example seems to use keyframes only.

The goal here is to use cv::projectPoints to project the global point cloud onto every frame's image, using the camera pose.

oseiskar commented 9 months ago

You can get the undistorted intrinsics from the CameraPose object, see https://spectacularai.github.io/docs/sdk/python/latest/#spectacularAI.VioOutput.getCameraPose and https://spectacularai.github.io/docs/sdk/python/latest/#spectacularAI.Camera.getIntrinsicMatrix (C++ API has equivalent methods)

oseiskar commented 9 months ago

And you can also access the RS RGB frames like this https://github.com/SpectacularAI/sdk-examples/blob/ebb814784b902f89da46376dd2c8423cd63ed094/cpp/realsense/vio_mapper_legacy.cpp#L105C1-L163

Also note that the Camera objects have C++ helper methods (rayToPixel) that allow projecting points to the image and these actually also take the distortion coefficients into account. So if you use these instead of using getIntrinsicMatrix manually, the pixel coordinates are more correct.

antithing commented 9 months ago

Great thank you!

Is there any example code for this approach?

antithing commented 9 months ago

Hi, I am using the following to obtain the points and reproject them:

    std::vector<spectacularAI::FeaturePoint> cloud = vioOut->pointCloud;

    cv::Mat frm;
    frameLock.lock();
    frm = frameCurr;
    frameLock.unlock();

    for (int i = 0; i < cloud.size(); i++)
    {
        spectacularAI::PixelCoordinates px;
        // object_points.push_back(cv::Point3f(pp.position.x, pp.position.y, pp.position.z));
        vioOut->getCameraPose(0).camera->rayToPixel(cloud[i].position, px);

        cv::drawMarker(frm, cv::Point2f(px.x, px.y), cv::Scalar(255, 0, 0), cv::MARKER_CROSS);
    }

However, no points are drawn. When I print cloud.size(), it changes every frame, jumping from 0 to 4 to 12, for example.

What I want is the full global point cloud created by the SLAM mapping. How can I get this in C++?

Thanks again!

oseiskar commented 9 months ago

  1. vioOut->pointCloud is not the global point cloud. You can obtain the global point cloud by concatenating the dense point clouds from the Mapping API (see python/mapping/replay_to_nerf.py for examples; the C++ API is equivalent to the Python mapping API).
  2. Before using rayToPixel, you must transform the point from world coordinates to camera coordinates using the position and orientation in the latest CameraPose object.

antithing commented 9 months ago

Thank you! Can you please give me an example of the ray to pixel transform code?

antithing commented 9 months ago

Hi @oseiskar, is this the right track for what I want to do?


 vioPipeline.setMapperCallback([&](std::shared_ptr<const spectacularAI::mapping::MapperOutput> output) {
            for (int64_t frameId : output->updatedKeyFrames) {
                auto search = output->map->keyFrames.find(frameId);
                if (search == output->map->keyFrames.end()) {
                    continue; // deleted frame
                }

                auto& frameSet = search->second->frameSet;
                auto kfCloud = search->second->pointCloud;

                if (kfCloud)
                {

                    auto cameraPose = search->second->frameSet->primaryFrame->cameraPose;
                    auto cToW = search->second->frameSet->primaryFrame->cameraPose.getCameraToWorldMatrix();
                    auto points = search->second->pointCloud->getPositionData();

                    for (int i = 0; i < kfCloud->size(); i++)
                    {
                        //multiply point by cameraToWorld.
                        spectacularAI::Matrix4d pnt ??
                        points[i] * cToW; ??
                    }

//then add the keyframe cloud to a global cloud

            }

            }

         });
oseiskar commented 9 months ago

Sure, that looks correct. We currently do not have generic matrix multiplication utilities in our code, but you could use Eigen or GLM for that in C++. You may need to convert the matrices and vectors to Eigen/GLM to do so. Alternatively, you could add a helper function for multiplying a 4x4 matrix by a 3x1 vector, which is only a few lines.

The formulas (in Eigen notation) are:

Eigen::Vector3d pointWorld = (camToWorld4x4 * pointCam.homogeneous()).hnormalized();
// ... you'll also need this if reprojecting to another frame (note: normalized, not hnormalized)
Eigen::Vector3d rayCam = (worldToCam4x4 * pointWorld.homogeneous()).head<3>().normalized();
antithing commented 9 months ago

Thank you! Can I simply do this:

                    auto cToW = search->second->frameSet->primaryFrame->cameraPose.getCameraToWorldMatrix();
                    auto points = search->second->pointCloud->getPositionData();

                    Eigen::Matrix4d camToWorld4x4(4, 4);
                    camToWorld4x4(0, 0) = cToW[0][0];
                    camToWorld4x4(1, 0) = cToW[1][0];
                    camToWorld4x4(2, 0) = cToW[2][0];
                    camToWorld4x4(3, 0) = cToW[3][0];
                    camToWorld4x4(0, 1) = cToW[0][1];
                    camToWorld4x4(1, 1) = cToW[1][1];
                    camToWorld4x4(2, 0) = cToW[2][1];
                    camToWorld4x4(3, 0) = cToW[3][1];
                    camToWorld4x4(0, 2) = cToW[0][2];
                    camToWorld4x4(1, 2) = cToW[1][2];
                    camToWorld4x4(2, 2) = cToW[2][2];
                    camToWorld4x4(3, 2) = cToW[3][2];
                    camToWorld4x4(0, 3) = cToW[0][3];
                    camToWorld4x4(1, 3) = cToW[1][3];
                    camToWorld4x4(2, 3) = cToW[2][3];
                    camToWorld4x4(3, 3) = cToW[3][3];

Or is there a row/column order required?

antithing commented 8 months ago

Thanks again! I now see points projected, but there is an inversion happening somewhere, as I don't see them until I rotate the camera 180 degrees in yaw. What am I missing?

    std::mutex mtx;
    std::vector<Eigen::Vector3d> gcloud;

    auto callback = [
        &frameCounter,
            &fileNameBuf,
            &vioSession,         
            &shouldQuit,
            &frameCurr
    ](const rs2::frame& frame)
    {
        if (shouldQuit) return;
        auto frameset = frame.as<rs2::frameset>();
        if (frameset && frameset.get_profile().stream_type() == RS2_STREAM_DEPTH) {
            auto vio = vioSession; // atomic
            if (!vio) return;                    

            rs2_stream depthAlignTarget = RS2_STREAM_COLOR;
            rs2::align alignDepth(depthAlignTarget);

            const rs2::video_frame& color = frameset.get_color_frame();

            assert(color.get_profile().format() == RS2_FORMAT_BGR8);

            // Display images for testing.
            uint8_t* colorData = const_cast<uint8_t*>((const uint8_t*)color.get_data());
            cv::Mat colorMat(color.get_height(), color.get_width(), CV_8UC3, colorData);

            frameLock.lock();
            frameCurr = colorMat.clone();
            frameLock.unlock();

        }
    };
  vioPipeline.setMapperCallback([&](std::shared_ptr<const spectacularAI::mapping::MapperOutput> output) {

            std::vector<Eigen::Vector3d> globalCloud;
            for (int64_t frameId : output->updatedKeyFrames) {
                auto search = output->map->keyFrames.find(frameId);
                if (search == output->map->keyFrames.end()) {
                    continue; // deleted frame
                }

                auto& frameSet = search->second->frameSet;
                auto kfCloud = search->second->pointCloud;

                if (kfCloud)
                {

                    auto cameraPose = search->second->frameSet->primaryFrame->cameraPose;
                    auto cToW = search->second->frameSet->primaryFrame->cameraPose.getCameraToWorldMatrix();
                    auto points = search->second->pointCloud->getPositionData();

                    Eigen::Matrix4d camToWorld4x4(4, 4);
                    camToWorld4x4(0, 0) = cToW[0][0];
                    camToWorld4x4(1, 0) = cToW[1][0];
                    camToWorld4x4(2, 0) = cToW[2][0];
                    camToWorld4x4(3, 0) = cToW[3][0];
                    camToWorld4x4(0, 1) = cToW[0][1];
                    camToWorld4x4(1, 1) = cToW[1][1];
                    camToWorld4x4(2, 0) = cToW[2][1];
                    camToWorld4x4(3, 0) = cToW[3][1];
                    camToWorld4x4(0, 2) = cToW[0][2];
                    camToWorld4x4(1, 2) = cToW[1][2];
                    camToWorld4x4(2, 2) = cToW[2][2];
                    camToWorld4x4(3, 2) = cToW[3][2];
                    camToWorld4x4(0, 3) = cToW[0][3];
                    camToWorld4x4(1, 3) = cToW[1][3];
                    camToWorld4x4(2, 3) = cToW[2][3];
                    camToWorld4x4(3, 3) = cToW[3][3];

                    for (int i = 0; i < kfCloud->size(); i++)
                    {
                        //multiply point by cameraToWorld.
                        Eigen::Vector3d pnt(points[i].x, points[i].y, points[i].z);
                        Eigen::Vector3d pointWorld = (camToWorld4x4 * pnt.homogeneous()).hnormalized();

                        globalCloud.push_back(pointWorld);
                    }

                }

            }

            mtx.lock();
            gcloud = globalCloud;
            mtx.unlock();

         });
// Start pipeline
    vioSession = vioPipeline.startSession(rsConfig, callback);

    while (true) {
        int64 t0 = cv::getTickCount();

        auto vioOut = vioSession->waitForOutput();

        int64 t1 = cv::getTickCount();
        double secs = (t1 - t0) / cv::getTickFrequency();
        std::cout << secs * 1000 << std::endl;

     //   std::cout << vioOut->asJson() << std::endl;
        std::vector<Eigen::Vector3d> cloud;
        mtx.lock();
        cloud = gcloud;
        mtx.unlock();

        cv::Mat frm;
        frameLock.lock();
        frm = frameCurr;
        frameLock.unlock();

        for (int i = 0; i < cloud.size(); i++)
        {
            spectacularAI::PixelCoordinates px;
            //  object_points.push_back(cv::Point3f(pp.position.x, pp.position.y, pp.position.z));
            vioOut->getCameraPose(0).camera->rayToPixel(spectacularAI::Vector3d(cloud[i].x(),cloud[i].y(), cloud[i].z()), px);

            cv::drawMarker(frm, cv::Point2f(px.x, px.y), cv::Scalar(255, 0, 0), cv::MARKER_CROSS);

        }

        cv::imshow("frm", frm);
        cv::waitKey(1);

    }
oseiskar commented 8 months ago

I don't see this anywhere: you should project the global points to your current camera coordinates before calling rayToPixel:

Eigen::Vector3d rayCam = (worldToCam4x4 * pointWorld.homogeneous()).head<3>().normalized();
vioOut->getCameraPose(0).camera->rayToPixel({ rayCam.x(), rayCam.y(), rayCam.z() }, px);
antithing commented 8 months ago

Adjusted to the below. This gives a different result, but it is still incorrect: the points slide around as I move the camera and do not align with the world objects.

       vioPipeline.setMapperCallback([&](std::shared_ptr<const spectacularAI::mapping::MapperOutput> output) {

            std::vector<Eigen::Vector3d> globalCloud;
            for (int64_t frameId : output->updatedKeyFrames) {
                auto search = output->map->keyFrames.find(frameId);
                if (search == output->map->keyFrames.end()) {
                    continue; // deleted frame
                }

                auto& frameSet = search->second->frameSet;
                auto kfCloud = search->second->pointCloud;

                if (kfCloud)
                {

                    auto cameraPose = search->second->frameSet->primaryFrame->cameraPose;
                    auto cToW = search->second->frameSet->primaryFrame->cameraPose.getCameraToWorldMatrix();
                    auto wToC = search->second->frameSet->primaryFrame->cameraPose.getWorldToCameraMatrix();

                    auto points = search->second->pointCloud->getPositionData();

                    Eigen::Matrix4d camToWorld4x4(4, 4);
                    camToWorld4x4(0, 0) = cToW[0][0];
                    camToWorld4x4(1, 0) = cToW[1][0];
                    camToWorld4x4(2, 0) = cToW[2][0];
                    camToWorld4x4(3, 0) = cToW[3][0];
                    camToWorld4x4(0, 1) = cToW[0][1];
                    camToWorld4x4(1, 1) = cToW[1][1];
                    camToWorld4x4(2, 0) = cToW[2][1];
                    camToWorld4x4(3, 0) = cToW[3][1];
                    camToWorld4x4(0, 2) = cToW[0][2];
                    camToWorld4x4(1, 2) = cToW[1][2];
                    camToWorld4x4(2, 2) = cToW[2][2];
                    camToWorld4x4(3, 2) = cToW[3][2];
                    camToWorld4x4(0, 3) = cToW[0][3];
                    camToWorld4x4(1, 3) = cToW[1][3];
                    camToWorld4x4(2, 3) = cToW[2][3];
                    camToWorld4x4(3, 3) = cToW[3][3];

                    Eigen::Matrix4d worldToCam4x4(4, 4);
                    worldToCam4x4(0, 0) = wToC[0][0];
                    worldToCam4x4(1, 0) = wToC[1][0];
                    worldToCam4x4(2, 0) = wToC[2][0];
                    worldToCam4x4(3, 0) = wToC[3][0];
                    worldToCam4x4(0, 1) = wToC[0][1];
                    worldToCam4x4(1, 1) = wToC[1][1];
                    worldToCam4x4(2, 0) = wToC[2][1];
                    worldToCam4x4(3, 0) = wToC[3][1];
                    worldToCam4x4(0, 2) = wToC[0][2];
                    worldToCam4x4(1, 2) = wToC[1][2];
                    worldToCam4x4(2, 2) = wToC[2][2];
                    worldToCam4x4(3, 2) = wToC[3][2];
                    worldToCam4x4(0, 3) = wToC[0][3];
                    worldToCam4x4(1, 3) = wToC[1][3];
                    worldToCam4x4(2, 3) = wToC[2][3];
                    worldToCam4x4(3, 3) = wToC[3][3];

                    for (int i = 0; i < kfCloud->size(); i++)
                    {
                        //multiply point by cameraToWorld.
                        Eigen::Vector3d pnt(points[i].x, points[i].y, points[i].z);
                        Eigen::Vector3d pointWorld = (camToWorld4x4 * pnt.homogeneous()).hnormalized();
                        Eigen::Vector3d rayCam = (worldToCam4x4 * pointWorld.homogeneous()).head<3>().normalized();

                        globalCloud.push_back(rayCam);
                    }

                }

            }

            mtx.lock();
            gcloud = globalCloud;
            mtx.unlock();

         });
    // Start pipeline
    vioSession = vioPipeline.startSession(rsConfig, callback);

    while (true) {
        int64 t0 = cv::getTickCount();

        auto vioOut = vioSession->waitForOutput();

        int64 t1 = cv::getTickCount();
        double secs = (t1 - t0) / cv::getTickFrequency();
        std::cout << secs * 1000 << std::endl;

     //   std::cout << vioOut->asJson() << std::endl;
        std::vector<Eigen::Vector3d> cloud;
        mtx.lock();
        cloud = gcloud;
        mtx.unlock();

        cv::Mat frm;
        frameLock.lock();
        frm = frameCurr;
        frameLock.unlock();

        for (int i = 0; i < cloud.size(); i++)
        {
            spectacularAI::PixelCoordinates px;

            vioOut->getCameraPose(0).camera->rayToPixel(spectacularAI::Vector3d(cloud[i].x(), cloud[i].y(), cloud[i].z()), px);

            cv::drawMarker(frm, cv::Point2f(px.x, px.y), cv::Scalar(255, 0, 0), cv::MARKER_CROSS);
        }

        cv::imshow("frm", frm);
        cv::waitKey(1);
    }
oseiskar commented 8 months ago

You cannot compute rayCam "beforehand" in the mapping output callback. It has to be computed for the latest camera pose in vioOut, i.e., use vioOut->getCameraPose(0).getWorldToCameraMatrix(), convert it to Eigen to get worldToCam4x4 (from my previous message), then compute rayCam, then drawMarker.

antithing commented 8 months ago

Ah understood. Swapped to this:

 vioPipeline.setMapperCallback([&](std::shared_ptr<const spectacularAI::mapping::MapperOutput> output) {

            std::vector<Eigen::Vector3d> globalCloud;
            for (int64_t frameId : output->updatedKeyFrames) {
                auto search = output->map->keyFrames.find(frameId);
                if (search == output->map->keyFrames.end()) {
                    continue; // deleted frame
                }

                auto& frameSet = search->second->frameSet;
                auto kfCloud = search->second->pointCloud;

                if (kfCloud)
                {

                    auto cameraPose = search->second->frameSet->primaryFrame->cameraPose;
                    auto cToW = search->second->frameSet->primaryFrame->cameraPose.getCameraToWorldMatrix();

                    auto points = search->second->pointCloud->getPositionData();

                    Eigen::Matrix4d camToWorld4x4(4, 4);
                    camToWorld4x4(0, 0) = cToW[0][0];
                    camToWorld4x4(1, 0) = cToW[1][0];
                    camToWorld4x4(2, 0) = cToW[2][0];
                    camToWorld4x4(3, 0) = cToW[3][0];
                    camToWorld4x4(0, 1) = cToW[0][1];
                    camToWorld4x4(1, 1) = cToW[1][1];
                    camToWorld4x4(2, 0) = cToW[2][1];
                    camToWorld4x4(3, 0) = cToW[3][1];
                    camToWorld4x4(0, 2) = cToW[0][2];
                    camToWorld4x4(1, 2) = cToW[1][2];
                    camToWorld4x4(2, 2) = cToW[2][2];
                    camToWorld4x4(3, 2) = cToW[3][2];
                    camToWorld4x4(0, 3) = cToW[0][3];
                    camToWorld4x4(1, 3) = cToW[1][3];
                    camToWorld4x4(2, 3) = cToW[2][3];
                    camToWorld4x4(3, 3) = cToW[3][3];

                    for (int i = 0; i < kfCloud->size(); i++)
                    {
                        //multiply point by cameraToWorld.
                        Eigen::Vector3d pnt(points[i].x, points[i].y, points[i].z);
                        Eigen::Vector3d pointWorld = (camToWorld4x4 * pnt.homogeneous()).hnormalized();
                        globalCloud.push_back(pointWorld);
                    }

                }

            }

            mtx.lock();
            gcloud = globalCloud;
            mtx.unlock();

         });

    // Start pipeline
    vioSession = vioPipeline.startSession(rsConfig, callback);

    while (true) {
        int64 t0 = cv::getTickCount();

        auto vioOut = vioSession->waitForOutput();

        int64 t1 = cv::getTickCount();
        double secs = (t1 - t0) / cv::getTickFrequency();
        std::cout << secs * 1000 << std::endl;

     //   std::cout << vioOut->asJson() << std::endl;
        std::vector<Eigen::Vector3d> cloud;
        mtx.lock();
        cloud = gcloud;
        mtx.unlock();

        cv::Mat frm;
        frameLock.lock();
        frm = frameCurr;
        frameLock.unlock();

        auto wToC = vioOut->getCameraPose(0).getWorldToCameraMatrix();
       Eigen::Matrix4d worldToCam4x4(4, 4);
       worldToCam4x4(0, 0) = wToC[0][0];
       worldToCam4x4(1, 0) = wToC[1][0];
       worldToCam4x4(2, 0) = wToC[2][0];
       worldToCam4x4(3, 0) = wToC[3][0];
       worldToCam4x4(0, 1) = wToC[0][1];
       worldToCam4x4(1, 1) = wToC[1][1];
       worldToCam4x4(2, 0) = wToC[2][1];
       worldToCam4x4(3, 0) = wToC[3][1];
       worldToCam4x4(0, 2) = wToC[0][2];
       worldToCam4x4(1, 2) = wToC[1][2];
       worldToCam4x4(2, 2) = wToC[2][2];
       worldToCam4x4(3, 2) = wToC[3][2];
       worldToCam4x4(0, 3) = wToC[0][3];
       worldToCam4x4(1, 3) = wToC[1][3];
       worldToCam4x4(2, 3) = wToC[2][3];
       worldToCam4x4(3, 3) = wToC[3][3];

        for (int i = 0; i < cloud.size(); i++)
        {
            spectacularAI::PixelCoordinates px;
            Eigen::Vector3d rayCam = (worldToCam4x4 * cloud[i].homogeneous()).head<3>().normalized();

            vioOut->getCameraPose(0).camera->rayToPixel( spectacularAI::Vector3d(rayCam.x(), rayCam.y(), rayCam.z()), px);

            cv::drawMarker(frm, cv::Point2f(px.x, px.y), cv::Scalar(255, 0, 0), cv::MARKER_CROSS);

        }

        cv::imshow("frm", frm);
        cv::waitKey(1);       

    }

But still seeing points incorrectly projected.

Thank you for your patience!

oseiskar commented 8 months ago

Can't immediately see why that would not work. We will have to double-check that we don't have some incorrect coordinate conventions in our RS pipeline.

Can't promise a high priority for this at the moment, unfortunately.

If you really need to get it to work, these sorts of problems can always be fixed by a coordinate convention change of the type

Eigen::Matrix4d worldToCameraFixed = A * worldToCameraOriginal * B;

where A and B are some 4x4 matrices that represent a 90 degree rotation (and possibly a flip). The trick is knowing what the correct coordinate changes are and why; this is usually hard to brute force, as there are 1152 different possible combinations.

kaatrasa commented 8 months ago

I happened to notice a typo here: these assignments overwrite the (2, 0) and (3, 0) entries instead of writing (2, 1) and (3, 1):

camToWorld4x4(2, 0) = cToW[2][1];
camToWorld4x4(3, 0) = cToW[3][1];

and

worldToCam4x4(2, 0) = wToC[2][1];
worldToCam4x4(3, 0) = wToC[3][1];
antithing commented 8 months ago

Aha! that's it! Thank you. :)