stella-cv / stella_vslam

This is a unofficial fork of OpenVSLAM (https://github.com/xdspacelab/openvslam)
https://stella-cv.rtfd.io/en/latest/
Other
897 stars 387 forks source link

Marker integration #242

Open ymd-stella opened 2 years ago

ymd-stella commented 2 years ago

What issue is the feature request related to?

The application of monocular Visual SLAM is limited because the scale is not known. Marker integration is an inexpensive solution to this problem without the need for additional cameras. This feature is inspired by UcoSLAM.

In addition, by applying marker integration to an equirectangular model, robust Visual SLAM with scale can be realized.

Describe the solution you'd like

See draft.

How to achieve this

Additional context

mamo3gr commented 2 years ago

I'm interested in scale-specified VSLAM and this feature. I have read https://github.com/stella-cv/stella_vslam/pull/241 and wonder how to employ the scale information, that is, the physical size of the ArUco marker. I think marker_model::aruco::width_ would be used to determine the scale of the camera's trajectory (translation vector) or make some kind of constraints, but it seems not to appear in codes. Would you mind pointing out the corresponding section?

Thank you for your cooperation.

ymd-stella commented 2 years ago

In the first attempt, distance constraints was added between the corners of the markers, but adding such constraints was computationally inefficient. So instead, the corners of the markers are now fixed. This determines the scale. (When the coordinates of the reference keyframe change, the position of the marker should change, but it does not. This is a bug. I will fix it soon. -> #311)

  1. Calculates the relative coordinates of corners.

https://github.com/stella-cv/stella_vslam/blob/3d178c02ed26057c4e41fd8416f18b161d7a68b1/src/stella_vslam/marker_model/base.cc#L15-L18

  1. Calculate the relative coordinates between the camera and the marker by referring to the relative coordinates of the corners.

https://github.com/stella-cv/stella_vslam/blob/3d178c02ed26057c4e41fd8416f18b161d7a68b1/src/stella_vslam/marker_detector/base.cc#L42

  1. Calculates the position of the corners on the map.

https://github.com/stella-cv/stella_vslam/blob/c7feabd7c741eceb45dff74f5da2772c91e20e25/src/stella_vslam/module/initializer.cc#L250

https://github.com/stella-cv/stella_vslam/blob/c7feabd7c741eceb45dff74f5da2772c91e20e25/src/stella_vslam/module/keyframe_inserter.cc#L111

https://github.com/stella-cv/stella_vslam/blob/3d178c02ed26057c4e41fd8416f18b161d7a68b1/src/stella_vslam/data/marker2d.cc#L10-L18

  1. Optimization is performed by reprojection edges.

https://github.com/stella-cv/stella_vslam/blob/3d178c02ed26057c4e41fd8416f18b161d7a68b1/src/stella_vslam/optimize/internal/marker_vertex_container.h#L66

https://github.com/stella-cv/stella_vslam/blob/c7feabd7c741eceb45dff74f5da2772c91e20e25/src/stella_vslam/optimize/global_bundle_adjuster.cc#L129-L164

https://github.com/stella-cv/stella_vslam/blob/3d178c02ed26057c4e41fd8416f18b161d7a68b1/src/stella_vslam/optimize/local_bundle_adjuster.cc#L213-L252

mamo3gr commented 2 years ago

Thank you very much for your rapid response and detailed description.

I understand that the scale information is employed as below:

By the way, without the size-known marker, is there any rule to determine the unit length of the estimated trajectory in this software? For example, I have involved a Structure-from-Motion software, and they used a rule that the translation vector between the first and second camera is a unit vector.

ymd-stella commented 2 years ago

See https://github.com/stella-cv/stella_vslam/blob/f5cbbf55bccfdde63ddfa0d0243206ba7743ce6b/src/stella_vslam/module/initializer.cc#L267-L277

mamo3gr commented 2 years ago

scale the map so that the median of depths is 1.0

Thank you again and sorry for my laziness!

ymd-stella commented 2 years ago

When the coordinates of the reference keyframe change, the position of the marker should change, but it does not. This is a bug. I will fix it soon. -> https://github.com/stella-cv/stella_vslam/pull/311

Reverted due to incorrect update method; multiple updates were made to a marker. Additionally, if the marker is fixed during optimization, their positions should not be updated after optimization. (In particular, updating after local BA makes the mapping unstable.)

During loop closing, the marker position needs to be updated.

The plan is as follows:

AdonaiVera commented 1 year ago

Hi @ymd-stella I want to ask you about getting the marker in the world coordinate. I'm working on Python using the msg file. I use an algorithm to detect the markers, and now I have the X, and Y, and I need to project these points to the world coordinate. My first step is to go from point to bearing.

Vec3_t equirectangular::convert_point_to_bearing(const cv::Point2f& undist_pt) const { // "From Google Street View to 3D City Models (ICCVW 2009)." // convert to unit polar coordinates const double lon = (undist_pt.x / cols_ - 0.5) * (2.0 * M_PI); const double lat = -(undist_pt.y / rows_ - 0.5) * M_PI; // convert to equirectangular coordinates return Vec3_t{std::cos(lat) * std::sin(lon), -std::sin(lat), std::cos(lat) * std::cos(lon)}; }

And then I'm unsure how to go from bearing to the world coordinate. I used ten 3d points of the same images; I made a projection using the bearing and the 3d point in world coordinates. ( Rigidly (+scale) aligns two point clouds with known point-to-point correspondences with least-squares error. ) I get a transformation and rotation matrix; then, I transform my marker points.

However, the results are inaccurate; how can I go from bearing to the world coordinate if I'm in Python? I have the trans_wc rot_wc, trans_cw and rot_cw

Thank you in Advance !

AmineDh98 commented 1 year ago

Hello @ymd-stella, I noticed that the marker integration was reverted due to some issues with the update method. I would like to know if the main branch of the library currently contains the marker integration and if it is working well. If not, could you please provide some guidance on what needs to be added or changed to align with the UCOSlam algorithm?

ymd-stella commented 1 year ago

@AmineDh98 The pose of marker is not updated from the first detected pose, but it is working. It is used for bundle adjustment and to estimate scale when using a monocular camera.

Please set the MarkerModel parameter to enable detection. https://stella-cv.readthedocs.io/en/latest/parameters.html#markermodel

AmineDh98 commented 1 year ago

Hi @ymd-stella , Thank you for your previous answer. Now I would like to request the addition of markers and their positions to the message file generated by the map_database_io_msgpack module in the stella_vslam library. Currently, the message file contains information about keyframes and landmarks, but I would like to extend it to include markers as well. I suggest the below snippet:

// Inside map_database::to_json function
// Save each marker position as json
std::map<std::string, nlohmann::json> markers;
for (const auto& id_marker : markers_) {
    const auto id = id_marker.first;
    const auto& marker = id_marker.second;
    assert(marker);
    assert(id == marker->id_);
    assert(!marker->will_be_erased());
    assert(!markers.count(std::to_string(id)));
    markers[std::to_string(id)] = marker->to_json();
}
json_markers = markers;

// Inside map_database::from_json function
// Parse and load marker data from JSON
const auto json_markers = json.at("markers");
for (const auto& json_marker : json_markers) {
    const auto id = std::stoi(json_marker.at("id").get<std::string>());
    // Parse and create marker object from JSON
    // ...
    markers_[id] = std::make_shared<Marker>(/* marker parameters */);
}

To incorporate markers into the existing message file format, I propose adding a new JSON object or array within the existing JSON structure. The map_database::to_json function can iterate over the markers and store their positions in this new JSON object or array, similar to how keyframes and landmarks are currently handled.

For the map_database::from_json function, you would need to update the parsing logic to correctly extract and create marker objects from the JSON representation.

I believe this addition would enhance the functionality of the stella_vslam library and provide valuable information about markers' positions in the VSLAM frame, enabling further analysis and visualization.

Another suggestion could be, as I am using ROS. The Aruco markers can be also published using a ROS topic! Do you have any hint ?

I would greatly appreciate any guidance or suggestions on how to implement this feature effectively and maintain compatibility with the existing codebase.

ymd-stella commented 1 year ago

No particular problem is found from the snippet. Pull requests are welcome. Saved maps will no longer be compatible, but I think it is sufficient to create conversion scripts if needed.

Another suggestion could be, as I am using ROS. The Aruco markers can be also published using a ROS topic! Do you have any hint ?

I had never thought about it. Is there any benefit to addressing it? I do not intend to work on it, as I am not currently inconvenienced.

AmineDh98 commented 1 year ago

Hi @ymd-stella ,

When I used the marker integration the result in the socket viewer visualizer is bad and the problem seems to be the scale (the keyframes are too far from each other and the camera is moving very fast) I assume that I am not giving the correct width of the marker. My real marker is 4 meters width I have in the yml file as below: ''' MarkerModel: type: "aruco" width: 4 marker_size: 4 max_markers: 50 '''

so what should I write in the yml file ? What is the physical unit used to read the yml file ? Is it in meter, cm, mm ... ? Thank you

ymd-stella commented 1 year ago

This library is not dependent on a specific system of units. As long as the settings do not contradict each other, there is no problem. However, the visualizers are tuned in meter, so it is recommended to use meter.

aschnerring commented 1 year ago

Hello @ymd-stella,

I'm currently working with @AmineDh98 on the marker integration. We are interested in getting the marker position in high precision in order to transform the point cloud and camera trajectory to a coordinate system in which the marker positions are known.

In the following I would like to describe the results of the discussions I had with @AmineDh98 and ask you (or someone else who is familiar with the code) to confirm if this is valid. Note that the hints I provide are not intended for @ymd-stella or other developers familiar to the code as I'm sure you know all of this already as you coded it. It's intended to be sure that we understood correctly and to enable other users to understand better the code involved in this issue.

Approach as intended by @ymd-stella

Aruco marker detection is already implemented in stella_vslam. However, there seem to be some false assumptions and/or bugs in the implementation.

Current appraoch to computing the marker positions in the world coordinate system

In the current implementation, the marker positions in the world frame are computed in the following way:

  1. Set the marker corner positions in the marker coordinate system (stella_vslam/src/stella_vslam/marker_model/base.cc)
  2. Estimate the relative coordinates between camera and marker (stella_vslam/src/stella_vslam/marker_detector/base.cc). This involves:
    1. Detection of 2d markers in the image
    2. Estimate marker pose w.r.t. camera (i.e. rot_cm and trans_cm) using a PNP solver
  3. Compute marker position in world coordinate system (WorldCS<-CameraCS<-MarkerCS) using compute_corners_pos_w in stella_vslam/src/stella_vslam/data/marker2d.cc. This is used during
    1. Initialization: stella_vslam/src/stella_vslam/module/keyframe_inserter.cc
    2. Whenever a new keyframe is inserted: stella_vslam/src/stella_vslam/module/keyframe_inserter.cc in the method create_new_keyframe()

The idea (which is not yet implemented) is then to store exactly one reference keyframe for each marker (i.e. the first keyframe by which this marker was seen by). Then the marker position in the WorldCS is updated during global bundle adjustment whenever this reference keyframe is changed. This approach allows to determine the correct scale of the map, as the marker dimension is set in its own CS and only transformed. Then, the scale information is employed implicitly during local bundle adjustment, where the landmarks and keyframes are adjusted in such a way that the marker corner positions stay fixed in the WorldCS.

Problems with this approach

Noisy initial marker pose estimation

However, a problem of this approach is that it relies on the initial PNP estimation of the marker pose w.r.t. the camera, which is never changed after it is computed upon the first marker detection. This pose estimation is potentially noisy and even if the reference keyframe poses are correct, the marker positions in the WorldCS might still be wrong as their computation assume a noisy initial PNP estimate.

Fixing several markers during local bundle adjustment (not sure about this)

Another potential problem with this approach is the presence of several markers:

Note that all marker positions are fixed during the local bundle adjustment. See the create_vertex method for the marker vertex called in stella_vslam/src/stella_vslam/optimize/local_bundle_adjuster_g2o.cc with is_constant=true to confirm this. Also note that is_constant=true is used in stella_vslam/src/stella_vslam/optimize/internal/marker_vertx_container.h, where it is provided to the vertex, setting it as fixed.

Given that all initial estimates for the markers in world coordinates are potentially noisy (due to the aforementioned PNP solver and the issue that the marker->camera transform is never adjusted), this might result in inconsistent/noisy map information during the local bundle adjustment. Even when the marker positions are updated during a global bundle adjustment, the noise of the initial PNP marker pose estimation will still be present and contribute to a suboptimal local bundle adjustment.

Suggested alternative approach

We instead suggest to include the markers in the local bundle adjustment and treat them just like any other landmark (except that we can identify reference points in the VSLAM point cloud which allow us to transform it and the camera trajectory ). Note that the approach of including the marker corners in the local bundle adjustment is also the approach that was chosen in UcoSlam (except that in UcoSlam, more weight is given to the marker positions, depending on the number of valid markers).

This comes with two advantages:

It also comes with two disadvantages:

We accept these disadvantages for the following reason:

For us, the markers are a tool to transform the camera trajectory and point cloud in the VSLAM coordinate system to a coordinate system in which the marker position is exactly known. We compute this Sim3 transform using the Kabsch algorithm and can hence tolerate inconsistent marker corner positions as this algorithm aims to minimize root mean squared deviation of reference points and should be robust against a few outliers (at least when Gaussian distributed).

As the Kabsch algorithm provides the scale, we also don't depend on the map to be precisely scaled.

Where we need help

We would like to implement the approach described above (we are already working on it with promising intermediate results).

However, we want to make sure that we understood correctly the current approach, if the problems we identified are indeed problems or if maybe we overlooked a detail.

We can report that the current approach leads to worse results than our approach to include the marker corners in the local bundle adjustment. We believe that this is due to the points described above but of course it could be for different reasons. We can also provide the data that led to our assumptions (it was generated in Unreal Engine, using AirSim to control a drone in a virtual environment with known marker positions).

ymd-stella commented 1 year ago

It looks fine to me.

I tried to give a constraint on the shape, but gave up due to the difficulty of weighting. I do not have enough time to spend on this feature.

Also about updating the marker poses.

Shitikantha-Bagh commented 1 month ago

Understood @AmineDh98 and @aschnerring ' s suggested alternative approach with their acceptance of the disadvantages of their approach

"We accept these disadvantages for the following reason:

For us, the markers are a tool to transform the camera trajectory and point cloud in the VSLAM coordinate system to a coordinate system in which the marker position is exactly known. We compute this Sim3 transform using the Kabsch algorithm and can hence tolerate inconsistent marker corner positions as this algorithm aims to minimize root mean squared deviation of reference points and should be robust against a few outliers (at least when Gaussian distributed).

As the Kabsch algorithm provides the scale, we also don't depend on the map to be precisely scaled.

But I think global BA is also an important step as compared to local BA as far as accuracy is concerned, because when the loop closure happens the landmark points are corrected and not the marker poses, then it is possible to get a huge error which I am experiencing in my use case. I feel markers should be included in global BA, and for much more accuracy it should be included in both local and global BA. What are your thoughts ? @ymd-stella @AmineDh98 @aschnerring

Shitikantha-Bagh commented 3 weeks ago

I am sorry for the above confusing message, what I meant is in the present approach does not correct the marker poses after loop closure. @ymd-stella Can you give any idea regarding the weighting of markers ?

Currently I am using this in global BA

// Connect marker vertices
    for (unsigned int marker_idx = 0; marker_idx < markers.size(); ++marker_idx) {
        auto mkr = markers.at(marker_idx);
        if (!mkr) {
            continue;
        }

        // Convert the corners to the g2o vertex, then set it to the optimizer
        auto corner_vertices = marker_vtx_container.create_vertices(mkr, true);
        for (unsigned int corner_idx = 0; corner_idx < corner_vertices.size(); ++corner_idx) {
            const auto corner_vtx = corner_vertices[corner_idx];
            optimizer.addVertex(corner_vtx);

            for (const auto& keyfrm : mkr->observations_) {
                if (!keyfrm) {
                    continue;
                }
                if (keyfrm->will_be_erased()) {
                    continue;
                }
                if (!keyfrm_vtx_container.contain(keyfrm)) {
                    continue;
                }
                const auto keyfrm_vtx = keyfrm_vtx_container.get_vertex(keyfrm);
                const auto& mkr_2d = keyfrm->markers_2d_.at(mkr->id_);
                const auto& undist_pt = mkr_2d.undist_corners_.at(corner_idx);
                const float x_right = -1.0;
                const float inv_sigma_sq = 1/(mkr->get_width() * mkr->get_width());
                const auto sqrt_chi_sq = (keyfrm->camera_->setup_type_ == camera::setup_type_t::Monocular)
                                    ? sqrt_chi_sq_2D
                                    : sqrt_chi_sq_3D;
                auto reproj_edge_wrap = reproj_edge_wrapper(keyfrm, keyfrm_vtx, nullptr, corner_vtx,
                                                        corner_idx, undist_pt.x, undist_pt.y, x_right,
                                                        inv_sigma_sq, sqrt_chi_sq, use_huber_kernel);
                reproj_edge_wraps.push_back(reproj_edge_wrap);
                optimizer.addEdge(reproj_edge_wrap.edge_);
            }
        }
    }

Data/marker.cc

 #include "stella_vslam/data/marker.h"
#include "stella_vslam/data/keyframe.h"
#include "stella_vslam/marker_model/base.h"
#include <nlohmann/json.hpp>

namespace stella_vslam {
namespace data {

marker::marker(const eigen_alloc_vector<Vec3_t>& corners_pos_w, unsigned int id, const std::shared_ptr<marker_model::base>& marker_model)
    : corners_pos_w_(corners_pos_w), id_(id), marker_model_(marker_model) {}

void marker::set_corner_pos(const eigen_alloc_vector<Vec3_t>& corner_pos_w) {
    std::lock_guard<std::mutex> lock(mtx_position_);
    corners_pos_w_ = corner_pos_w;
}
nlohmann::json marker::to_json() const {
    nlohmann::json corners_json = nlohmann::json::array();
    for (const auto&corner : corners_pos_w_){
        corners_json.push_back({corner.x(),corner.y(), corner.z()});
    }
    nlohmann::json marker_json = {{"marker_id",id_},
            {"corners_pos_w_",corners_json},
            {"marker_model_",marker_model_->to_json()}
            };

    auto ref_keyfrm = reference_keyframe_.lock();
    if (ref_keyfrm) {
        marker_json["reference_keyframe_id"] = ref_keyfrm->id_;
    } else {
        marker_json["reference_keyframe_id"] = nullptr;
    }
    return marker_json;
}
void marker::set_reference_keyframe(const std::shared_ptr<keyframe>& ref_keyfrm){
    reference_keyframe_ = ref_keyfrm;

}

std::shared_ptr<keyframe>marker::get_reference_keyframe() const {
    return reference_keyframe_.lock();
}
eigen_alloc_vector<Vec3_t>marker::get_corners_pos() const {
    std::lock_guard<std::mutex> lock(mtx_position_);
    return corners_pos_w_;
}
bool marker::will_be_erased() const{
    std::lock_guard<std::mutex> lock(mtx_position_);
    return will_be_erased_;
}
double marker::get_width() const {
    return marker_model_->width_;
}
} // namespace data
} // namespace stella_vslam

I am storing a reference keyframe for the marker in the keyframe_inserter.cc, but I am not sure how can I use the reference keyframe in global BA. Additionaly, I am storing the marker data to the messagepack.

Shitikantha-Bagh commented 1 week ago

@ymd-stella , if you have any best way to give constraint to the shape by weighting, request you to please integrate it. I have been trying this with different weighting schemes as well but the results are always different for different videos.