F2Wang / ObjectDatasetTools

Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera. This project prepares training and testing data for various deep learning projects such as 6D object pose estimation projects singleshotpose, as well as object detection and instance segmentation projects.

A question about 3D reconstruction #31

Closed · pyni closed this 4 years ago

pyni commented 4 years ago

Hi~ I wonder whether your code uses bundle adjustment to optimize the 3D reconstruction part?
And why not use something such as ElasticFusion to reconstruct the whole scene?

pyni commented 4 years ago

And I wonder why it uses aruco.DICT_6X6_250?

zhangxiaodi commented 4 years ago

Maybe the marker is 6x6 in size. I want to know what ElasticFusion is. Can you provide a link with more detail?

pyni commented 4 years ago

ElasticFusion (https://github.com/mp3guy/ElasticFusion.git)

I have read your code carefully and found the answer to my question myself: this code has loop closure detection.

But I am not sure what the "250" in "DICT_6X6_250" means?

F2Wang commented 4 years ago

Quoting OpenCV's documentation on ArUco marker detection:

"the Dictionary object is created by choosing one of the predefined dictionaries in the aruco module. Concretely, this dictionary is composed of 250 markers and a marker size of 6x6 bits (DICT_6X6_250)."

From my own experience, ElasticFusion, DynamicFusion, and KinectFusion may all suffer from misalignment issues when you register a small scene (rather than a room-scale scene with an abundance of keypoint and geometry features), even though you enjoy the nice acceleration of running on the GPU. Instead, this code uses the recognition of ArUco markers to make the alignment process a little more robust, as sketched below.
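As an illustration of that idea (not the repository's exact code): marker corners with the same id seen in two frames are exact correspondences, so frame-to-frame alignment reduces to back-projecting them with the depth map and fitting a rigid transform. The intrinsics fx, fy, cx, cy and the depth arrays are assumed inputs:

    import numpy as np

    def backproject(corners, depth, fx, fy, cx, cy):
        # Lift 2D corner pixels into 3D camera coordinates via depth.
        pts = []
        for (u, v) in corners:
            z = depth[int(v), int(u)]
            pts.append([(u - cx) * z / fx, (v - cy) * z / fy, z])
        return np.array(pts)

    def rigid_transform(A, B):
        # Kabsch/SVD fit: find R, t minimizing ||(R @ A.T).T + t - B||.
        ca, cb = A.mean(axis=0), B.mean(axis=0)
        H = (A - ca).T @ (B - cb)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:  # guard against a reflection solution
            Vt[-1, :] *= -1
            R = Vt.T @ U.T
        t = cb - R @ ca
        return R, t

Because these correspondences come from the markers rather than from the scene geometry, the fit stays stable even when the scene itself is small and featureless.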

pyni commented 4 years ago

I see! Thanks a lot!

pyni commented 4 years ago

Hi, I still have a question. In your code:

    print("Optimizing PoseGraph ...")
    option = GlobalOptimizationOption(
            max_correspondence_distance = max_correspondence_distance_fine,
            edge_prune_threshold = 0.25,
            reference_node = 0)
    global_optimization(pose_graph,
            GlobalOptimizationLevenbergMarquardt(),
            GlobalOptimizationConvergenceCriteria(), option)

And the graph is defined as:

    pose_graph.nodes.append(PoseGraphNode(np.linalg.inv(odometry)))
    pose_graph.edges.append(PoseGraphEdge(source_id, target_id,
            transformation_icp, information_icp, uncertain = False))

It seems that the whole optimization process takes no point clouds as input, which I can hardly understand. And it seems that it also takes no ArUco points as input. Could you please explain it? I am a new learner in this area.

F2Wang commented 4 years ago

Hi, you are looking at the pose graph optimization part of the code. This part only optimizes the graph once it has already been built, i.e. it adjusts the node values (the transforms with respect to the first frame) to minimize the total error over the edges (each edge stores the relative transform between two vertices).

The graph itself is built in full_registration; that is where the point clouds enter, since pairwise ICP between frames produces the transformation_icp and information_icp stored on each edge.
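For intuition, here is a minimal sketch of that build step in the spirit of Open3D's multiway registration tutorial (written against the modern o3d.pipelines.registration namespace, whereas the quoted code uses Open3D's older flat namespace; the point clouds are assumed to have normals, which point-to-plane ICP requires):

    import numpy as np
    import open3d as o3d

    reg = o3d.pipelines.registration

    def pairwise_registration(source, target, dist_coarse, dist_fine):
        # The point clouds enter here: ICP estimates the relative
        # transform between two frames, coarse-to-fine.
        icp_coarse = reg.registration_icp(
            source, target, dist_coarse, np.identity(4),
            reg.TransformationEstimationPointToPlane())
        icp_fine = reg.registration_icp(
            source, target, dist_fine, icp_coarse.transformation,
            reg.TransformationEstimationPointToPlane())
        transformation_icp = icp_fine.transformation
        # The information matrix summarizes how well the two clouds
        # constrain each other; it weights the corresponding edge.
        information_icp = reg.get_information_matrix_from_point_clouds(
            source, target, dist_fine, transformation_icp)
        return transformation_icp, information_icp

    def full_registration(pcds, dist_coarse, dist_fine):
        pose_graph = reg.PoseGraph()
        odometry = np.identity(4)
        pose_graph.nodes.append(reg.PoseGraphNode(odometry))
        for source_id in range(len(pcds)):
            for target_id in range(source_id + 1, len(pcds)):
                transformation_icp, information_icp = pairwise_registration(
                    pcds[source_id], pcds[target_id], dist_coarse, dist_fine)
                if target_id == source_id + 1:
                    # Odometry edge between consecutive frames; also
                    # chains up the absolute pose for the new node.
                    odometry = np.dot(transformation_icp, odometry)
                    pose_graph.nodes.append(
                        reg.PoseGraphNode(np.linalg.inv(odometry)))
                    pose_graph.edges.append(reg.PoseGraphEdge(
                        source_id, target_id, transformation_icp,
                        information_icp, uncertain=False))
                else:
                    # Loop-closure edge, marked uncertain so the
                    # optimizer may prune it if it is inconsistent.
                    pose_graph.edges.append(reg.PoseGraphEdge(
                        source_id, target_id, transformation_icp,
                        information_icp, uncertain=True))
        return pose_graph

By the time global_optimization runs, all of the geometry has already been distilled into transformation_icp and information_icp on the edges, so the optimizer itself never needs to touch the point clouds.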