cdcseacave / openMVS

open Multi-View Stereo reconstruction library
http://cdcseacave.github.io
GNU Affero General Public License v3.0

Online/real-time incremental point cloud densification #945

Open danielj-genesis opened 1 year ago

danielj-genesis commented 1 year ago

I am trying to densify a sparse point cloud produced by ORB-SLAM2/3 online, i.e. in roughly the same time as the sparse point cloud is generated. Something like this would be the end goal: https://www.youtube.com/watch?v=349oLODVVtI

What I'm looking for is a way to constantly update a growing dense point cloud by feeding in the growing point cloud, camera poses, and frames generated by ORB-SLAM. I'm not certain whether this is already possible, but the pipeline seems designed for SfM output.

I've seen ORB-SLAM+REMODE done, as well as solutions like DeepFactors, but OpenMVS would be better for the general use case if this functionality were possible. SfM solutions like COLMAP take too long to generate a point cloud, and meshing is not important to me at this point.

Another example of what I'm trying to achieve: https://www.youtube.com/watch?v=QTKd5UWCG0Q

cdcseacave commented 1 year ago

It is definitely possible, especially if you are not constrained to real-time processing. You could use OpenMVS functionality for that directly, but you need to create an executable customized to have the interface and processing pipeline you need (for example, waiting for new cameras to come in and processing them).

danielj-genesis commented 1 year ago

> It is definitely possible, especially if you are not constrained to real-time processing. You could use OpenMVS functionality for that directly, but you need to create an executable customized to have the interface and processing pipeline you need (for example, waiting for new cameras to come in and processing them).

Hi, thanks for the reply. I am trying to achieve close to real-time processing. If the densification lags a few seconds behind, or only rebuilds the point cloud at 5 fps or so, that's fine, but the idea is to run on live video on a pretty powerful computer. I'm not actually sure OpenMVS is the best tool for this, but if all that's required is the camera poses, sparse point cloud, and keyframes, then all of that is produced live by ORB-SLAM?

When you mention an executable, do you just mean converting the ORB-SLAM poses/cloud/frames into the .mvs format and calling the densification pipeline (DensifyPointCloud scene.mvs) constantly, or would that be far too slow? I'm processing the frames one by one and at any given time have the updated poses/cloud/frames, so what would I have to do to integrate that with OpenMVS densification?

From what I understand, the big gap is the depth estimation, and OpenMVS seems to handle that, so it should be a good solution to the dense monocular SLAM problem?

cdcseacave commented 1 year ago

You can use the functionality in the MVS library contained in OpenMVS to customize the reconstruction as you want. It will be easy to start a new depth-map estimation as a new camera arrives. It will not be that easy to update the mesh only with the new depth-map; for that the algorithm needs to be adapted.
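
As a rough illustration of such a custom executable, here is a minimal sketch of the loop, assuming a hypothetical `ExportSceneToMVS()` that writes the current SLAM poses/cloud/frames to a .mvs file (see the Interface.h sketch further down) and delegating densification to the stock `DensifyPointCloud` binary; calling the MVS library directly would avoid the process and file I/O overhead:

```cpp
// Sketch of an incremental densification driver (not OpenMVS's own API):
// whenever the SLAM front-end reports a new keyframe, re-export the scene
// and re-run densification on it. The .dmap files produced by earlier runs
// are kept next to the scene, so mostly the new views get estimated.
#include <atomic>
#include <chrono>
#include <cstdlib>
#include <string>
#include <thread>

std::atomic<unsigned> g_numKeyframes{0}; // hypothetical: bumped by the SLAM thread

bool ExportSceneToMVS(const std::string& path) {
    (void)path;
    return false; // stub: fill the Interface.h structure and serialize it here
}

int main() {
    const std::string scenePath = "scene.mvs";
    unsigned processed = 0;
    for (;;) {
        const unsigned available = g_numKeyframes.load();
        if (available == processed) { // no new keyframe yet
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            continue;
        }
        processed = available;
        if (!ExportSceneToMVS(scenePath))
            continue;
        // Lower resolution and no geometric iterations trade quality for
        // speed (flag names as in recent DensifyPointCloud builds; check
        // DensifyPointCloud --help on your version).
        std::system(("DensifyPointCloud " + scenePath +
                     " --max-resolution 640 --geometric-iters 0").c_str());
    }
}
```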

danielj-genesis commented 1 year ago

> You can use the functionality in the MVS library contained in OpenMVS to customize the reconstruction as you want. It will be easy to start a new depth-map estimation as a new camera arrives. It will not be that easy to update the mesh only with the new depth-map; for that the algorithm needs to be adapted.

Okay, so the current depth-map estimation would take at least a minute or so even on a fast computer. I can't think of any other good project that can densify a sparse point cloud from monocular SLAM? Other approaches seem to use a model to estimate the depth map, and I'm not sure they generalize well enough. If there's no better way to get a denser point cloud live (or some other kind of 3D visualization), are there any plans to add this kind of feature: online densification or reconstruction from a camera/cloud/frame feed? I would try forking my own version, but it might be too challenging, especially versus something like a depth inference model instead of the PatchMatch used here.

cdcseacave commented 1 year ago

It can take even more than that, but also much less. It all depends on the quality/speed tradeoff. You can get a depth-map estimate in only 100 ms if you want.

danielj-genesis commented 1 year ago

> It can take even more than that, but also much less. It all depends on the quality/speed tradeoff. You can get a depth-map estimate in only 100 ms if you want.

Okay, thank you, that sounds promising since speed is the priority.

Densifying the point cloud in 100 ms is more than good enough, since I just want to show a rough visual of a scene as it's being explored. I think around 500 ms is okay on an NVIDIA A5000. I'm guessing max-resolution and geometric-iters are the more expensive operations?

And since it runs continuously, the majority of the depth maps will already have been computed in previous calls and stored as .dmap files, so the depth-map computation should be pretty quick, especially since I only want the dense point cloud and not the mesh (assuming the dense cloud looks good enough)?

And is it actually meaningful to densify the point cloud without a new camera/frame, with only a sparse point cloud update? My test video was 5 min long but produced only about 70 keyframes/poses, while the sparse point cloud is always growing.

cdcseacave commented 1 year ago

> I'm guessing max-resolution and geometric-iters are the more expensive operations?

Right

danielj-genesis commented 1 year ago

@cdcseacave Okay, so I tried to use ORB-SLAM2 as input for OpenMVS, and right away the dense point cloud looks very poor. Could this be caused by bad or too few points from ORB-SLAM2, or could tweaking the input make the difference?

I exported the camera poses and point cloud from ORB-SLAM2 using the monocular setting on the freiburg1_room sequence. Here are the resulting sparse clouds/camera poses compared to VisualSFM from the same keyframes: [ORB-SLAM2 image] vs [VisualSFM image]

Here are the densified point clouds on default OpenMVS settings: [ORB-SLAM2 image] vs [VisualSFM image]

What stands out to me is that the camera trajectory from ORB-SLAM2 looks more accurate to the video sequence, but the points are sparser. I don't understand why the VisualSFM point cloud gives such a better result; visually the ORB-SLAM2 cloud doesn't seem that much worse?

Unless I'm exporting the SLAM output incorrectly, I can't think of anything besides the sparse cloud from SLAM being much worse. Do you know what I could do to achieve a better dense cloud? I've tried Meshroom on the same keyframes and the result is very good, but it's incredibly slow.

cdcseacave commented 1 year ago

You are probably comparing apples with oranges: SLAM is a real-time reconstruction pipeline that needs a video as input (very dense frames), while SfM needs sparse frames as input (wide baseline).

danielj-genesis commented 1 year ago

> You are probably comparing apples with oranges: SLAM is a real-time reconstruction pipeline that needs a video as input (very dense frames), while SfM needs sparse frames as input (wide baseline).

Both still produce a sparse point cloud as output, and I'm running on video and want live reconstruction. I've been testing more and have actually gotten much better results by tweaking SLAM. OpenMVS is doing a decent job as long as the sparse cloud is pretty good.

I do notice that in spots where the sparse cloud is really good, the OpenMVS dense cloud looks perfect, but there are lots of smaller gaps. Is there any sort of setting or functionality in OpenMVS for filling larger uniform gaps when densifying, or does that have to be done on the sparse cloud before interfacing with OpenMVS?

For example, with the image above, is there any way to have OpenMVS fill these odd gaps with a uniform point cloud? [image]

I'm aware that the meshing resolves this, but I'd like it done in the dense cloud if possible?

cdcseacave commented 1 year ago

The gaps, both in the sparse and the dense point cloud, are due to textureless areas or a lack of good image overlap. A way to improve this is higher-quality images (sharp, large resolution, good lighting, etc.) to get more detail even on textureless surfaces, and better overlap.

ddkats commented 1 year ago

hi @danielj-genesis,

A bit late to the party, but can you explain how you visualize the ORB-SLAM output in MeshLab? I managed to extract the camera poses along with the points of the scene from ORB-SLAM3 into a txt file, but I struggle to represent it in 3D format. The format is like this: ID, fx, fy, cx, cy, qw, qx, qy, qz, tx, ty, tz, imageID and ... MapPointWorldPos(x), MapPointWorldPos(y), MapPointWorldPos(z), observations
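
One simple route for that last step is to write the map points as an ASCII PLY, which MeshLab opens directly. A minimal sketch, with `Point3` and `WritePLY` as placeholder names and the parsing of the camera lines omitted:

```cpp
// Write 3D map points (x, y, z per point) as an ASCII PLY for MeshLab.
#include <cstdio>
#include <vector>

struct Point3 { double x, y, z; }; // hypothetical holder for MapPointWorldPos

bool WritePLY(const char* path, const std::vector<Point3>& pts) {
    FILE* f = std::fopen(path, "w");
    if (f == nullptr)
        return false;
    // Minimal PLY header: vertex count plus x/y/z float properties.
    std::fprintf(f,
                 "ply\nformat ascii 1.0\n"
                 "element vertex %zu\n"
                 "property float x\nproperty float y\nproperty float z\n"
                 "end_header\n",
                 pts.size());
    for (const Point3& p : pts)
        std::fprintf(f, "%g %g %g\n", p.x, p.y, p.z);
    std::fclose(f);
    return true;
}
```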

cdcseacave commented 1 year ago

If you finish the MVS interface, you can visualize it with the internal Viewer in OpenMVS, and can also export it as PLY or OBJ.

ddkats commented 1 year ago

> If you finish the MVS interface, you can visualize it with the internal Viewer in OpenMVS, and can also export it as PLY or OBJ.

Could you please offer some help with the MVS interface? I can attach the code for reference.

cdcseacave commented 1 year ago

Yes. But did you open the Interface.h file? It is only one structure that needs to be filled, and that is all.
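
For reference, a rough sketch of filling that structure from SLAM output and serializing it; the field names follow recent versions of libs/MVS/Interface.h, so verify them against your copy, and the `Keyframe`/`MapPoint` types are hypothetical stand-ins for whatever the SLAM side exports. Note that OpenMVS poses store the world-to-camera rotation R and the camera center C (x_cam = R * (X - C)), so an ORB-SLAM pose given as [R|t] needs C = -R^T * t.

```cpp
// Rough sketch of filling MVS::Interface from SLAM output and writing a .mvs.
#include <cstdint>
#include <string>
#include <vector>
#include "Interface.h" // self-contained header from OpenMVS (libs/MVS)

// Hypothetical stand-ins for the SLAM export; R/C are already converted to
// the OpenMVS convention described above (C = -R^T * t for an [R|t] pose).
struct Keyframe {
    std::string imagePath;
    MVS::Interface::Mat33d R; // world-to-camera rotation
    MVS::Interface::Pos3d C;  // camera center in world coordinates
};
struct MapPoint {
    MVS::Interface::Pos3f X;            // world position
    std::vector<uint32_t> observations; // indices of observing keyframes
};

bool SaveSceneMVS(const std::string& path,
                  const MVS::Interface::Platform::Camera& camera, // K (pixels), width, height set
                  const std::vector<Keyframe>& keyframes,
                  const std::vector<MapPoint>& points)
{
    MVS::Interface scene;

    // One platform sharing a single physical camera across all keyframes.
    MVS::Interface::Platform platform;
    platform.cameras.push_back(camera);

    for (uint32_t i = 0; i < (uint32_t)keyframes.size(); ++i) {
        MVS::Interface::Platform::Pose pose;
        pose.R = keyframes[i].R;
        pose.C = keyframes[i].C;
        platform.poses.push_back(pose);

        MVS::Interface::Image image;
        image.name = keyframes[i].imagePath;
        image.platformID = 0;
        image.cameraID = 0;
        image.poseID = i;
        image.ID = i;
        scene.images.push_back(image);
    }
    scene.platforms.push_back(platform);

    // Sparse points with their visibility lists, used to seed densification.
    for (const MapPoint& pt : points) {
        MVS::Interface::Vertex vertex;
        vertex.X = pt.X;
        for (uint32_t kf : pt.observations) {
            MVS::Interface::Vertex::View view;
            view.imageID = kf; // matches Image::ID above
            view.confidence = 0.f;
            vertex.views.push_back(view);
        }
        scene.vertices.push_back(vertex);
    }

    // Serialize to a .mvs file readable by DensifyPointCloud and the Viewer.
    return MVS::ARCHIVE::SerializeSave(scene, path);
}
```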

ddkats commented 1 year ago

> Yes. But did you open the Interface.h file? It is only one structure that needs to be filled, and that is all.

Thanks, I have sent you an email.

cdcseacave commented 1 year ago

I got your email. It looks fine; I'm not sure what you want me to do. I cannot compile it as there are missing functions; however, the approach should work.

Two suggestions though:

cdcseacave commented 1 year ago

Here is an example: interface.zip