weigao95 / surfelwarp

SurfelWarp: Efficient Non-Volumetric Dynamic Reconstruction
https://sites.google.com/view/surfelwarp/home
BSD 3-Clause "New" or "Revised" License
276 stars, 71 forks

Speeding up sparse feature matching #53

Closed: mihaibujanca closed this issue 3 years ago

mihaibujanca commented 3 years ago

Hi @weigao95, I've profiled surfelwarp and surfelwarp::device::buildColliderKeyValueKernel seems to take up 40-50% of CUDA time, with most of that apparently spent waiting on synchronization. Do you have any suggestions on how this might be improved?

On a related note, I've tried increasing num_trees in PatchColliderRGBCorrespondence, but any value above 5 results in a crash in PatchColliderRGBCorrespondence::FindCorrespondence.

Do you think it's worth trying to improve the sparse matching based on GPC or should I look into alternative methods?

Thanks

weigao95 commented 3 years ago

Tree search is not very GPU-friendly, and I remember the Fusion4D paper reporting a similar problem. Their solution is to use thread-level parallelism to run image processing and reconstruction concurrently, which is not implemented in this repo yet.
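
To illustrate the thread-level parallelism mentioned above, here is a minimal C++ sketch of a two-stage pipeline: one thread runs image processing for upcoming frames while the calling thread runs reconstruction on frames that are already processed. It is not surfelwarp or Fusion4D code. The names ProcessedFrame, FrameQueue, process_image and run_pipeline are hypothetical, and the per-frame work is a placeholder.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for the output of image processing on one frame
// (for example sparse correspondences). Not a surfelwarp type.
struct ProcessedFrame {
    int frame_index;
    int num_correspondences;
};

// Small bounded blocking queue that connects the two stages.
class FrameQueue {
public:
    explicit FrameQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Blocks while the queue is full, so image processing cannot run
    // arbitrarily far ahead of reconstruction.
    void push(const ProcessedFrame& frame) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return queue_.size() < capacity_; });
        queue_.push(frame);
        not_empty_.notify_one();
    }

    // Signals that no more frames will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        not_empty_.notify_all();
    }

    // Returns false once the queue is closed and fully drained.
    bool pop(ProcessedFrame& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop();
        not_full_.notify_one();
        return true;
    }

private:
    std::size_t capacity_;
    bool closed_ = false;
    std::queue<ProcessedFrame> queue_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
};

// Placeholder for the image-processing stage (assumption: in a real
// system this is where sparse matching would be launched).
ProcessedFrame process_image(int frame_index) {
    return ProcessedFrame{frame_index, 100 + frame_index};
}

// Runs both stages concurrently and returns the frame indices in the
// order the reconstruction stage consumed them.
std::vector<int> run_pipeline(int num_frames) {
    FrameQueue queue(2);  // allow image processing to be up to 2 frames ahead
    std::vector<int> reconstructed;

    std::thread image_thread([&queue, num_frames] {
        for (int i = 0; i < num_frames; ++i) queue.push(process_image(i));
        queue.close();
    });

    // Reconstruction stage: placeholder that only records the frame index.
    ProcessedFrame frame{};
    while (queue.pop(frame)) reconstructed.push_back(frame.frame_index);

    image_thread.join();
    return reconstructed;
}
```

Calling run_pipeline(5) processes five frames, with matching for frame k+1 able to overlap reconstruction of frame k. The bounded queue is a design choice that caps how many processed frames are held in memory at once. In a CUDA setting each stage would presumably also need its own stream for the GPU work to overlap, which this sketch does not show.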

Perhaps a learned optical flow would be better than GPC? It seems to be a good candidate.