Closed mihaibujanca closed 3 years ago
Tree search is not very GPU-friendly, and I remember the Fusion4D paper reporting a similar problem. Their solution is to use thread-level parallelism to run image processing and reconstruction concurrently, which is not implemented in this repo yet.
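To make the thread-level parallelism idea concrete, here is a minimal sketch of a two-stage pipeline in plain C++: one thread stands in for image processing and hands frames to a second stage that stands in for reconstruction. Everything in it (`ProcessedFrame`, `FrameQueue`, `runPipeline`, the queue capacity of 2) is hypothetical and made up for illustration. It is not surfelwarp code and not how Fusion4D implements it.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for the output of the image-processing stage.
struct ProcessedFrame {
    int index;
};

// Bounded FIFO queue: lets image processing run ahead of reconstruction
// by at most `capacity` frames.
class FrameQueue {
public:
    explicit FrameQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Blocks while the queue is full.
    void push(const ProcessedFrame& frame) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return queue_.size() < capacity_; });
        queue_.push(frame);
        not_empty_.notify_one();
    }

    // Signals that no more frames will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        not_empty_.notify_all();
    }

    // Blocks until a frame is available; returns false once the queue
    // is closed and fully drained.
    bool pop(ProcessedFrame& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) {
            return false;
        }
        out = queue_.front();
        queue_.pop();
        not_full_.notify_one();
        return true;
    }

private:
    std::size_t capacity_;
    std::queue<ProcessedFrame> queue_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    bool closed_ = false;
};

// Runs both stages concurrently and returns the frame indices in the
// order the reconstruction stage consumed them.
std::vector<int> runPipeline(int num_frames) {
    FrameQueue queue(2);
    std::vector<int> reconstructed;

    // Stage 1: placeholder for image processing (e.g. correspondence search).
    std::thread producer([&queue, num_frames] {
        for (int i = 0; i < num_frames; ++i) {
            queue.push(ProcessedFrame{i});
        }
        queue.close();
    });

    // Stage 2: placeholder for reconstruction, run on the calling thread.
    ProcessedFrame frame;
    while (queue.pop(frame)) {
        reconstructed.push_back(frame.index);
    }

    producer.join();
    return reconstructed;
}
```

Calling `runPipeline(5)` returns the indices 0 through 4 in order, since there is a single producer and a single consumer on a FIFO queue. In a GPU setting each stage would presumably issue its work on a separate CUDA stream so the two can overlap on the device; that part is left out here and is an assumption about how such a design might look.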
Perhaps a learned optical flow would be better than GPC? Seems to be a good candidate.
Hi @weigao95, I've profiled surfelwarp and it seems like `surfelwarp::device::buildColliderKeyValueKernel` takes up 40-50% of CUDA time (and most of it seems to be spent on waiting to synchronize). Do you have any suggestions on how this might be improved?

On a related note, I've tried increasing `num_trees` in `PatchColliderRGBCorrespondence`, but any value above 5 results in a crash in `PatchColliderRGBCorrespondence::FindCorrespondence`.

Do you think it's worth trying to improve the sparse matching based on GPC or should I look into alternative methods?
Thanks