med-air / EndoNeRF

Neural Rendering for Stereo 3D Reconstruction of Deformable Tissues in Robotic Surgery
https://med-air.github.io/EndoNeRF/
173 stars, 14 forks

Masks used during the training and inference #28

Closed: yifliu3 closed this issue 6 months ago

yifliu3 commented 6 months ago

Dear authors,

I notice that you use "masks" during training but "gt_masks" in the README.md file when evaluating performance. Could you explain the reason for using different masks, or is it simply a typo? I found that using inconsistent mask types causes a performance drop of 10 dB in PSNR.

yuehaowang commented 6 months ago

Thanks for your good question.

The "masks" set is a merge of the occlusion masks and "gt_masks". Depth estimation fails in the occluded areas, and fitting those pixels with unreliable depths would cause artifacts in the geometry, so I use "masks" for training. Since ground-truth pixels are available for the occluded areas, I use "gt_masks" directly for evaluation.
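As an illustration of the merge described above, here is a minimal sketch. It assumes that nonzero mask values mark pixels to exclude and that "merge" means a pixel-wise union; the function name `merge_masks` and the toy arrays are hypothetical and are not taken from the repository.

```python
import numpy as np

def merge_masks(gt_mask, occlusion_mask):
    """Pixel-wise union of two masks (assumption: nonzero = excluded pixel)."""
    return np.logical_or(gt_mask.astype(bool), occlusion_mask.astype(bool))

# Toy 2x3 masks, made up for illustration only.
gt_mask = np.array([[1, 0, 0],
                    [0, 0, 0]], dtype=np.uint8)
occlusion_mask = np.array([[0, 0, 1],
                           [0, 0, 0]], dtype=np.uint8)

train_mask = merge_masks(gt_mask, occlusion_mask)  # excluded during training
valid_for_training = ~train_mask                   # pixels that contribute to the loss
```

Under these assumptions, the training mask excludes every pixel that either input mask excludes, so pixels with unreliable depth never contribute to the loss.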

Regarding the 10 dB performance drop, I think it mainly comes from the predictions in the occluded areas.
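The effect of the mask choice on PSNR can be sketched as follows. This is a hedged toy example, not the repository's evaluation code: `masked_psnr` is a hypothetical helper, images are assumed to lie in [0, 1], nonzero mask values are assumed to mark excluded pixels, and all numbers are synthetic.

```python
import numpy as np

def masked_psnr(pred, target, exclude_mask):
    """PSNR over pixels that are NOT excluded (assumption: images in [0, 1])."""
    valid = ~exclude_mask.astype(bool)
    mse = np.mean((pred[valid] - target[valid]) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(1.0 / mse)

# Synthetic 10x10 RGB frame: small error everywhere, large error in one row
# that stands in for an occluded region.
target = np.full((10, 10, 3), 0.4)
pred = target + 0.01
pred[1] = target[1] + 0.5

gt_mask = np.zeros((10, 10), dtype=bool)
gt_mask[0] = True                        # hypothetical "gt_masks" region
occlusion_mask = np.zeros((10, 10), dtype=bool)
occlusion_mask[1] = True                 # hypothetical occluded region
merged_mask = gt_mask | occlusion_mask   # hypothetical "masks"

psnr_merged = masked_psnr(pred, target, merged_mask)  # occluded row ignored
psnr_gt = masked_psnr(pred, target, gt_mask)          # occluded row scored
```

In this toy setup `psnr_gt` is lower than `psnr_merged`, because the poorly predicted occluded row is scored only when the evaluation mask does not exclude it. The size of the gap here depends entirely on the made-up error values.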

yifliu3 commented 6 months ago

Got it. Thanks a lot for your reply.