facebookresearch / Active-3D-Vision-and-Touch

A repository for the paper Active 3D Shape Reconstruction from Vision and Touch and its robotic touch simulator package.
MIT License

Mask embedding computation #3

Open maurock opened 1 year ago

maurock commented 1 year ago

Hi, thanks for your help so far! I have noticed that the mask embedding, which is computed in the first deformation loop (https://github.com/facebookresearch/Active-3D-Vision-and-Touch/blob/c10da8f94491d6c9bbe55121dccb2e7faa543cfd/pterotactyl/reconstruction/vision/model.py#L230), is computed again in the third deformation loop (https://github.com/facebookresearch/Active-3D-Vision-and-Touch/blob/c10da8f94491d6c9bbe55121dccb2e7faa543cfd/pterotactyl/reconstruction/vision/model.py#L274). Because the mask embedding is fixed throughout the entire training process, shouldn't we reuse the embedding variable computed in the first deformation loop? I am wondering whether computing it twice results in a suboptimal optimisation of the mask embedding network during backpropagation.

I am aware that this is a minor issue, so I am just asking to make sure that I am not missing an important detail. Thanks!

EdwardSmith1884 commented 1 year ago

Yes, I believe you are right. I would say best practice would be to set mask_features to None initially, and then check whether it is None before computing it a second time. I don't have the cycles right now to test this, but if you want to speed things up on your end, I would try this.
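
For anyone who wants to try the suggestion above, here is a minimal, framework-agnostic sketch of the "set to None, compute on first use" pattern. It is not the repository's code: CountingEncoder, forward, num_loops and loops_using_mask are hypothetical stand-ins, and only the name mask_features comes from this thread. The placeholder encoder just counts its calls so the reuse is visible.

```python
class CountingEncoder:
    """Hypothetical stand-in for the mask embedding network; counts its calls."""

    def __init__(self):
        self.calls = 0

    def __call__(self, mask):
        self.calls += 1
        # Placeholder "embedding": the real model would run a network here.
        return [2.0 * m for m in mask]


def forward(mask, encoder, num_loops=3, loops_using_mask=(0, 2)):
    """Run the deformation loops, computing the mask embedding at most once.

    loops_using_mask is an assumption mirroring the thread: the first and
    third loops (indices 0 and 2) consume the mask embedding.
    """
    mask_features = None  # not computed yet
    used = []
    for i in range(num_loops):
        if i in loops_using_mask:
            if mask_features is None:
                # Computed on first use only; later loops reuse the result.
                mask_features = encoder(mask)
            used.append(mask_features)
    return used


encoder = CountingEncoder()
used = forward([0.0, 1.0, 1.0], encoder)
print(encoder.calls)       # 1
print(used[0] is used[1])  # True
```

If the embedding network is deterministic, reusing the cached value should give the same gradients as computing it twice, so the expected benefit is mainly the saved forward pass rather than a change in optimisation. That is an untested expectation, in line with the comment above.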