WangYixuan12 / d3fields

[CoRL 24] D^3Fields: Dynamic 3D Descriptor Fields for Zero-Shot Generalizable Robotic Manipulation
https://robopil.github.io/d3fields/
MIT License

Question about the shape of semantic feature map #5

Closed by Gloryseven 6 months ago

Gloryseven commented 8 months ago

Hello! The DINOv2 feature map has size `patch_h, patch_w`, but the mask image has size `H, W`. In the interpolation section of the paper, both are written with the same size (`H, W`). How is this handled in the code?

WangYixuan12 commented 8 months ago

During interpolation, each 3D point is projected into 2D image space and its coordinates are normalized to 0~1. Because the sampling location is expressed in normalized coordinates rather than pixel indices, it does not matter if H does not equal patch_h. For more details, see https://pytorch.org/docs/stable/generated/torch.nn.functional.grid_sample.html
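
To illustrate the point, here is a minimal sketch (not the d3fields code) of sampling two maps of different resolution at the same normalized locations. The helper name `sample_at_normalized_points`, the map sizes, and the ramp data are all hypothetical. Note that `grid_sample` itself takes coordinates in [-1, 1], so the sketch assumes the 0~1 coordinates are rescaled before the call; `align_corners=True` is likewise an assumption made here for a clean comparison.

```python
import torch
import torch.nn.functional as F


def sample_at_normalized_points(image: torch.Tensor, uv01: torch.Tensor) -> torch.Tensor:
    """Bilinearly sample a (C, H, W) image at N points given as (u, v) in [0, 1].

    u runs along the width and v along the height. Returns an (N, C) tensor.
    (Hypothetical helper, not part of d3fields.)
    """
    # grid_sample expects coordinates in [-1, 1], ordered (x, y).
    grid = (uv01 * 2.0 - 1.0).view(1, 1, -1, 2)
    sampled = F.grid_sample(image[None], grid, mode="bilinear", align_corners=True)
    # (1, C, 1, N) -> (N, C)
    return sampled[0, :, 0, :].T


# Hypothetical sizes: a coarse feature map and a full-resolution mask.
patch_h, patch_w, H, W = 4, 6, 40, 60

# Both maps store the same horizontal ramp (value = u) at different resolutions.
feat = torch.linspace(0, 1, patch_w).repeat(patch_h, 1)[None]  # (1, patch_h, patch_w)
mask = torch.linspace(0, 1, W).repeat(H, 1)[None]              # (1, H, W)

# Projected points, already divided by the image size so they lie in [0, 1].
uv01 = torch.tensor([[0.0, 0.0], [0.25, 0.5], [1.0, 1.0]])

feat_vals = sample_at_normalized_points(feat, uv01)
mask_vals = sample_at_normalized_points(mask, uv01)
print(feat_vals.squeeze(1), mask_vals.squeeze(1))
```

The same `uv01` tensor indexes both maps, and neither call needs to know the other map's resolution, which is why the paper can write both lookups with the same notation.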