Hello, in your paper you describe the following:

"For each feature map F_i ∈ {F_g, F_s, F_c} representing the geometric feature, the DINOv2 feature, and the respective RGB value, we employ a separate ResNet model L_i as feature extractor."
However, in your code, I see something different. From my understanding, this is the input to the 'Dual-stream fusion' step:
x = self.spherical_fpn(dis_map, torch.cat([rgb_map, ref_map], dim=1), ppf_map)  # each input feature passes through a separate ResNet here
where
- dis_map: radial distance features in spherical map representation
- torch.cat([rgb_map, ref_map], dim=1): RGB features and DINOv2 features in spherical map representation
- ppf_map: geometric point-pair features in spherical map representation
I have two questions regarding this:
1. Is the above understanding correct?
2. How do these inputs correspond to the features described in the paper? Specifically, which feature does dis_map correspond to? From the paper, I would expect rgb_map = F_c, ref_map = F_s, and ppf_map = F_g, each with its own ResNet. In the code, however, rgb_map and ref_map are concatenated and treated as one feature rather than two, and dis_map is passed as an input feature even though the paper does not describe it.
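To make the mismatch concrete, here is a minimal sketch of how I read the call. It assumes the maps use a (batch, channels, height, width) layout, and the sizes below are made up for illustration, not taken from the repository. It only shows that concatenating along dim=1 leaves three inputs (dis, rgb+ref, ppf), not the three features F_g, F_s, F_c named in the paper.

```python
import torch

# Hypothetical sizes, chosen only for illustration (not from the repository).
B, H, W = 2, 8, 16                        # batch size and spherical-map resolution
C_DIS, C_RGB, C_REF, C_PPF = 1, 3, 4, 5   # made-up channel counts

dis_map = torch.zeros(B, C_DIS, H, W)
rgb_map = torch.zeros(B, C_RGB, H, W)
ref_map = torch.zeros(B, C_REF, H, W)
ppf_map = torch.zeros(B, C_PPF, H, W)

# Concatenating along dim=1 stacks the channels, producing ONE tensor
# with C_RGB + C_REF channels instead of two separate feature maps.
rgb_ref_map = torch.cat([rgb_map, ref_map], dim=1)

# These are the three tensors that would be handed to the fusion module.
inputs = (dis_map, rgb_ref_map, ppf_map)
channel_counts = [t.shape[1] for t in inputs]
print(len(inputs), channel_counts)
```

If this reading is right, the RGB and DINOv2 features would share one extractor, and the radial distance map would take the place of a third feature stream.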
For reference, the code I am referring to is here:
https://github.com/NOrangeeroli/SecondPose/blob/89725402284c3478a217bf7c8806985f58aab287/model/VI_Net_geodino.py#L362-L367

Thanks for your time!