Closed Xuanmeng-Zhang closed 3 years ago
Hi @Xuanmeng-Zhang,
All the ablation results reported in Tab. 3 of the main paper are trained with 912 x 228 center-cropped patches for 20 epochs with a batch size of 12 images. (Please refer to Sec. 6.3.)
The final validation performance of the full model trained on the full data is 771.8, which is reported in this repository.
Thank you!
Thank you for your answer! I have some questions about the code as follows.
```python
for idx_off in range(0, self.num + 1):
    ww = idx_off % self.k_f
    hh = idx_off // self.k_f

    if ww == (self.k_f - 1) / 2 and hh == (self.k_f - 1) / 2:
        continue

    offset_tmp = offset_each[idx_off].detach()

    offset_tmp[:, 0, :, :] = \
        offset_tmp[:, 0, :, :] + hh - (self.k_f - 1) / 2
    offset_tmp[:, 1, :, :] = \
        offset_tmp[:, 1, :, :] + ww - (self.k_f - 1) / 2

    conf_tmp = ModulatedDeformConvFunction.apply(
        confidence, offset_tmp, modulation_dummy, self.w_conf,
        self.b, self.stride, 0, self.dilation, self.groups,
        self.deformable_groups, self.im2col_step)
    list_conf.append(conf_tmp)
```
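To make sure I read the loop correctly, here is a toy trace (pure Python, assuming `k_f = 3`, i.e. a 3x3 propagation kernel, so `self.num = k_f * k_f - 1 = 8`) of how each neighbor index is mapped to a shift relative to the kernel center, which is what gets added to `offset_tmp`:

```python
# Toy trace (not repository code) of the index -> shift mapping in the loop,
# assuming k_f = 3 (a 3x3 kernel); the center tap is skipped as in the loop.
k_f = 3
center = (k_f - 1) // 2  # = 1, the center position on the kernel grid

shifts = []
for idx_off in range(k_f * k_f):
    ww = idx_off % k_f   # column on the kernel grid
    hh = idx_off // k_f  # row on the kernel grid
    if ww == center and hh == center:
        continue  # skip the center pixel, as in the loop above
    # (hh - center, ww - center) is the shift added to offset_tmp
    shifts.append((hh - center, ww - center))

print(shifts)
# -> [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
```

So each `offset_tmp` becomes an absolute offset to one specific neighbor position, and the eight neighbors around the center are enumerated exactly once.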
I wonder why offset_tmp is used to compute conf_tmp. In other words, can we use the variable offset to compute conf_tmp, as in the following code?
```python
def _propagate_once(self, feat, offset, aff):
    feat = ModulatedDeformConvFunction.apply(
        feat, offset, aff, self.w, self.b, self.stride, self.padding,
        self.dilation, self.groups, self.deformable_groups,
        self.im2col_step
    )

    return feat
```
Thank you!
The current implementation appends each neighbor's confidence to a list, and the final confidence volume for the neighbors is constructed by concatenating those per-neighbor confidences.
If we directly use a single ModulatedDeformConvFunction call instead, it sums the contributions of all sampling locations into each output channel, so it is difficult to recover each neighbor's confidence separately.
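A minimal pure-Python sketch (hypothetical values, not actual repository code) of this distinction:

```python
def gather_separately(neighbor_confs):
    # Mirrors the loop above: one deformable conv call per neighbor offset,
    # and the per-neighbor results are kept in a list (later concatenated),
    # so each neighbor's confidence remains individually accessible.
    return list(neighbor_confs)

def gather_summed(neighbor_confs):
    # Mirrors a single ModulatedDeformConvFunction call over all offsets:
    # the contributions of all sampling locations are summed inside the
    # convolution, so only the aggregate survives.
    return sum(neighbor_confs)

confs = [2, 5, 3]                  # hypothetical per-neighbor confidences
print(gather_separately(confs))    # [2, 5, 3] -> per-neighbor info kept
print(gather_summed(confs))        # 10 -> only the aggregate remains
```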
I think using groups or deformable_groups would enable a more efficient implementation, but at the time of development I adopted the implementation that is guaranteed correct, although it is slightly inefficient.
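To illustrate the groups idea with a toy 1x1 convolution (pure Python, not the deformable-conv API): when the number of groups equals the number of channels, each output channel depends only on its own input channel, which is why grouping could keep per-neighbor confidences separate within a single call.

```python
def conv1x1_dense(x, w):
    # Ungrouped 1x1 conv: every output channel mixes all input channels.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def conv1x1_grouped(x, w_diag):
    # groups == number of channels: each channel has its own scalar weight,
    # so output channel i depends only on input channel i, and per-neighbor
    # information survives a single convolution call.
    return [w * xi for w, xi in zip(w_diag, x)]

x = [2, 5, 3]                         # hypothetical per-neighbor confidences
print(conv1x1_dense(x, [[1, 1, 1]]))  # [10] -> neighbors mixed together
print(conv1x1_grouped(x, [1, 1, 1]))  # [2, 5, 3] -> neighbors kept apart
```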
Hi, Park. I'm confused about the result in Table 3(m), where the RMSE is 884.1 mm. I think setting (m) is the full proposed framework. What is the difference between (m) and the released model on the validation dataset (771.8)?