Open nizhezhiwei opened 1 year ago
self.conv.weight[:, :, 0, :, :].sum(2).sum(2) + self.conv.weight[:, :, 2, :, :].sum(2).sum(2) processes the convolution kernel. The kernel is 3x3x3 and its weight is laid out as OIDHW, where the third axis (D, depth) is time. Indexing that axis with 0 or 2 selects the first or third temporal slice and leaves an (O, I, H, W) tensor, so the two .sum(2) calls accumulate over H and then over W. This is a sum over the 3x3 spatial window, which is proportional to its average (9 times it) rather than the average itself. The sums from the first and third frames are then added together. At each step of the convolution, this combined weight is applied to the centre (second-frame) input, and that term is subtracted from the ordinary convolution output.
According to the original paper, the output at the current spatio-temporal position in the feature map (that is, the central position of the convolution kernel) should be subtracted from the adjacent outputs (i.e., those of the previous and next frames). I think the code is consistent with the paper.
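To make the equivalence concrete, here is a minimal pure-Python sketch for a single input channel, a single output channel, and a single kernel position. It is not the repository code: the helper names (make_cube, fused_form, explicit_form) are hypothetical, the value 0.6 is taken from the question in this thread, and the "explicit" form is my reading of the difference described above, where the centre value scaled by theta is subtracted from every tap in the previous and next frames.

```python
import math
import random

THETA = 0.6  # weighting mentioned in the question; treated as an assumption here


def make_cube(rng):
    """Return a 3x3x3 nested list indexed as [t][h][w]."""
    return [[[rng.uniform(-1.0, 1.0) for _ in range(3)]
             for _ in range(3)]
            for _ in range(3)]


def fused_form(weight, patch, theta=THETA):
    """out_normal - theta * out_diff, following the expression discussed above."""
    # Ordinary 3x3x3 convolution at one position.
    out_normal = sum(weight[t][h][w] * patch[t][h][w]
                     for t in range(3) for h in range(3) for w in range(3))
    # Sum the kernel over H and W for the first (t=0) and third (t=2) slices.
    kernel_diff = sum(weight[t][h][w]
                      for t in (0, 2) for h in range(3) for w in range(3))
    # A 1x1x1 kernel only sees the centre value of the patch.
    out_diff = kernel_diff * patch[1][1][1]
    return out_normal - theta * out_diff


def explicit_form(weight, patch, theta=THETA):
    """Centre slice as usual; adjacent slices use (x - theta * x_centre)."""
    centre = patch[1][1][1]
    total = 0.0
    for t in range(3):
        for h in range(3):
            for w in range(3):
                if t == 1:
                    total += weight[t][h][w] * patch[t][h][w]
                else:
                    total += weight[t][h][w] * (patch[t][h][w] - theta * centre)
    return total


rng = random.Random(0)
weight = make_cube(rng)
patch = make_cube(rng)
fused = fused_form(weight, patch)
explicit = explicit_form(weight, patch)
# The two forms should agree up to floating-point rounding.
print(math.isclose(fused, explicit, rel_tol=1e-9, abs_tol=1e-12))
```

Under these assumptions the fused form is just an algebraic rearrangement: the centre value is factored out of the adjacent-frame terms, so a single extra 1x1x1 convolution with the summed weights replaces 18 separate subtractions per position.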
I reproduced this model and made a fair comparison with other models on public datasets. If interested, check out https://github.com/KegangWangCCNU/PhysBench
Thank you very much for your reply.
I'm having trouble with the TDC module. The paper explains that it uses the TDC module proposed by AutoHR. However, I don't think the CDC_T code implements the TDC formula from AutoHR. Why does it use out_normal - 0.6*out_diff? out_diff is computed by first combining the weights of the t0 and t2 slices of the convolution kernel and then running a 3D convolution with the result. I am very confused. Can you explain? Thank you.