guxinqian / AP3D

Pytorch implementation of "Appearance-Preserving 3D Convolution for Video-based Person Re-identification"
Apache License 2.0
97 stars, 24 forks

How should this part be understood? #13

Closed. kanagi2587 closed this issue 2 years ago

kanagi2587 commented 3 years ago

x_norm_expand = x_norm.unsqueeze(3).expand(-1, -1, -1, N-1, -1, -1).permute(0, 2, 3, 4, 5, 1)

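I understand the idea of copying the features in order to compute the correlation matrix, but why are the extra dimension (unsqueeze) and the expand needed here? The same pattern appears later in contrastive_att. I ask because I never came across 6-dimensional tensors when working in other areas.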

guxinqian commented 3 years ago

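The input to an ordinary 2D convolution is NCHW, and the input to a 3D convolution is NCTHW. This line is meant to make two copies of each of the T frames of the original input, which is why it uses unsqueeze and expand. It corresponds to the lower part beneath the second column of feature maps in Figure 2 of the original paper.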
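To make the shape changes concrete, here is a minimal sketch that traces only the tensor shapes through the three calls in the quoted line. It rests on assumptions not stated in the thread: that x_norm has the NCTHW layout (B, C, T, H, W) and that N = 3, so that N-1 = 2 matches the "two copies" mentioned above. The helper functions are hypothetical pure-Python stand-ins that mimic the shape behaviour of torch's unsqueeze, expand, and permute; they are not part of AP3D or PyTorch.

```python
# Shape bookkeeping only: no tensors are allocated and torch is not required.
# The helpers below are hypothetical stand-ins for the shape rules of
# Tensor.unsqueeze, Tensor.expand and Tensor.permute.

def unsqueeze(shape, dim):
    """Shape after inserting a size-1 axis at position `dim`."""
    return shape[:dim] + (1,) + shape[dim:]

def expand(shape, sizes):
    """Shape after expand: -1 keeps an axis, a size-1 axis may grow."""
    if len(shape) != len(sizes):
        raise ValueError("this sketch expects one size per existing axis")
    out = []
    for cur, new in zip(shape, sizes):
        if new == -1 or new == cur:
            out.append(cur)
        elif cur == 1:
            out.append(new)
        else:
            raise ValueError(f"cannot expand axis of size {cur} to {new}")
    return tuple(out)

def permute(shape, order):
    """Shape after reordering axes according to `order`."""
    return tuple(shape[i] for i in order)

# Assumed example sizes (not taken from the repository).
B, C, T, H, W = 2, 16, 4, 8, 4
N = 3  # assumption: N-1 = 2 copies per frame

x_norm = (B, C, T, H, W)                              # NCTHW
step1 = unsqueeze(x_norm, 3)                          # (B, C, T, 1, H, W)
step2 = expand(step1, (-1, -1, -1, N - 1, -1, -1))    # (B, C, T, N-1, H, W)
step3 = permute(step2, (0, 2, 3, 4, 5, 1))            # (B, T, N-1, H, W, C)

print(step1, step2, step3)
```

Read this way, the sixth dimension is just the "copy" axis: each of the T frames is repeated N-1 times, and the final permute moves the channel axis to the end. In PyTorch, expand returns a view rather than allocating new memory, so the repetition itself is cheap until a later call such as contiguous() materialises it.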