huawei-noah / noah-research

Noah Research

[TokenFusion] What is the meaning of x[0] and x[1] in TokenFusion segmentation? #162

Closed CaffreyR closed 1 year ago

CaffreyR commented 1 year ago

Hi, many thanks for your work. While trying to reproduce your code, I noticed that the forward pass has 4 stages, and each stage uses code like this (stage 4 shown):

x, H, W = self.patch_embed4(x)
for i, blk in enumerate(self.block4):
    score = self.score_predictor[3](x)
    mask = [F.softmax(score_.reshape(B, -1, 2), dim=2)[:, :, 0] for score_ in score]  # mask_: [B, N]
    masks.append(mask)
    x = blk(x, H, W, mask)
x = self.norm4(x)
x = [x_.reshape(B, H, W, -1).permute(0, 3, 1, 2).contiguous() for x_ in x]
outs0.append(x[0])
outs1.append(x[1])
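
To make the shapes in the `mask` line concrete, here is a minimal NumPy sketch of that one step outside the model. It assumes `score` is a list with one `[B, N, 2]` array per input branch, which is what the list comprehension and the `[B, N]` comment suggest; `B`, `N`, the random scores, and the `softmax` helper are made up for illustration and are not taken from the repository.

```python
import numpy as np

def softmax(a, axis):
    # Numerically stable softmax, standing in for F.softmax(..., dim=axis).
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

B, N = 2, 6  # hypothetical batch size and number of tokens
rng = np.random.default_rng(0)

# Assumption: one score array per input branch, each of shape [B, N, 2].
score = [rng.normal(size=(B, N, 2)) for _ in range(2)]

# Same expression as in the snippet: softmax over the last axis, keep channel 0.
mask = [softmax(s.reshape(B, -1, 2), axis=2)[:, :, 0] for s in score]

for i, m in enumerate(mask):
    print(f"mask[{i}] shape: {m.shape}")
```

Under these assumptions, `mask` is a list with one `[B, N]` array per branch, and each entry is a per-token value between 0 and 1, indexed the same way as `x[0]` and `x[1]`.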

Do x[0] and x[1] refer to the RGB and depth inputs, respectively? If so, where does the token fusion take place?

Many thanks!

xinghaochen commented 1 year ago