CaffreyR closed this issue 1 year ago
x[0] and x[1] are the outputs of the RGB and depth streams, respectively.
The TokenFusion operation takes place in https://github.com/huawei-noah/noah-research/blob/master/TokenFusion/semantic_segmentation/models/mix_transformer.py#L122
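To illustrate the idea, here is a minimal sketch of a token exchange between the two streams. It is not the code from the repository: the function name `exchange_tokens`, the per-token `scores`, and the `threshold` value are assumptions made for this example. Tokens whose score falls below the threshold in one stream are replaced by the tokens at the same positions in the other stream.

```python
import torch


def exchange_tokens(x, scores, threshold=0.5):
    """Swap low-scoring tokens between two modality streams.

    x:      [rgb_tokens, depth_tokens], each of shape (B, N, C)
    scores: [rgb_scores, depth_scores], each of shape (B, N)
    threshold is a hypothetical value chosen for this sketch.
    """
    rgb, depth = x
    # Boolean masks of tokens to keep, broadcast over the channel dimension.
    keep_rgb = (scores[0] >= threshold).unsqueeze(-1)    # (B, N, 1)
    keep_depth = (scores[1] >= threshold).unsqueeze(-1)  # (B, N, 1)
    # Keep a stream's own token where its score is high enough,
    # otherwise take the token at the same position from the other stream.
    fused_rgb = torch.where(keep_rgb, rgb, depth)
    fused_depth = torch.where(keep_depth, depth, rgb)
    return [fused_rgb, fused_depth]


# Toy inputs: RGB tokens are all ones, depth tokens are all zeros.
rgb = torch.ones(1, 4, 2)
depth = torch.zeros(1, 4, 2)
scores = [
    torch.tensor([[0.9, 0.0, 0.9, 0.0]]),  # RGB scores
    torch.tensor([[0.9, 0.9, 0.0, 0.0]]),  # depth scores
]
fused = exchange_tokens([rgb, depth], scores)
```

In this toy run, `fused[0]` keeps RGB tokens 0 and 2 and takes tokens 1 and 3 from depth, while `fused[1]` keeps depth tokens 0 and 1 and takes tokens 2 and 3 from RGB. The output is again a two-element list, so it can be indexed as x[0] and x[1] like the input.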
Hi, many thanks for your work. I am trying to reproduce your code. Your forward pass has 4 stages, and in each stage you use this code
Do x[0] and x[1] refer to the RGB and depth inputs? If so, when does TokenFusion take place?
Many thanks!