youngwanLEE / MPViT

[CVPR 2022] MPViT: Multi-Path Vision Transformer for Dense Prediction
https://arxiv.org/abs/2112.11010

about the channels #7

Open BingyuanW opened 2 years ago

BingyuanW commented 2 years ago

Thanks for the great work!

Why is the in_channels of decode_head [224, 368, 480, 480] rather than [128, 224, 368, 480] for MPViT-Base?

Looking forward to your reply. Thanks again.

youngwanLEE commented 2 years ago

@BingyuanW Hi

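As mentioned in our paper, each stage outputs feature maps whose channel count equals the embedding dimension of the next stage. That is why the in_channels of decode_head are shifted by one stage relative to the embedding dimensions.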
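
To make the shift concrete, here is a minimal sketch of the arithmetic using the MPViT-Base numbers from the question. The helper name `stage_out_channels` is hypothetical and not part of the repository. It also assumes that the last stage, which has no following stage, keeps its own embedding width; that assumption is inferred from the two lists quoted above.

```python
def stage_out_channels(embed_dims):
    """Hypothetical helper: channels emitted by each stage.

    Stage i outputs the embedding width of stage i + 1.
    Assumption: the last stage has no successor, so it keeps its own width.
    """
    dims = list(embed_dims)
    return dims[1:] + dims[-1:]


# Embedding dimensions quoted in the question for MPViT-Base.
embed_dims = [128, 224, 368, 480]

# These are the values the decode head receives as in_channels.
in_channels = stage_out_channels(embed_dims)
print(in_channels)
```

Under these assumptions the result matches the [224, 368, 480, 480] from the config, so the decode head has to be configured with the shifted list and not with the raw embedding dimensions.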