mlpc-ucsd / CoaT

(ICCV 2021 Oral) CoaT: Co-Scale Conv-Attentional Image Transformers
Apache License 2.0
227 stars 30 forks

Segmentation architecture #11

Open abhigoku10 opened 2 years ago

abhigoku10 commented 2 years ago

@yix081 @xwjabc Thanks for sharing the codebase. I have the following questions:

  1. Can this architecture be adapted for a segmentation task, i.e. semantic segmentation? If so, how?
  2. Can this architecture be adapted for object detection?

Please share your thoughts. Thanks in advance.

xwjabc commented 2 years ago

Hi @abhigoku10, thank you for your questions!

  1. You can follow an approach similar to the one in Swin-Transformer-Semantic-Segmentation.
  2. The current codebase already includes object detection with Deformable DETR. If you would like to use CoaT in Faster R-CNN, you can modify the instance segmentation task in this codebase and keep only the object detection branch. Since the instance segmentation implementation in this repo is based on mmdetection, you can refer to the official docs in mmdetection's repo for more details.
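
As a rough illustration of point 2, here is a minimal sketch of what "keep only the object detection branch" could look like for mmdetection 2.x-style dict configs. The toy config below is an assumption made for illustration and is not the config shipped in this repo; the `CoaT` backbone entry and both helper function names are hypothetical. Only plain Python dicts are used, so the snippet runs without mmdetection installed.

```python
import copy


def mask_rcnn_to_faster_rcnn(model_cfg):
    """Return a copy of a Mask R-CNN style model config without the mask branch."""
    cfg = copy.deepcopy(model_cfg)  # leave the caller's config untouched
    cfg["type"] = "FasterRCNN"
    roi_head = cfg["roi_head"]
    # Drop the two components that only the mask branch uses.
    roi_head.pop("mask_roi_extractor", None)
    roi_head.pop("mask_head", None)
    # Drop mask-only train/test options if they are present.
    cfg.get("train_cfg", {}).get("rcnn", {}).pop("mask_size", None)
    cfg.get("test_cfg", {}).get("rcnn", {}).pop("mask_thr_binary", None)
    return cfg


def drop_masks_from_pipeline(pipeline):
    """Return a copy of a data pipeline that no longer loads or collects masks."""
    steps = copy.deepcopy(pipeline)
    for step in steps:
        if step.get("type") == "LoadAnnotations":
            step["with_mask"] = False
        if step.get("type") == "Collect":
            step["keys"] = [k for k in step["keys"] if k != "gt_masks"]
    return steps


# Toy, heavily abbreviated Mask R-CNN config (assumed layout, hypothetical backbone entry).
mask_model = dict(
    type="MaskRCNN",
    backbone=dict(type="CoaT"),  # hypothetical placeholder for the repo's backbone
    roi_head=dict(
        type="StandardRoIHead",
        bbox_roi_extractor=dict(type="SingleRoIExtractor"),
        bbox_head=dict(type="Shared2FCBBoxHead", num_classes=80),
        mask_roi_extractor=dict(type="SingleRoIExtractor"),
        mask_head=dict(type="FCNMaskHead", num_classes=80),
    ),
    train_cfg=dict(rcnn=dict(mask_size=28)),
    test_cfg=dict(rcnn=dict(score_thr=0.05, mask_thr_binary=0.5)),
)
mask_pipeline = [
    dict(type="LoadAnnotations", with_bbox=True, with_mask=True),
    dict(type="Collect", keys=["img", "gt_bboxes", "gt_labels", "gt_masks"]),
]

det_model = mask_rcnn_to_faster_rcnn(mask_model)
det_pipeline = drop_masks_from_pipeline(mask_pipeline)
```

In a real setup the evaluation metric would presumably also need to be restricted to `bbox`; check the mmdetection docs for the version you are using.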

youngwanLEE commented 2 years ago

@abhigoku10
I have implemented semantic segmentation with MPViT based on CoaT. Refer to MPViT/semantic_segmentation.

@xwjabc Thanks to your wonderful CoaT, MPViT has been accepted to CVPR 2022. I really appreciate your work and code.

yix081 commented 2 years ago

@youngwanLEE That's cool! Big congrats!

youngwanLEE commented 2 years ago

@yix081 Thank you so much!! I think CoaT is a very cool module!!

xwjabc commented 2 years ago

@youngwanLEE Big congrats and nice work! I would also like to apologize for the long delay in releasing some parts of the CoaT code.