open-mmlab / mmsegmentation

OpenMMLab Semantic Segmentation Toolbox and Benchmark.
https://mmsegmentation.readthedocs.io/en/main/
Apache License 2.0

Add HRViT #1730

Open lorinczszabolcs opened 2 years ago

lorinczszabolcs commented 2 years ago

Describe the feature

Add the model described in "Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation" which is a new vision transformer backbone design for semantic segmentation. It has a multi-branch high-resolution (HR) architecture with enhanced multi-scale representability, surpassing state-of-the-art MiT and CSWin backbones with an average of +1.78 mIoU improvement, 28% parameter saving, and 21% FLOPs reduction on ADE20K and Cityscapes.

Motivation

HRViT is a recent model that combines the features of HRNet and ViT, achieving good performance while reducing parameters and FLOPs.

Related resources

Official code can be found here.

Additional context

Their implementation already uses mmseg and mmcv, so it should be quite straightforward to add support for it.

MengzhangLI commented 2 years ago

Hi, thanks for your issue. It is really great work. We have noticed it, but due to a lack of developers we do not have a clear schedule for supporting it.

If you are willing to support it, a PR is always welcome, and we will review it as soon as possible, because PRs from the community are a high priority for our repo.

Best,

lorinczszabolcs commented 2 years ago

Sure, I will try in the coming days. I will probably need some help along the way, but hopefully we will manage to do it :).

Best, Szabi

MengzhangLI commented 2 years ago

OK, feel free to contact us if you run into any problems with your PR.

Best,

lorinczszabolcs commented 2 years ago

Hi,

I have a first version of the implementation, but I am unsure whether it works or how to test it properly. What kind of tests should be written? Should I create a pull request and continue the discussion there? Do I need to reference this issue in the pull request somehow? Thanks for all the help, I am looking forward to your feedback.

Best regards, Szabi

xiexinch commented 2 years ago

Hi @lorinczszabolcs, sorry for the late reply. Once you have finished your draft version, you can create a pull request to this repository and link this issue; we'll review it ASAP. You can test your code by loading the weights provided by the authors; in general, the keys of the weights may need to be converted first. Then you can run an evaluation: if the results match those reported in the paper, your code is correct.
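
For reference, here is a minimal sketch of the key-conversion step mentioned above. It assumes the checkpoint is a flat mapping from parameter names to tensors and that the differences are simple prefix renames. The prefixes in `KEY_PREFIX_MAP` and the helper names are hypothetical, chosen only for illustration; the real mapping depends on how the official HRViT code names its modules.

```python
from collections import OrderedDict

# Hypothetical prefix mapping (illustration only): these names are NOT taken
# from the HRViT repository or from mmsegmentation. Rules are applied in order.
KEY_PREFIX_MAP = {
    "module.": "",            # e.g. strip a DataParallel-style wrapper prefix
    "encoder.": "backbone.",  # e.g. rename the backbone prefix
}


def convert_keys(state_dict, prefix_map=KEY_PREFIX_MAP):
    """Return a new mapping with key prefixes renamed; values are untouched."""
    converted = OrderedDict()
    for key, value in state_dict.items():
        new_key = key
        for old, new in prefix_map.items():
            if new_key.startswith(old):
                new_key = new + new_key[len(old):]
        if new_key in converted:
            # Two source keys collapsing into one would silently drop weights.
            raise KeyError(f"duplicate key after conversion: {new_key}")
        converted[new_key] = value
    return converted


def report_mismatch(converted, model_keys):
    """List keys the model expects but lacks, and keys it does not know."""
    converted_keys = set(converted)
    expected_keys = set(model_keys)
    missing = sorted(expected_keys - converted_keys)
    unexpected = sorted(converted_keys - expected_keys)
    return missing, unexpected
```

With PyTorch, the input would typically come from `torch.load(path, map_location="cpu")` and `model_keys` from `model.state_dict().keys()`. If both lists returned by `report_mismatch` are empty, `model.load_state_dict(converted, strict=True)` should not raise a key mismatch error, and the evaluation can then be compared against the numbers in the paper.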

lorinczszabolcs commented 2 years ago

Hi @xiexinch !

Ok, I will create a pull request soon.

Unfortunately, the authors have not provided pretrained weights yet. Would you have the resources to train the networks from scratch and evaluate them that way, and perhaps also provide pretrained weights afterwards?