mindspore-lab / mindcv

A toolbox of vision models and algorithms based on MindSpore
https://mindspore-lab.github.io/mindcv/
Apache License 2.0

extend vit and add mae model and finetune checkpoint file #707

Closed: sageyou closed this 10 months ago

sageyou commented 1 year ago


Motivation

1) Rebase onto SamitHuang's PR https://github.com/mindspore-lab/mindcv/pull/733.
2) Update the vit model.
3) Add and update the mae model (including the checkpoint file for mae finetuning).
4) Remaining task: upload the new vit checkpoint files (after this PR is merged).

Related Issues and PRs

https://github.com/mindspore-lab/mindcv/pull/733

vigo999 commented 1 year ago

@SamitHuang @geniuspatrick please review

vigo999 commented 1 year ago

Could the original vit provide the ViT encoder, instead of adding a separate vit_encoder.py? @sageyou @geniuspatrick

sageyou commented 1 year ago

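> Could the original vit provide the ViT encoder, instead of adding a separate vit_encoder.py? @sageyou @geniuspatrick

For mae it is doable, but it would be troublesome because the whole structure would have to change. For beit it is not feasible: the new vit_encoder was written precisely so that mae and beit can share one encoder, and it adds new features for beit. For example, Attention can use relative_position_bias, and the encoder can optionally use LayerScale. For details, see the issue Zhifeng filed earlier: https://github.com/mindspore-lab/mindcv/issues/693. Overall, the new encoder is compatible with the original ViT encoder. If a change is really required, I would personally suggest modifying the original vit instead. @vigo999 @geniuspatrick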
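For readers unfamiliar with the two options named above, here is a minimal NumPy sketch of an encoder block in which both are optional. It is not MindCV code: the class names `Attention` and `Block`, the `init_values` argument, the random weights, and the use of ReLU in place of GELU are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def layer_norm(x, eps=1e-6):
    # LayerNorm without learnable affine parameters, to keep the sketch short.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


class Attention:
    """Multi-head self-attention with an optional additive position bias (hypothetical)."""

    def __init__(self, dim, num_heads, rng):
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.w_qkv = rng.normal(0.0, 0.02, (dim, 3 * dim))
        self.w_out = rng.normal(0.0, 0.02, (dim, dim))

    def __call__(self, x, rel_pos_bias=None):
        n, d = x.shape
        qkv = (x @ self.w_qkv).reshape(n, 3, self.num_heads, self.head_dim)
        # Each of q, k, v has shape (heads, tokens, head_dim).
        q, k, v = (qkv[:, i].transpose(1, 0, 2) for i in range(3))
        scores = q @ k.transpose(0, 2, 1) / np.sqrt(self.head_dim)
        if rel_pos_bias is not None:
            # Bias of shape (heads, tokens, tokens), added before the softmax.
            scores = scores + rel_pos_bias
        out = softmax(scores) @ v
        return out.transpose(1, 0, 2).reshape(n, d) @ self.w_out


class Block:
    """Pre-norm encoder block; LayerScale is enabled when init_values is given."""

    def __init__(self, dim, num_heads, rng, init_values=None):
        self.attn = Attention(dim, num_heads, rng)
        self.w1 = rng.normal(0.0, 0.02, (dim, 4 * dim))
        self.w2 = rng.normal(0.0, 0.02, (4 * dim, dim))
        # LayerScale: one per-channel scale for each residual branch.
        self.gamma1 = None if init_values is None else np.full(dim, init_values)
        self.gamma2 = None if init_values is None else np.full(dim, init_values)

    def __call__(self, x, rel_pos_bias=None):
        a = self.attn(layer_norm(x), rel_pos_bias)
        if self.gamma1 is not None:
            a = self.gamma1 * a
        x = x + a
        # ReLU stands in for GELU here (an assumption for brevity).
        m = np.maximum(layer_norm(x) @ self.w1, 0.0) @ self.w2
        if self.gamma2 is not None:
            m = self.gamma2 * m
        return x + m


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens = rng.normal(size=(5, 8))              # 5 tokens, embedding dim 8
    vit_style = Block(8, 2, rng)                  # both options off
    beit_style = Block(8, 2, rng, init_values=0.1)
    bias = rng.normal(size=(2, 5, 5))             # one bias matrix per head
    print(vit_style(tokens).shape, beit_style(tokens, rel_pos_bias=bias).shape)
```

With both options left at their defaults the block reduces to a plain ViT-style block, which is the sense in which one shared encoder can serve the original vit as well as mae and beit.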

SamitHuang commented 1 year ago

The ViT encoder has been added to the original vit. Please review that PR and rebase onto it: https://github.com/mindspore-lab/mindcv/pull/733

sageyou commented 1 year ago

review plz, @SamitHuang @geniuspatrick

geniuspatrick commented 1 year ago

LGTM. BTW, have the new weights been verified?

sageyou commented 1 year ago

> LGTM. BTW, have the new weights been verified?

@geniuspatrick Yes, I have validated all of them, including vit and mae. However, the vit ckpt files have not been uploaded yet, so the pretrained vit URLs in this PR are not the latest. I will do that next~