OliverRensu / co-advise

MIT License

using the embed_conv instead of patch_embed #1

Closed: zhou-rui1 closed this issue 2 years ago

zhou-rui1 commented 2 years ago

https://github.com/OliverRensu/co-advise/blob/8049918d059caa90a69ce5791d906117ea2af453/models.py#L54 Hi, it seems that if embed_conv is used, the ViT model architecture itself changes, rather than only the cls tokens being distilled as presented in the paper?

OliverRensu commented 2 years ago

Hi, Table 3 reports two settings: 1) CiT-S keeps the same architecture and achieves 82.0%; 2) CiT-SAK adds token inductive bias alignment and achieves 82.7%. For CiT-S, you can replace embed_conv with a non-overlapping patch embedding.
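For readers following along, here is a rough sketch of why such a swap can leave the rest of the ViT untouched: both kinds of embedding can produce the same number of tokens. It uses only the standard convolution output-size formula. The `conv_stem` layer list is a hypothetical overlapping stem chosen for illustration, not the actual `embed_conv` from `models.py`, and the helper names `conv_out_size` and `num_tokens` are made up for this example.

```python
def conv_out_size(size, kernel, stride, padding=0):
    """Output length along one spatial axis of a 2D convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def num_tokens(image_size, layers):
    """Token count after applying (kernel, stride, padding) convs in order."""
    size = image_size
    for kernel, stride, padding in layers:
        size = conv_out_size(size, kernel, stride, padding)
    return size * size


# Standard ViT patch embedding: a single conv with kernel == stride == 16,
# so neighbouring patches never overlap.
patch_embed = [(16, 16, 0)]

# Hypothetical overlapping conv stem (an assumption, NOT the repo's
# embed_conv): four 3x3 stride-2 convs with padding 1.
conv_stem = [(3, 2, 1)] * 4

print(num_tokens(224, patch_embed))  # 14 x 14 grid -> 196
print(num_tokens(224, conv_stem))    # 224 -> 112 -> 56 -> 28 -> 14 -> 196
```

Under these assumptions both variants give 196 tokens for a 224x224 input, so only the embedding layer differs while the transformer blocks and position embeddings keep the same shapes.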

zhou-rui1 commented 2 years ago

Thanks for the reply.