Closed: zhou-rui1 closed this issue 2 years ago
Hi, Table 3 reports two settings: 1) CiT-S keeps the same architecture and achieves 82.0%; 2) CiT-SAK uses token inductive bias alignment and achieves 82.7%. For CiT-S, you can replace embed_conv with a non-overlapping patch embedding.
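For reference, a minimal sketch of what a non-overlapping patch embedding usually looks like in a plain ViT/DeiT-style model. This is not code from the repository: the class name `PatchEmbed` and the defaults (224 image size, patch size 16, embedding dimension 384) are assumptions chosen for illustration, and should be matched to whatever models.py actually uses.

```python
import torch
import torch.nn as nn


class PatchEmbed(nn.Module):
    """Non-overlapping patch embedding (hypothetical stand-in for embed_conv)."""

    def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=384):
        super().__init__()
        # Assumed defaults; adjust to the model configuration actually in use.
        self.num_patches = (img_size // patch_size) ** 2
        # kernel_size == stride, so every pixel belongs to exactly one patch.
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        x = self.proj(x)                      # (B, C, H, W) -> (B, D, H/P, W/P)
        return x.flatten(2).transpose(1, 2)   # -> (B, N, D), N = num_patches


patch_embed = PatchEmbed()
tokens = patch_embed(torch.randn(2, 3, 224, 224))
print(tokens.shape)
```

Because the kernel size equals the stride, the patches do not overlap, which is what keeps the stem identical to the standard ViT one. Where exactly this module would be assigned in place of embed_conv depends on how models.py is structured, which is not shown here.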
rachel @.***> wrote on Sat, Nov 12, 2022 at 23:22:
https://github.com/OliverRensu/co-advise/blob/8049918d059caa90a69ce5791d906117ea2af453/models.py#L54 Hi, it seems that if embed_conv is used, the ViT model architecture itself changes, in addition to the distillation tokens presented in the paper?
Thanks for the reply.
https://github.com/OliverRensu/co-advise/blob/8049918d059caa90a69ce5791d906117ea2af453/models.py#L54 Hi, it seems that if embed_conv is used, the ViT model architecture itself changes, rather than the method only distilling the cls tokens as presented in the paper?