yingkaisha / keras-vision-transformer

The Tensorflow, Keras implementation of Swin-Transformer and Swin-UNET
MIT License
116 stars 40 forks

swin transformer depth #2

Open atgc1984 opened 2 years ago

atgc1984 commented 2 years ago

Hello, thank you for the MNIST classification example. The demo seems to have only one downsample stage, which is enough for that case. How can I configure a larger model with more downsample stages? Should I just add more patch_extract/patch_embedding layers and block loops with a smaller patch size?

Thank you.

STU-ECHO commented 2 years ago

Hi, have you solved the problem you raised? I have the same question. Could you give me some advice?
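
For reference, the shape arithmetic behind this question can be sketched in plain Python. This is only an illustration, not code from the repository: it assumes the standard Swin design, in which each additional downsample stage merges 2x2 neighboring patches (halving the token grid and doubling the channel width) rather than re-extracting patches at a smaller size. The function name `swin_stage_shapes`, the 2x2 patch size, and the embedding width of 64 are hypothetical values chosen for the example.

```python
def swin_stage_shapes(input_size, patch_size, embed_dim, num_stages):
    """Return a list of (grid_h, grid_w, channels), one entry per stage.

    Assumption: every stage after the first applies a 2x2 patch merging,
    which halves each grid dimension and doubles the channel count.
    """
    height, width = input_size
    if height % patch_size[0] or width % patch_size[1]:
        raise ValueError("input size must be divisible by the patch size")

    # Stage 1: patch extraction + embedding.
    grid_h, grid_w = height // patch_size[0], width // patch_size[1]
    channels = embed_dim
    shapes = [(grid_h, grid_w, channels)]

    # Stages 2..N: one 2x2 patch merging step before each stage.
    for _ in range(num_stages - 1):
        if grid_h % 2 or grid_w % 2:
            raise ValueError(
                f"cannot merge 2x2 patches on an odd {grid_h}x{grid_w} grid"
            )
        grid_h, grid_w, channels = grid_h // 2, grid_w // 2, channels * 2
        shapes.append((grid_h, grid_w, channels))
    return shapes


# 28x28 input (MNIST-sized) with 2x2 patches and two stages.
print(swin_stage_shapes((28, 28), (2, 2), 64, 2))
# -> [(14, 14, 64), (7, 7, 128)]

# 32x32 input with 2x2 patches and four stages.
print(swin_stage_shapes((32, 32), (2, 2), 64, 4))
# -> [(16, 16, 64), (8, 8, 128), (4, 4, 256), (2, 2, 512)]
```

Under these assumptions, a 28x28 input with 2x2 patches supports only one merging step, because the resulting 7x7 grid is odd; requesting a third stage raises a `ValueError`. Going deeper would need a larger or padded input whose token grid stays even at every stage.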