keras-team / keras-cv

Industry-strength Computer Vision workflows with Keras

Add CoAtNet #1223

Closed IMvision12 closed 1 year ago

IMvision12 commented 1 year ago

Short Description

CoAtNet is a network that combines depthwise convolution and self-attention by stacking convolution layers and attention layers vertically. CoAtNet achieves 86.0% ImageNet top-1 accuracy; when pre-trained with 13M images from ImageNet-21K, it achieves 88.56% top-1 accuracy.

Accuracy graph: [image: Capture]

Paper

@tanzhenyu Would it be worthwhile to include this model? If yes, I would like to work on this!

tanzhenyu commented 1 year ago

@IMvision12 There are a couple of models open for contribution; search for the label "contribution welcome" (for example, Swin Transformer). For CoAtNet, we need to verify the impact: as a backbone, what will this be used for?