chinhsuanwu / coatnet-pytorch

A PyTorch implementation of "CoAtNet: Marrying Convolution and Attention for All Data Sizes"
https://arxiv.org/abs/2106.04803
MIT License

Models do not seem to converge. #9

Open bsun0802 opened 2 years ago

bsun0802 commented 2 years ago

Hi,

I tried to train CoAtNet_0 on Tiny ImageNet from cs231n (200 classes), but the model does not seem to converge.

Could it be that the implementation is not 100% correct, for example in the positional embedding indexing part? I went through the code and I think the other components should be correct.

The positional embedding indexing is the one part I could not fully follow. Do you have a reference for how that part is implemented?
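For anyone else trying to follow that part, here is a minimal sketch of the usual 2D relative position indexing scheme, as used in Swin Transformer-style attention. I am assuming this repository follows the same idea, and I have not checked it line by line. The function name `relative_position_index` is hypothetical, and it is written in plain Python rather than with tensor broadcasting so the arithmetic is visible. Each (query, key) pixel pair on an h x w grid maps to one row of a bias table with (2h-1)*(2w-1) rows, and pairs with the same spatial offset share a row.

```python
def relative_position_index(h, w):
    """Return an (h*w) x (h*w) nested list of indices into a relative-bias
    table that has (2h-1)*(2w-1) rows (hypothetical helper, for illustration)."""
    # Flatten the grid in row-major order, as attention over h*w tokens would.
    coords = [(i, j) for i in range(h) for j in range(w)]
    index = []
    for qi, qj in coords:          # query position
        row = []
        for ki, kj in coords:      # key position
            di = qi - ki + (h - 1)  # vertical offset shifted to 0 .. 2h-2
            dj = qj - kj + (w - 1)  # horizontal offset shifted to 0 .. 2w-2
            # Combine the two offsets into a single flat table index.
            row.append(di * (2 * w - 1) + dj)
        index.append(row)
    return index


idx = relative_position_index(2, 2)
for row in idx:
    print(row)
```

A quick sanity check when comparing against a real implementation: the diagonal (zero offset) should be a single constant, every index should fall inside the table, and two pairs with the same offset should get the same index.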

kinter4 commented 2 years ago

I faced the same issue: the model does not converge no matter how I change the hyperparameters.