[New Models] Accuracy of ImageNet pretraining goes to 0 when changing the network

Model/Dataset/Scheduler description

Hi there, thank you for your exceptional work! I'm trying to reproduce your results and improve on them. In particular, I built a new network with the structure of the S model and double the embedding dims, but after a few epochs of training the accuracy drops to zero.
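To make the change concrete, here is a minimal sketch of what "double the embedding dims" amounts to. The config keys (`embed_dims`, `depths`), the helper `widen`, and the numeric values are illustrative assumptions, not copied from the repository's config files.

```python
# Hypothetical baseline config for an "S"-like model.
# Keys and values are illustrative placeholders, not the repository's actual settings.
base_cfg = {"embed_dims": [64, 128, 320, 512], "depths": [2, 2, 4, 2]}


def widen(cfg, factor=2):
    """Return a copy of cfg with every embedding dimension multiplied by factor."""
    wide = dict(cfg)  # shallow copy so the baseline dict is left untouched
    wide["embed_dims"] = [d * factor for d in cfg["embed_dims"]]
    return wide


wide_cfg = widen(base_cfg)
print(wide_cfg["embed_dims"])
print(wide_cfg["depths"])  # depths are unchanged: same structure, wider stages
```

Only the stage widths change here; the depths and everything else stay as in the baseline, which is the setup in which the accuracy collapses.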

Since there might be multiple factors involved, I'd like to have a chat to clarify which direction I should take to make the network bigger.

Thank you in advance, Giovanni

Open source status

[ ] The model implementation is available
[ ] The model weights are available

Provide useful links for the implementation

No response

zcablii / LSKNet

[New Models] Accuracy of ImageNet pretraining goes to 0 when changing the network #25