microsoft / Cream

This is a collection of our NAS and Vision Transformer work.
MIT License

The training detail of AutoFormer #152

Open mxjecho opened 1 year ago

mxjecho commented 1 year ago

Hi, I just tested the accuracy of the largest model sampled from the supernet-T you provided, and it is only 67.188. This seems unreasonable, since the largest model usually has the highest accuracy. I also trained a supernet myself using the script you provided, and its largest model reaches an accuracy of 76.3.

I'm confused about why the supernet you provided behaves this way. Also, how can I determine whether training has converged?

Thanks