Questions about training the supernet

mit-han-lab / once-for-all

[ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment

https://ofa.mit.edu/

MIT License


Open zhiheng-ldj opened 3 years ago

zhiheng-ldj commented 3 years ago

Hi,

Thanks for taking the time to look at this issue.

I have some questions about the OFA supernet training phase.

Will the supernet's performance always surpass that of the original model?
How should we modify the hyperparameter settings (LR, optimizer type) from those used for the original model's task?
Is the supernet's performance the ceiling (upper bound) on the performance of its subnets?

Thanks for your help and happy Chinese New Year!