Closed: xywlpo closed this issue 1 year ago.
hi, your work is amazing. If I want to fine-tune on my own datasets, how should I set the training parameters, such as the number of epochs, learning rate, and so on? Thanks for your help!

Hi @xywlpo,
Thanks for your interest. We fine-tune our model for 300 epochs, using the AdamW optimizer, batch size 256, learning rate 1e-3, and weight decay 1e-8 for downstream classification datasets. The remaining hyperparameters are the same as in the ImageNet setting. Please try it.
Best, Xinyu
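For anyone landing here later, below is a minimal PyTorch sketch of the fine-tuning setup described in the reply, under the assumption of a standard classification training loop. Only the numbers (300 epochs, batch size 256, lr 1e-3, weight decay 1e-8, AdamW) come from the reply; the model, dataset, and everything else are placeholders, and the remaining ImageNet-recipe details (schedule, augmentation, etc.) are not shown.

```python
# Hedged sketch of the fine-tuning recipe from the reply above.
# Model and dataset are placeholders; only the optimizer choice and the
# numeric hyperparameters are taken from the maintainer's comment.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader


def finetune(model: nn.Module, train_set, device: str = "cuda"):
    # Hyperparameters quoted in the reply.
    epochs, batch_size, lr, weight_decay = 300, 256, 1e-3, 1e-8

    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True,
                        num_workers=8, pin_memory=True, drop_last=True)
    model = model.to(device)

    optimizer = torch.optim.AdamW(model.parameters(), lr=lr,
                                  weight_decay=weight_decay)
    criterion = nn.CrossEntropyLoss()

    for epoch in range(epochs):
        model.train()
        for images, targets in loader:
            images, targets = images.to(device), targets.to(device)
            loss = criterion(model(images), targets)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        # Any LR schedule, augmentation, or EMA would follow the ImageNet
        # setting mentioned in the reply (omitted in this sketch).
        print(f"epoch {epoch + 1}/{epochs}  last-batch loss {loss.item():.4f}")
```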