Open tileb1 opened 3 years ago
Good question! You can find the args we used for each run in the released full checkpoints, by loading each checkpoint and checking the key `args`.
In general, we tuned very little to produce the reported results, so the hyper-parameter settings are similar across configurations. For example, here is one typical hyper-parameter setting, obtained by loading the released checkpoint of EsViT (Swin-T, W=7) and printing the dictionary item `args`:
Namespace(arch='swin_tiny', batch_size_per_gpu=32, cfg='experiments/imagenet/swin/swin_tiny_patch4_window7_224.yaml', clip_grad=3.0, data_path='/msrhyper-weka/public/penzhan/oscar/phillytools/data/sasa/imagenet/2012', dist_url='env://', epochs=300, freeze_last_layer=1, global_crops_scale=(0.4, 1.0), gpu=0, local_crops_number=8, local_crops_scale=(0.05, 0.4), local_rank=0, lr=0.0005, min_lr=1e-06, momentum_teacher=0.996, norm_last_layer=False, num_workers=10, optimizer='adamw', opts=[], out_dim=65536, output_dir='/mnt/output_storage/dino_exp/swin//swin_tiny/bl_lr0.0005_gpu16_bs32_dense_multicrop_epoch300', patch_size=16, rank=0, saveckp_freq=20, seed=0, teacher_temp=0.07, use_bn_in_head=False, use_dense_prediction=True, use_fp16=True, warmup_epochs=10, warmup_teacher_temp=0.04, warmup_teacher_temp_epochs=30, weight_decay=0.04, weight_decay_end=0.4, world_size=16, zip_mode=True)
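For anyone who wants to reproduce this, here is a minimal sketch of reading the args from a full checkpoint and comparing two runs. It assumes the checkpoint is a dictionary with an `args` key holding an `argparse.Namespace`, as the printout above suggests. The helper names `load_args`, `diff_args` and `args_to_json` and the file names in the usage note are hypothetical, not part of the EsViT repo.

```python
import json
from argparse import Namespace


def load_args(ckpt_path):
    """Return the training args stored in a full checkpoint.

    Assumes the checkpoint is a dict with an 'args' key (per the reply above).
    """
    import torch  # imported lazily so the helpers below work without PyTorch

    # weights_only=False is needed on recent PyTorch releases to unpickle the
    # Namespace; it runs arbitrary pickle code, so only use it on trusted files.
    ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
    return ckpt["args"]


def diff_args(a, b):
    """Return {name: (value_in_a, value_in_b)} for every setting that differs."""
    da, db = vars(a), vars(b)
    return {
        key: (da.get(key), db.get(key))
        for key in sorted(set(da) | set(db))
        if da.get(key) != db.get(key)
    }


def args_to_json(args):
    """Serialize the args to JSON text so they can be shared without the weights."""
    # default=str is a fallback for values json cannot encode natively.
    return json.dumps(vars(args), indent=2, sort_keys=True, default=str)
```

Usage would look roughly like `diff_args(load_args("run_a.pth"), load_args("run_b.pth"))`, which shows directly whether the settings differ between two entries of the table. `args_to_json` gives a small text file that can be posted on its own when downloading the full checkpoint is impractical.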
Ah yes, didn't think of loading from the checkpoint... Thanks!
Hi, @ChunyuanLI. I have been trying to download the checkpoint to load the pre-training args, but the download speed was extremely slow and the download often failed halfway. Could you please kindly share the args in separate links?
Hello, could you please provide the args used for running `main_esvit.py` with the right arguments for each run in the table below (first table in README)? Are the args used different for each entry? Thank you!