jameslahm opened 3 months ago
Hi, OMG, you are right. I have checked the code and checkpoint again: `use_layer_scale` is not passed to the model, which means all of your hyper-parameters are the same as mine, yet there is still a 0.3% accuracy gap. Can I ask what global batch size you used when running the code? Mine is 64x16. The learning rate is scaled according to the global batch size, so a different batch size may hurt the result.
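For reference, linear learning-rate scaling by global batch size is usually computed as below. This is a minimal sketch; the function name, the base learning rate of 5e-4, and the reference batch size of 512 are illustrative assumptions, not values taken from the STViT-R code. Notably, the two setups discussed in this thread (64x16 and 128x8) produce the same global batch size of 1024, so the scaled learning rate would be identical in both cases:

```python
def scale_lr(base_lr: float, batch_size_per_gpu: int, num_gpus: int,
             base_batch: int = 512) -> float:
    """Linearly scale the learning rate with the global batch size."""
    global_batch = batch_size_per_gpu * num_gpus
    return base_lr * global_batch / base_batch

# Both configurations reach a global batch size of 1024:
lr_a = scale_lr(5e-4, 64, 16)   # 64 x 16 = 1024
lr_b = scale_lr(5e-4, 128, 8)   # 128 x 8 = 1024
print(lr_a, lr_b)               # identical scaled learning rates
```

If both runs really used these batch sizes, the effective learning rates match, so the 0.3% gap would have to come from something else (e.g. the differences visible in the training logs).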
Thanks for your reply! My batch size is 128x8, following the default training command:

```shell
python -m torch.distributed.launch --nproc_per_node 8 --master_port 12345 main.py \
--cfg configs/swin_small_patch4_window7_224.yaml --data-path <imagenet-path> --batch-size 128
```
I have just updated the log files here: https://github.com/changsn/STViT-R/tree/main/log. I hope they help.
Thanks a lot! I will check the difference between the log files. BTW, would you mind giving me some guidance about the semantic segmentation task in #7? I'd appreciate it very much.
Thanks for your great work! I just noticed that `USE_LAYER_SCALE` is actually not used during training, as shown in https://github.com/changsn/STViT-R/blob/d1532e8b74a72c714669bc7201e7fee2089718c4/models/build.py#L15-L35. Therefore, the config provided for Swin-Small in #5 is by default the same as yours. Is the line `use_layer_scale=config.USE_LAYER_SCALE` missing there? Thanks.
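The failure mode being reported can be sketched as follows. The class and config names below are illustrative stand-ins, not the actual STViT-R code: when the constructor argument is omitted in `build.py`, the model silently falls back to its default regardless of what the config file says.

```python
# Illustrative sketch of the bug: a config flag that is never forwarded
# to the model constructor has no effect. Names here are hypothetical.

class DummyModel:
    def __init__(self, use_layer_scale=False):
        self.use_layer_scale = use_layer_scale

class DummyConfig:
    USE_LAYER_SCALE = True

config = DummyConfig()

# Without the argument, the config value is silently ignored:
model = DummyModel()
print(model.use_layer_scale)  # False

# With the fix discussed above, the config value takes effect:
model = DummyModel(use_layer_scale=config.USE_LAYER_SCALE)
print(model.use_layer_scale)  # True
```

This is consistent with the earlier reply: since the flag was never passed, both parties were effectively training without layer scale, so that hyper-parameter cannot explain the accuracy gap.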