Closed: yangapku closed this issue 4 years ago
Hi. I have noticed that there are several config files in `cfg/pretrain`, `cfg/vqa` and `cfg/refcoco` (for example, there are 3 base-model configs in `cfg/pretrain`: `base_e2e_16x16G_fp16.yaml`, `base_prec_4x16G_fp32.yaml` and `base_prec_withouttextonly_4x16G_fp32.yaml`). Can you provide more details about the differences between these configs? If I want to reproduce the paper results, which of them should I use? Thank you!
The config files are named in the format `<MODEL/SETTING>_<NUM_GPUxGPU_MEM>_<fp16/32>`. `base_prec_withouttextonly_4x16G_fp32.yaml` corresponds to setting (c); see `NETWORK.PARTIAL_PRETRAIN` in the config yaml. Thanks!
Hi @jackroos, may I know why `TRAIN.GRAD_ACCUMULATE_STEPS` is missing in `cfgs/vqa/base_4x16G_fp32.yaml`? Does it default to 2 (or 4)? Thank you!
@coldmanck Actually, `TRAIN.GRAD_ACCUMULATE_STEPS` defaults to 1. In `cfgs/vqa/base_4x16G_fp32.yaml`, the total batch size is already 4*64=256 (large enough), so we don't need to accumulate gradients in this case. Thanks!
Hi @jackroos, thank you for your response! May I know how the batch size works in your code? For example, in 4*64=256 I understand that 64 comes from `config.TRAIN.BATCH_IMAGES`, but why is it multiplied by 4? Also, how does it interact with `TRAIN.GRAD_ACCUMULATE_STEPS`? I am guessing the final batch size is `GRAD_ACCUMULATE_STEPS * BATCH_IMAGES * (number of GPUs)`? (Very possibly I am wrong.)
@coldmanck Yeah! The 'actual' batch size is exactly what you guessed.
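For reference, a minimal sketch of that calculation (the helper name and signature are illustrative, not taken from the VL-BERT code):

```python
# Hypothetical helper illustrating the effective batch size discussed above.
def effective_batch_size(batch_images: int, num_gpus: int, grad_accumulate_steps: int = 1) -> int:
    """Effective batch size = BATCH_IMAGES * (number of GPUs) * GRAD_ACCUMULATE_STEPS."""
    return batch_images * num_gpus * grad_accumulate_steps

# cfgs/vqa/base_4x16G_fp32.yaml: BATCH_IMAGES=64, 4 GPUs, accumulation defaults to 1
print(effective_batch_size(batch_images=64, num_gpus=4))  # 256
```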
Thanks a lot!
@jackroos But I find that some configs do not give an actual batch size of 256 as mentioned in the paper. For example, in `cfgs/vcr/large_q2a_4x16G_fp16.yaml`, `BATCH_IMAGES` is 4 and `GRAD_ACCUMULATE_STEPS` is 4, so assuming 4 GPUs are used, the 'actual' batch size is 64. Should we modify any of these hyper-parameters to match your batch size of 256 in order to reproduce your results?
@coldmanck In VCR, the batch size is 4x larger than the batch size in the config, since for each question, there are 4 answer candidates.
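Continuing the hypothetical sketch from above, the VCR config then already works out to 256:

```python
# In VCR each question contributes 4 answer candidates, so the effective
# batch size is 4x the image-level batch size computed above.
num_answer_candidates = 4
vcr_batch = effective_batch_size(batch_images=4, num_gpus=4, grad_accumulate_steps=4)
print(vcr_batch * num_answer_candidates)  # 4 * 4 * 4 * 4 = 256
```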
I see. Thanks again 👍