Thanks for your creative and valuable work. While retraining your COCO model, I found a discrepancy between the paper and the code regarding training.
According to Appendix A (Training Details), for the implementation of classifier-free guidance, caption and grounding tokens are dropped with 10% probability ("We randomly drop caption and grounding tokens with 10% probability for classifier-free guidance."). In the released code, however, the caption dropout probability is 50%. I'd like to know which setting works better for the model. Thanks.
References:
probability of grounding dropout = 0.1
ldm/modules/diffusionmodules/openaimodel.py: lines 428-429
probability of caption dropout = 0.5
tsv_dataset.py: line 306
configs/GoldG+SBU+CC3M+O365_box_text.yaml: lines 65, 72, 79, 86, 93
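For context, the dropout in question is the standard classifier-free guidance trick: with some probability, the conditioning (caption or grounding tokens) is replaced by a null input during training so the model also learns the unconditional distribution. A minimal sketch of caption dropout (not the repository's actual code; the function name and the empty-string null caption are my assumptions for illustration):

```python
import random

def drop_caption_for_cfg(caption: str, p_drop: float, rng: random.Random) -> str:
    """Hypothetical sketch: replace the caption with a null (empty) caption
    with probability p_drop, so the model learns both the conditional and
    the unconditional score for classifier-free guidance."""
    return "" if rng.random() < p_drop else caption

# Rough sanity check: with p_drop = 0.5, about half the captions are dropped.
rng = random.Random(0)
n = 10_000
dropped = sum(drop_caption_for_cfg("a photo of a cat", 0.5, rng) == "" for _ in range(n))
print(dropped / n)
```

With p_drop = 0.1 (as stated in the paper) only about 10% of captions would be nulled, which is why the 0.5 in the config files stood out.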