Closed: LTaiQin closed this issue 1 week ago
Hi, thank you so much for your interest in our work! Sorry for the late reply; I have been busy with some urgent matters recently.
The training stages follow our baseline framework, OVD. `obj2txt_stage1` does not use extra class-agnostic proposals for the object-to-image diffusion (playing a role similar to Region-based Knowledge Distillation in OVD), while `obj2img2txt_stage2` does use them. `obj2img2txt_final` builds on `obj2img2txt_stage2` as a fine-tuning stage for our final model, with a larger loss weight and a smaller learning rate; in our experiments this makes the novel-class performance more stable. Hope this helps, thank you!
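For readers mapping this explanation back to `train.sh`, the stage differences could be sketched as a case statement. To be clear, every variable name and hyperparameter value below (`USE_PROPOSALS`, `LR`, `LOSS_WEIGHT`) is a hypothetical illustration, not taken from the actual script:

```shell
#!/bin/sh
# Hypothetical sketch of how the three stages might be configured.
# All names and values here are illustrative assumptions, not the
# real contents of train.sh.
for STAGE in obj2txt_stage1 obj2img2txt_stage2 obj2img2txt_final; do
  case "$STAGE" in
    obj2txt_stage1)
      USE_PROPOSALS=no  LR=1e-4 LOSS_WEIGHT=1.0 ;;  # no extra class-agnostic proposals
    obj2img2txt_stage2)
      USE_PROPOSALS=yes LR=1e-4 LOSS_WEIGHT=1.0 ;;  # adds class-agnostic proposals
    obj2img2txt_final)
      USE_PROPOSALS=yes LR=1e-5 LOSS_WEIGHT=2.0 ;;  # fine-tune: smaller LR, larger loss weight
  esac
  echo "stage=$STAGE proposals=$USE_PROPOSALS lr=$LR loss_weight=$LOSS_WEIGHT"
done
```

The point is only the relative pattern described in the reply: stage 2 switches the extra proposals on, and the final stage keeps them while shrinking the learning rate and raising the loss weight.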
Yours Wuyang
Thank you so much for taking the time to respond to my question! Your guidance is extremely helpful, and I appreciate your effort in providing such a clear explanation. Thanks for your support and for sharing your work with the community!
Hello, I'm very interested in your project! However, I'm unsure about the differences between the three training stages in the `train.sh` file. Could you please explain the differences between `obj2txt_stage1`, `obj2img2txt_stage2`, and `obj2img2txt_final`? Thank you for your help!