haoyi-duan / DG-SCT

NeurIPS'2023 official implementation code

About trainable and frozen parameters in AVQA #5

Closed kaiw7 closed 9 months ago

kaiw7 commented 9 months ago

Hi, I am wondering which parameters in AVQA are trainable and which are frozen. Are only the parameters of the pre-trained ViTs frozen, while the others, including your proposed prompts, the grounding modules, and the text encoder, are trainable? Many thanks.

haoyi-duan commented 9 months ago

The pretrained ViTs and text encoders are frozen. Everything else is trainable: the DG-SCT modules and the downstream task models, such as the grounding modules of the AVQA task that you mentioned.
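For readers unfamiliar with how such a split is usually set up in PyTorch, here is a minimal sketch. It is not taken from the DG-SCT repository: the class `ToyAVQA` and the attribute names `vit`, `dg_sct`, and `grounding` are hypothetical stand-ins, and each is a tiny linear layer rather than a real model. Only the ViT is shown as frozen; the text encoder is left out of the sketch.

```python
import torch
from torch import nn


class ToyAVQA(nn.Module):
    """Hypothetical stand-in model; names do not come from the DG-SCT code."""

    def __init__(self):
        super().__init__()
        self.vit = nn.Linear(8, 8)        # stands in for a pretrained ViT backbone
        self.dg_sct = nn.Linear(8, 8)     # stands in for the DG-SCT modules
        self.grounding = nn.Linear(8, 8)  # stands in for the AVQA grounding modules

    def forward(self, x):
        return self.grounding(self.dg_sct(self.vit(x)))


model = ToyAVQA()

# Freeze the backbone: its parameters no longer receive gradients.
model.vit.requires_grad_(False)

# Hand only the trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-4)

# Listing the names is a quick way to check which parts are frozen.
trainable_names = [n for n, p in model.named_parameters() if p.requires_grad]
frozen_names = [n for n, p in model.named_parameters() if not p.requires_grad]
print("trainable:", trainable_names)
print("frozen:", frozen_names)
```

Printing the parameter names grouped by `requires_grad` on the actual model is also a direct way to answer this kind of question for any checkpoint.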

kaiw7 commented 9 months ago

Thank you very much for your reply. The AVQA task has several downstream modules, such as the grounding modules and the question encoder. Are all of these downstream modules trainable, with only the grounding modules initialized from pre-trained weights and the others trained from scratch? I ask because I noticed that the text encoder (for the question) is frozen in Figure 3(c) of LAVISH.

haoyi-duan commented 9 months ago

"Whether all modules of downstream AVQA are trainable, where only grounding modules are initialized with pre-trained weights and others are trained from scratch?" The answer is yes. "Because I noticed the text encoder (for question) is frozen in the figure 3-(c) of LAVISH." Well, I don't think so. The question text encoders in LAVisH and our method are the same, and both are trained from scratch. Could you tell me which part of the paper makes you think that the question text encoder is frozen?