dyabel / detpro

Understand and reproduce the work #22

Closed ZhuoranYu closed 2 years ago

ZhuoranYu commented 2 years ago

Hi,

Thanks for your great work!

I'm trying to reproduce the ViLD baseline in your repo, but I'm still having some trouble understanding it. Here are my questions:

  1. In the README, the configs for ViLD and DetPro both point to the same file, detpro_ens_20e.py, and you have confirmed in other issues that this is correct. In that case, how do you determine whether the model uses the learned prompt, as in DetPro, or the manual prompt, as in vanilla ViLD? Could you please share the command for reproducing ViLD? I'd really appreciate it.

  2. Both configs use Shared4Conv1FCBBoxHead as the RoI head. However, this module does not seem to use text embeddings for training or inference. Should something like StandardRoIHeadTEXT be used instead? I'm not sure my understanding is correct, so please correct me if it is wrong.

Thanks for your time again!

dyabel commented 2 years ago

Hi, just replace the prompt path with the manually defined prompt, lvis_clip_text_embedding.pt, which is provided in the link or can be generated yourself (I have written the generation process in standard_roi_head.py). You can try this:

./tools/dist_train.sh configs/lvis/detpro_ens_20e.py 8 --work-dir workdirs/vild --cfg-options model.roi_head.prompt_path=lvis_clip_text_embedding.pt model.roi_head.load_feature=True
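
To make it clearer why the same config file can serve both ViLD and DetPro, here is a rough sketch of how dotted --cfg-options overrides are typically merged into a nested config. This is only an illustration in plain Python, not the actual mmcv or detpro code: the helper names parse_cfg_options and apply_overrides are hypothetical, and the base values ("learned_prompt.pt", False) are placeholders, not the real contents of detpro_ens_20e.py.

```python
import ast
from copy import deepcopy


def parse_cfg_options(pairs):
    """Turn ['a.b=1', 'c=True'] into {'a.b': 1, 'c': True} (hypothetical helper)."""
    overrides = {}
    for pair in pairs:
        key, _, raw = pair.partition("=")
        try:
            # Interpret Python literals such as True, 8, or 0.5.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything else (for example a file path) stays a plain string.
            value = raw
        overrides[key] = value
    return overrides


def apply_overrides(cfg, overrides):
    """Return a copy of cfg with each dotted key set to its override value."""
    merged = deepcopy(cfg)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = merged
        for name in parents:
            # Walk down the nested dicts, creating levels that do not exist yet.
            node = node.setdefault(name, {})
        node[leaf] = value
    return merged


# Placeholder base config: these values are assumptions for illustration only.
base_cfg = {
    "model": {
        "roi_head": {"prompt_path": "learned_prompt.pt", "load_feature": False}
    }
}

# The two overrides taken from the command above.
options = [
    "model.roi_head.prompt_path=lvis_clip_text_embedding.pt",
    "model.roi_head.load_feature=True",
]

vild_cfg = apply_overrides(base_cfg, parse_cfg_options(options))
print(vild_cfg["model"]["roi_head"])
```

Under these assumptions, the only difference between the two runs is which embedding file roi_head.prompt_path points at; the rest of the config is shared, and the base config itself is left unmodified.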