amirbar / visual_prompting

Official implementation and data release of the paper "Visual Prompting via Image Inpainting".
https://yossigandelsman.github.io/visual_prompt

Finetune on other works/datasets #1

Closed (mrluin closed 1 year ago)

mrluin commented 1 year ago

Hello,

Thanks for your code and novel solution to visual prompt tuning!

If I want to finetune the pretrained model you provided on other tasks or datasets (assuming that I have standard training pairs), my understanding is that the hand-crafted input-label pairs should be built like this:

  1. Stitch the training pairs into 2x2 grids and use them as labels.
  2. Set the bottom-right image to zeros and use the result as input.
  3. Finetune!

Is that right? Could you please tell me how I can do it?

amirbar commented 1 year ago

You can leave the bottom-right part as zeros, or even keep it masked.
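
For readers following along, the grid construction discussed above can be sketched in a few lines of NumPy. This is only an illustration and is not taken from the repository: the function name `make_grid_pair`, the cell layout (example image and its label on the top row, query image and its label on the bottom row), and the returned boolean mask are all assumptions, and the official code may also pad or resize the cells differently.

```python
import numpy as np


def make_grid_pair(example_img, example_lbl, query_img, query_lbl):
    """Hypothetical helper: build (input, label, mask) from four H x W x C arrays.

    Assumed layout (not confirmed by the thread):
        top row:    example image | example label
        bottom row: query image   | query label
    """
    shapes = {a.shape for a in (example_img, example_lbl, query_img, query_lbl)}
    if len(shapes) != 1:
        raise ValueError("all four cells must have the same shape")

    # Step 1: stitch the four cells into a 2x2 grid; this full grid is the label.
    top = np.concatenate([example_img, example_lbl], axis=1)
    bottom = np.concatenate([query_img, query_lbl], axis=1)
    label = np.concatenate([top, bottom], axis=0)

    # Step 2: the input is the same grid with the bottom-right cell zeroed out.
    h, w = query_img.shape[:2]
    grid_input = label.copy()
    grid_input[h:, w:] = 0

    # Alternative suggested in the reply: keep the cell and mark it as masked.
    mask = np.zeros(label.shape[:2], dtype=bool)
    mask[h:, w:] = True
    return grid_input, label, mask


# Toy usage with random 8x8 RGB cells.
rng = np.random.default_rng(0)
cells = [rng.random((8, 8, 3), dtype=np.float32) + 0.1 for _ in range(4)]
grid_input, label, mask = make_grid_pair(*cells)
```

With the zeroing option, `grid_input` is fed to the model and `label` is the reconstruction target. With the masking option, `label` and `mask` would be passed together, assuming the training loop accepts an explicit mask.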