boschresearch / ALDM

Official implementation of "Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive" (ICLR 2024)
https://yumengli007.github.io/ALDM/
GNU Affero General Public License v3.0

Are captions needed for a new dataset? #9

Closed yassine9666 closed 2 months ago

yassine9666 commented 4 months ago

I want to train the model on another urban dataset. One folder should contain the segmentation masks and the other the images, right? Also, do I need the caption.json for my new dataset? Thank you :)

YumengLi007 commented 4 months ago

Hi @yassine9666 , you would need the captions for training. You could use a vision-language captioning model, e.g., BLIP or LLaVA, to generate the captions :)
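As a rough sketch of the suggestion above, here is how one might generate a caption.json for a custom dataset with BLIP via Hugging Face `transformers`. The file layout (a flat mapping from image filename to caption string) and the directory path are assumptions for illustration; check ALDM's own caption.json to match its exact format before training.

```python
# Sketch (assumed format): build a caption.json mapping image filenames to
# captions, using BLIP as suggested in the reply above. Verify the key/value
# layout against ALDM's provided caption.json before using it for training.
import json
from pathlib import Path


def write_caption_json(captions, out_path):
    """Save a {image_filename: caption} mapping as JSON."""
    with open(out_path, "w") as f:
        json.dump(captions, f, indent=2)


def caption_images(image_dir):
    """Caption every PNG in image_dir with BLIP (base captioning checkpoint)."""
    # Imported lazily so the JSON helper above works without the heavy deps.
    from PIL import Image
    from transformers import BlipProcessor, BlipForConditionalGeneration

    processor = BlipProcessor.from_pretrained(
        "Salesforce/blip-image-captioning-base"
    )
    model = BlipForConditionalGeneration.from_pretrained(
        "Salesforce/blip-image-captioning-base"
    )

    captions = {}
    for img_path in sorted(Path(image_dir).glob("*.png")):
        image = Image.open(img_path).convert("RGB")
        inputs = processor(image, return_tensors="pt")
        out = model.generate(**inputs, max_new_tokens=30)
        captions[img_path.name] = processor.decode(out[0], skip_special_tokens=True)
    return captions


if __name__ == "__main__":
    # "data/images" is a placeholder path; point it at your image folder.
    write_caption_json(caption_images("data/images"), "data/caption.json")
```

LLaVA would work the same way, just with a different processor/model pair and a prompt asking for a short scene description.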