haoningwu3639 / StoryGen

[CVPR 2024] Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models
https://haoningwu3639.github.io/StoryGen_Webpage/
MIT License
207 stars 11 forks

Stage's option #41

Open 0Tzero opened 1 month ago

0Tzero commented 1 month ago

Hello! Thanks for your code and checkpoint! Could you please explain what "multi-image-condition" and "auto-regressive" mean in the context of the model?

haoningwu3639 commented 3 weeks ago

Please refer to the model architecture figure in our paper and to the appendix, where we provide a second model architecture used for the experiments on the MS-COCO dataset. The multi-image-condition and auto-regressive settings correspond to these two experimental setups, respectively.