Different text prompts generate almost the same video generation result.

wenyuqing / panacea

[CVPR2024] Official Repository of Paper "Panacea: Panoramic and Controllable Video Generation for Autonomous Driving"

https://panacea-ad.github.io/

Apache License 2.0


Different text prompts generate almost the same video generation result. #23

Open Wangyupei opened 1 week ago

Wangyupei commented 1 week ago

Thanks for your great work. However, in our experiments we tried different text prompts, following your instructions (the dataset preparation and inference code), and the generated videos are almost the same. Is there anything wrong?

wenyuqing commented 4 days ago

Hi, sorry for the late reply. The current inference code uses the GT (ground-truth) image as the conditional frame and generates the subsequent video frames from it. Because those frames are highly correlated with the conditional frame, changing the text prompt has little effect on the attributes the text is meant to control.
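To check how much the prompt actually changes the output, one option is to compare two generated clips frame by frame. The snippet below is only a minimal sketch and is not part of the Panacea code: `per_frame_difference` is a hypothetical helper, the arrays are synthetic stand-ins for two videos generated with different prompts from the same conditional frame, and the `(T, H, W, C)` layout is an assumption.

```python
import numpy as np


def per_frame_difference(video_a, video_b):
    """Mean absolute pixel difference per frame.

    Both inputs are assumed to be arrays shaped (T, H, W, C).
    This helper is hypothetical and not part of the Panacea repository.
    """
    a = np.asarray(video_a, dtype=np.float64)
    b = np.asarray(video_b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # Average over height, width and channels, keeping the time axis.
    return np.abs(a - b).mean(axis=(1, 2, 3))


# Synthetic stand-ins for two clips that share the same conditional frame.
rng = np.random.default_rng(0)
cond_frame = rng.integers(0, 200, size=(1, 8, 8, 3)).astype(np.float64)
video_prompt_a = np.repeat(cond_frame, 4, axis=0)

# Pretend the second prompt shifts every later frame by a small constant.
video_prompt_b = video_prompt_a.copy()
video_prompt_b[1:] += 2.0

diff = per_frame_difference(video_prompt_a, video_prompt_b)
print(diff)  # frame 0 is the shared conditional frame, so its difference is 0
```

With real outputs, a difference that stays close to zero on every frame would be consistent with the explanation above, namely that the conditional frame dominates the result.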