mit-han-lab / vila-u

VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
MIT License

Request for Video Generation Inference Script #1

Status: Open. ltzheng opened this issue 1 day ago

ltzheng commented 1 day ago

Thanks for the great work! Do you have any inference script available for video generation? If not, are there plans to release it in the future?

ltzheng commented 1 day ago

Also, process_image_video is missing from vila_u/data/dataset.py.

zhuoyang20 commented 1 day ago

Hi @ltzheng,

Thank you for your interest! I've added video generation inference. Here's an example command:

CUDA_VISIBLE_DEVICES=0 python inference.py --model_path path/to/your_downloaded_model --prompt "Fireworks in the air." --video_generation True --save_path path/to/save_videos

The process_image_video issue is also fixed. Please feel free to reach out if you have any further questions.

Best, Zhuoyang
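
For anyone who wants to run the command above over several prompts, here is a minimal Python sketch. It uses only the flags shown in that command (--model_path, --prompt, --video_generation, --save_path) and assumes it is launched from the directory that contains inference.py. The helper names build_command and generate_videos, the dry_run switch, and the per-prompt prompt_000, prompt_001, ... subdirectories are hypothetical choices made for this example, not part of the repository. How inference.py names the files it writes under --save_path is not stated in this thread, so the sketch does not rely on it.

```python
import os
import subprocess
import sys


def build_command(model_path, prompt, save_path):
    """Return the argv list for one video generation run.

    Only flags that appear in the maintainer's example command are used.
    """
    return [
        sys.executable, "inference.py",
        "--model_path", model_path,
        "--prompt", prompt,
        "--video_generation", "True",
        "--save_path", save_path,
    ]


def generate_videos(prompts, model_path, save_root, gpu="0", dry_run=True):
    """Build (and optionally run) one inference.py call per prompt.

    Each prompt gets its own subdirectory under save_root. That layout is
    an assumption of this sketch, not documented repository behavior.
    """
    # Mirror the CUDA_VISIBLE_DEVICES=0 prefix from the example command.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    commands = []
    for index, prompt in enumerate(prompts):
        save_path = os.path.join(save_root, f"prompt_{index:03d}")
        command = build_command(model_path, prompt, save_path)
        commands.append(command)
        if not dry_run:
            os.makedirs(save_path, exist_ok=True)
            # check=True stops the batch on the first failing run.
            subprocess.run(command, env=env, check=True)
    return commands


if __name__ == "__main__":
    # Dry run by default: print the commands instead of executing them.
    for command in generate_videos(
        ["Fireworks in the air."],
        model_path="path/to/your_downloaded_model",
        save_root="path/to/save_videos",
    ):
        print(" ".join(command))
```

With dry_run=True the function only returns the argument lists, which makes it easy to check the commands before spending GPU time. Pass dry_run=False to actually launch the runs.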