hpcaitech / Open-Sora

Open-Sora: Democratizing Efficient Video Production for All
https://hpcaitech.github.io/Open-Sora/
Apache License 2.0
20.1k stars 1.9k forks

Use Case Example with 2x 3090ti - Great job on this devs! #563

Open Omarch47 opened 3 days ago

Omarch47 commented 3 days ago

I wanted to share a quick demonstration of a home use case for this that I made: https://youtu.be/tiQpw50lKjU

Great job, and thanks very much for releasing this for local use! I did a quick bit of testing on a more consumer-spec machine and figured I would share for those of us who are interested in running this but only have one or two 24 GB cards available. I am very excited about the future of this technology and am glad to have a way to generate videos, even short ones, without having to sign up for a paid subscription service online.

deeplearning666 commented 3 days ago

Hello, I am also using 2x 3090s. Do you run inference with torchrun, with gradio, or directly via "scripts/inference.py"? I always either run out of memory or can only use one GPU. Can you give me some advice? I'm looking forward to hearing from you!

Omarch47 commented 3 days ago

> Hello, I am also using 2x 3090s. Do you run inference with torchrun, with gradio, or directly via "scripts/inference.py"? I always either run out of memory or can only use one GPU. Can you give me some advice? I'm looking forward to hearing from you!

I am using torchrun, without apex or flash-attn. I first tried the single-process inference command:

python scripts/inference.py configs/opensora-v1-2/inference/sample.py \
  --num-frames 4s --resolution 240p \
  --layernorm-kernel False --flash-attn False \
  --prompt "a beautiful waterfall"

And could not get it to work. I also could not get gradio to work: it reported out of memory when starting the gradio app. Since "sequence parallelism is not supported for gradio deployment", I did not troubleshoot the gradio app any further, because I wanted to use both cards.
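For anyone wondering what "sequence parallelism" buys here: the idea is to split the sequence of latent frames across the GPUs so each card only holds its own shard, which is why it needs torchrun with two processes rather than a single gradio process. A toy sketch of the splitting idea (a hypothetical helper for illustration, not Open-Sora's actual code):

```python
# Toy illustration of sequence parallelism: split a sequence of latent
# frames evenly across ranks so each GPU only holds its shard.
# `shard_sequence` is a hypothetical helper, not Open-Sora's implementation.
def shard_sequence(frames, world_size, rank):
    """Return the contiguous slice of `frames` owned by `rank`."""
    per_rank = (len(frames) + world_size - 1) // world_size  # ceil division
    start = rank * per_rank
    return frames[start:start + per_rank]

frames = list(range(8))                  # e.g. 8 latent frames
print(shard_sequence(frames, 2, 0))      # GPU 0's shard: [0, 1, 2, 3]
print(shard_sequence(frames, 2, 1))      # GPU 1's shard: [4, 5, 6, 7]
```

Each rank then only needs memory for its half of the sequence, which is what lets two 24 GB cards handle a workload that OOMs on one.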

The command I ended up successfully using is:

CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node 2 scripts/inference.py configs/opensora-v1-2/inference/sample.py \
  --num-frames 4s --resolution 240p --aspect-ratio 1:1 \
  --layernorm-kernel False --flash-attn False \
  --prompt "create a video of a car driving down a road."

Keep in mind I did not have apex or flash-attn installed, so I disabled them with --layernorm-kernel False --flash-attn False

I tried going above 240p and ran out of memory. I also kept the aspect ratio at 1:1 because it worked for my case and I didn't want to risk breaking anything. I am far from an expert, but if you still have issues, paste the error message and I will take a look to see if it is anything I ran into myself.
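A rough back-of-envelope on why the jump above 240p hits the 24 GB limit so quickly (this assumes per-frame activation memory scales roughly with pixel count, which is a simplification; attention cost can grow even faster):

```python
# Rough estimate: how much more per-frame memory a higher resolution needs.
# Assumes memory scales roughly with pixel count at a fixed aspect ratio;
# this is a simplification, not a measurement from Open-Sora.
def pixel_ratio(res_from, res_to):
    """How many times more pixels per frame res_to has vs res_from."""
    return (res_to / res_from) ** 2

print(pixel_ratio(240, 480))  # 4.0  -> roughly 4x the footprint of 240p
print(pixel_ratio(240, 360))  # 2.25
```

So even with the work split across two cards, 480p asks for on the order of 4x the activation memory of a 240p run, which matches the OOMs I saw when going higher.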