Closed zxs789 closed 4 months ago
I am using the Open-Sora v1.1 config configs/opensora-v1-1/inference/sample.py with multi_resolution = "STDiT2". Is the problem that SeqParallelAttention is not used in stdit2.py?
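For context, the idea behind a sequence-parallel attention layer is that each rank keeps only a slice of the sequence and attends over the gathered keys/values, so the concatenated result equals full attention. This is a minimal single-process NumPy sketch of that equivalence, not the actual SeqParallelAttention implementation (no real distributed communication; `np.split` stands in for the sharding and an all-gather is assumed to have produced the full K/V):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Plain scaled dot-product attention.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

rng = np.random.default_rng(0)
seq, dim, world_size = 8, 4, 2
q = rng.standard_normal((seq, dim))
k = rng.standard_normal((seq, dim))
v = rng.standard_normal((seq, dim))

# Reference: full-sequence attention on one "device".
full = attention(q, k, v)

# Sequence parallelism: each rank holds a shard of the queries and
# attends over the full (all-gathered) K/V, so per-rank activation
# memory shrinks while the concatenated output is unchanged.
chunks = [attention(q_shard, k, v) for q_shard in np.split(q, world_size)]
assert np.allclose(np.concatenate(chunks), full)
```

The point is that sequence parallelism only shards the activations across ranks; the layer weights would still be resident on every GPU.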
I also reran the commands with version 1. The dual-GPU command: CUDA_VISIBLE_DEVICES=0,1 torchrun --standalone --nproc_per_node 2 scripts/inference.py configs/opensora/inference/16x512x512.py --prompt-path ./assets/texts/t2v_samples.txt --prompt 'A beautiful sunset over the city'. The single-GPU command: python3 scripts/inference.py configs/opensora/inference/16x512x512.py --prompt-path ./assets/texts/t2v_samples.txt --prompt 'A beautiful sunset over the city'.
The memory usage and inference time were: dual GPU (GPU 0): 27170 MB / 30 s; single GPU: 23298 MB / 44 s. Is this behavior normal? Why did the per-GPU memory usage increase?
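One plausible explanation for the increase is that the model weights are fully replicated on each rank, only the activations shrink with more GPUs, and each rank additionally allocates communication buffers (e.g. NCCL workspaces, all-gather staging) that a single-GPU run never needs. This toy memory model uses made-up numbers purely to illustrate that bookkeeping; it is not measured from Open-Sora:

```python
def per_gpu_mb(weights_mb, activations_mb, world_size, comm_mb_per_rank):
    # Weights stay resident on every rank; only activations are sharded;
    # communication buffers are a per-rank cost that appears only when
    # world_size > 1.
    return weights_mb + activations_mb / world_size + comm_mb_per_rank

# Hypothetical split of the footprint (illustrative numbers only).
single = per_gpu_mb(weights_mb=20000, activations_mb=3000,
                    world_size=1, comm_mb_per_rank=0)      # 23000.0
dual = per_gpu_mb(weights_mb=20000, activations_mb=3000,
                  world_size=2, comm_mb_per_rank=5000)     # 26500.0

# The dual-GPU rank uses *more* memory than the single GPU even though
# its activation share is smaller, because the per-rank overhead
# outweighs the activation savings.
assert dual > single
```

If the overhead term is larger than the activation savings, as in this sketch, per-GPU memory goes up while wall-clock time still goes down, which matches the 27170 MB / 30 s vs 23298 MB / 44 s observation qualitatively.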
This issue is stale because it has been open for 7 days with no activity.
This issue was closed because it has been inactive for 7 days since being marked as stale.
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --standalone --nproc_per_node 4 scripts/inference.py configs/opensora-v1-1/inference/sample.py --num-frames 160 --image-size 512 512
When I run inference on 4 A100 GPUs, each of the 4 GPUs uses the same amount of memory as a single-GPU run, 41 GB. Is this normal? Does multi-GPU inference reduce per-GPU memory usage?
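A likely reason the per-GPU footprint barely moves is that each rank holds a full replica of the weights, so parallelism can only reduce the activation share, and when weights dominate the footprint the per-GPU total stays nearly flat. The numbers below are a hypothetical split of the observed 41 GB, for illustration only:

```python
def per_gpu_gb(weights_gb, activations_gb, world_size):
    # Replicated weights plus a 1/world_size share of the activations.
    return weights_gb + activations_gb / world_size

w, a = 38.0, 3.0                 # hypothetical weights/activations split
one = per_gpu_gb(w, a, 1)        # 41.0
four = per_gpu_gb(w, a, 4)       # 38.75 -- weights dominate, small saving
```

So multi-GPU inference mainly buys speed; it reduces per-GPU memory noticeably only if the weights themselves are sharded (tensor parallelism / FSDP-style sharding) rather than replicated.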