THUDM / CogVideo

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Apache License 2.0

Multi GPU Error #435

Closed · jumbo-q closed this issue 2 days ago

jumbo-q commented 2 days ago

System Info

AWS EC2, 4× V100 GPUs

Information

Reproduction

I want to enable multiple GPUs, but it goes wrong:

3. Enable CPU offload for the model.

turn this off if you have multiple GPUs or enough GPU memory (such as an H100), and inference will take less time

and enable pipe.to("cuda")

pipe.to("cuda")

pipe.enable_sequential_cpu_offload()

pipe.vae.enable_slicing()
pipe.vae.enable_tiling()

$ python cli_demo.py --prompt "A girl riding a bike." --model_path THUDM/CogVideoX-5b --generate_type "t2v"
Loading checkpoint shards: 100%|████████████████████████████| 2/2 [00:02<00:00, 1.13s/it]
Loading pipeline components...: 100%|███████████████████████| 5/5 [00:06<00:00, 1.37s/it]
Traceback (most recent call last):
  File "/home/ubuntu/gamehub/CogVideo/inference/cli_demo.py", line 177, in <module>
    generate_video(
  File "/home/ubuntu/gamehub/CogVideo/inference/cli_demo.py", line 99, in generate_video
    pipe.to("cuda")
  File "/opt/conda/envs/cogvideo/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 396, in to
    raise ValueError(
ValueError: It seems like you have activated sequential model offloading by calling enable_sequential_cpu_offload, but are now attempting to move the pipeline to GPU. This is not compatible with offloading. Please, move your pipeline .to('cpu') or consider removing the move altogether if you use sequential offloading.

Expected behavior

Multi-GPU support

zRzRzRzRzRzRzR commented 2 days ago

If you enable multiple GPUs, you must remove enable_sequential_cpu_offload and use pipe.to("cuda") instead.
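In other words, the two calls are mutually exclusive: sequential offloading manages device placement itself, so moving the pipeline to CUDA afterwards raises the ValueError above. A minimal sketch of that rule, assuming a diffusers-style pipeline object (`configure_pipeline` is a hypothetical helper, not part of cli_demo.py):

```python
def configure_pipeline(pipe, multi_gpu: bool):
    """Pick exactly one placement strategy: full-GPU or sequential offload.

    Calling enable_sequential_cpu_offload() and then .to("cuda") triggers
    the ValueError shown above, because offloading controls device
    placement on its own.
    """
    if multi_gpu:
        # Multiple GPUs / enough GPU memory: move the whole pipeline to CUDA.
        pipe.to("cuda")
    else:
        # Single small GPU: let diffusers page weights between CPU and GPU.
        pipe.enable_sequential_cpu_offload()
    # VAE slicing/tiling reduce peak memory and are safe in either mode.
    pipe.vae.enable_slicing()
    pipe.vae.enable_tiling()
    return pipe
```

The point is that the branch chooses one strategy and never applies both, which is the change the original cli_demo.py edit needs to make.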

jumbo-q commented 2 days ago

I removed enable_sequential_cpu_offload from cli_demo.py yesterday

and added pipe.to("cuda") instead,

but it still says:

ValueError: It seems like you have activated sequential model offloading by calling enable_sequential_cpu_offload, but are now attempting to move the pipeline to GPU. This is not compatible with offloading. Please, move your pipeline .to('cpu') or consider removing the move altogether if you use sequential offloading.

Is there any other .py file that still uses enable_sequential_cpu_offload?
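One way to answer that is to scan the checkout for remaining call sites. A small sketch (`find_offload_callsites` is a hypothetical helper; the root path is whatever directory the repo was cloned into):

```python
from pathlib import Path


def find_offload_callsites(root):
    """Return (file, line number, line) for every line in a .py file under
    `root` that mentions enable_sequential_cpu_offload."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        lines = path.read_text(errors="ignore").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if "enable_sequential_cpu_offload" in line:
                hits.append((str(path), lineno, line.strip()))
    return hits


# Example usage: find_offload_callsites("CogVideo") and print each hit,
# then check whether any file other than cli_demo.py appears.
```

If the list is empty after the edit, the error must come from a stale copy of cli_demo.py being run, not from another script.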