jy0205 / Pyramid-Flow

Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
https://pyramid-flow.github.io/
MIT License

`extract_vae_latent.sh` CUDA out of memory #167

Open JWargrave opened 2 weeks ago

JWargrave commented 2 weeks ago

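I ran the following script to extract the VAE latents at 768p resolution (WIDTH=1280, HEIGHT=768). Even after reducing NUM_FRAMES to 17, I still get CUDA out of memory. How can I fix this?

I am using 8 H800 GPUs with 80 GB of memory each.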

#!/bin/bash

# This script batch-extracts the VAE latents for video generation training.
# Because video latent extraction is very slow, pre-extracting the video VAE latents saves training time.

GPUS=8  # The gpu number
MODEL_NAME=pyramid_flux     # The model name, `pyramid_flux` or `pyramid_mmdit`
VAE_MODEL_PATH=pretrained_weights/pyramid-flow-miniflux/causal_video_vae  # The VAE CKPT dir.
ANNO_FILE=path   # The video annotation file path
WIDTH=1280
HEIGHT=768
NUM_FRAMES=17

torchrun --nproc_per_node $GPUS \
    tools/extract_video_vae_latents.py \
    --batch_size 1 \
    --model_dtype bf16 \
    --model_path $VAE_MODEL_PATH \
    --anno_file $ANNO_FILE \
    --width $WIDTH \
    --height $HEIGHT \
    --num_frames $NUM_FRAMES

jy0205 commented 1 week ago

Have you solved the problem? 80 GB of GPU memory should not run out here. Could your window size be set too large?

JWargrave commented 1 week ago

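> Have you solved the problem? 80 GB of GPU memory should not run out here. Could your window size be set too large?

Yes, it runs after I reduced the window size. Does the window size affect the final extracted latents, or does it only affect speed?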

yuchen1984 commented 1 week ago

Add the `--save_memory` flag to enable VAE tiling (`vae_tiling`). With it, extraction works with less than 8 GB of VRAM.
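
For reference, here is a minimal sketch of the script from the original post with `--save_memory` appended. It assumes `tools/extract_video_vae_latents.py` accepts the flag as described in the comment above. The `EXTRA_ARGS` variable, the `run` helper, and the `DRY_RUN` switch are hypothetical additions for illustration and are not part of the repository's script. By default the sketch only prints the command, so it can be checked before launching on all GPUs.

```shell
#!/bin/bash

# Same settings as the script in the original post.
GPUS=8
VAE_MODEL_PATH=pretrained_weights/pyramid-flow-miniflux/causal_video_vae
ANNO_FILE=path   # placeholder kept from the original post
WIDTH=1280
HEIGHT=768
NUM_FRAMES=17

# Flag suggested in the thread; assumed to be accepted by the extraction script.
EXTRA_ARGS="--save_memory"

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# EXTRA_ARGS is left unquoted on purpose so it expands to separate arguments.
run torchrun --nproc_per_node "$GPUS" \
    tools/extract_video_vae_latents.py \
    --batch_size 1 \
    --model_dtype bf16 \
    --model_path "$VAE_MODEL_PATH" \
    --anno_file "$ANNO_FILE" \
    --width "$WIDTH" \
    --height "$HEIGHT" \
    --num_frames "$NUM_FRAMES" \
    $EXTRA_ARGS
```

Running the sketch with `DRY_RUN=0` in the environment launches the extraction instead of printing the command.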