Vchitect / Latte

Latte: Latent Diffusion Transformer for Video Generation.
Apache License 2.0

What is the minimum GPU memory requirement? #30

Open olliacc opened 4 months ago

olliacc commented 4 months ago

What is the lowest amount of GPU video memory (VRAM) necessary to run Latte video generation effectively, for both training and inference?

maxin-cn commented 4 months ago

> What is the lowest amount of GPU video memory (VRAM) necessary to run Latte video generation effectively, for both training and inference?

Hi, thanks for your interest. Inference for a single video on an A100 requires 20916 MiB of GPU memory in fp16 precision. As for the GPU memory required for training, that depends on your batch size.
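
For reference, here is a minimal fp16 inference sketch using the diffusers `LattePipeline`, assuming a diffusers release that ships it and the `maxin-cn/Latte-1` checkpoint (the repository's own sampling scripts work as well):

```python
import torch
from diffusers import LattePipeline
from diffusers.utils import export_to_gif

# Load the Latte pipeline in half precision to reduce VRAM usage.
pipe = LattePipeline.from_pretrained(
    "maxin-cn/Latte-1", torch_dtype=torch.float16
).to("cuda")

# Optional: offload idle submodules to CPU to lower peak GPU memory
# (trades some speed for memory).
# pipe.enable_model_cpu_offload()

prompt = "A small cactus with a happy face in the Sahara desert."
frames = pipe(prompt).frames[0]  # generates one short video clip
export_to_gif(frames, "latte_sample.gif")
```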

XGGNet commented 4 months ago

@maxin-cn May I set the local batch size to 1 for training Latte on my own dataset? I've heard that a sufficiently large batch size seems to be key for training diffusion models.

Hi, you can set the batch size to 1, but I'm not sure whether this will degrade performance. You can try it first. Looking forward to your feedback later~
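
If a per-GPU batch size of 1 does hurt training, gradient accumulation is a common way to recover a larger effective batch size without extra VRAM. Below is a minimal, self-contained sketch of the pattern; `TinyDenoiser` and the random latents are stand-ins for the actual Latte model and dataloader, not part of this repository:

```python
import torch
import torch.nn as nn

# Placeholder model standing in for the Latte transformer.
class TinyDenoiser(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return self.net(x)

model = TinyDenoiser()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

accum_steps = 16            # effective batch size = per-GPU batch (1) * accum_steps
optimizer.zero_grad()

for step in range(64):                       # stands in for a DataLoader with batch_size=1
    latent = torch.randn(1, 64)              # one "video latent" per micro-batch
    noise = torch.randn_like(latent)
    pred = model(latent + noise)
    loss = nn.functional.mse_loss(pred, noise) / accum_steps  # scale so gradients average
    loss.backward()                          # gradients accumulate across micro-batches
    if (step + 1) % accum_steps == 0:
        optimizer.step()                     # one optimizer update per effective batch
        optimizer.zero_grad()
```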