johannakarras / DreamPose

Official implementation of "DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion"

Fine-tuning consumes quite a lot of GPU memory #26

Closed weizmann closed 1 year ago

weizmann commented 1 year ago

I used 10 images for UNet fine-tuning, but I got a CUDA out-of-memory error.

It seems that the fine-tuning process needs quite a lot of GPU memory. Is this behavior expected?

(screenshot of the error attached)

Originally posted by @weizmann in https://github.com/johannakarras/DreamPose/issues/23#issuecomment-1518923306

weizmann commented 1 year ago

Q1: I found that the UNet really needs a lot of GPU memory. Is there any way to reduce the memory cost of fine-tuning?

Q2: How many images are recommended when fine-tuning the UNet?

The following is my debug log printed from finetune-unet.py:

(debug log screenshots attached)
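Regarding Q1, here is a minimal sketch of memory-saving switches that are commonly combined when training a diffusers-style UNet such as the one finetune-unet.py loads. The helper name, the learning rate, and the assumption that each switch applies to this particular script are all illustrative, not the repo's defaults.

```python
import torch

def apply_memory_savings(unet, lr: float = 1e-5):
    """Apply common diffusers memory-saving switches to a UNet before training.

    `unet` is assumed to be a diffusers UNet2DConditionModel; the switches and
    learning rate here are a sketch, not the repo's configuration.
    """
    # Recompute activations during the backward pass instead of storing them all.
    unet.enable_gradient_checkpointing()

    # Memory-efficient attention; requires the xformers package to be installed.
    try:
        unet.enable_xformers_memory_efficient_attention()
    except Exception:
        pass  # fall back to standard attention if xformers is unavailable

    # 8-bit Adam keeps optimizer state quantized; requires bitsandbytes.
    try:
        import bitsandbytes as bnb
        optimizer = bnb.optim.AdamW8bit(unet.parameters(), lr=lr)
    except ImportError:
        optimizer = torch.optim.AdamW(unet.parameters(), lr=lr)
    return optimizer
```

Running the script with fp16 mixed precision (if it supports it) and lowering train_batch_size, as noted later in this thread, are the other usual levers.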

weizmann commented 1 year ago

Update:

Great! I switched to a 32 GB V100 GPU to run the fine-tuning, and it is currently running.

I think a 32 GB GPU is the minimum configuration for fine-tuning.

(screenshots attached)
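To check how close a run actually comes to the 32 GB ceiling, a small sketch in plain PyTorch (no DreamPose-specific code) that can be called after a few training steps:

```python
import torch

def report_gpu_memory(device: int = 0) -> None:
    """Print total device memory and the peak memory used so far."""
    props = torch.cuda.get_device_properties(device)
    total_gb = props.total_memory / 1024 ** 3
    peak_alloc_gb = torch.cuda.max_memory_allocated(device) / 1024 ** 3
    peak_reserved_gb = torch.cuda.max_memory_reserved(device) / 1024 ** 3
    print(f"{props.name}: {total_gb:.1f} GB total, "
          f"peak allocated {peak_alloc_gb:.1f} GB, "
          f"peak reserved {peak_reserved_gb:.1f} GB")

# Call after a few training steps to see the remaining headroom.
report_gpu_memory()
```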

weizmann commented 1 year ago

I finally got through the UNet fine-tuning, but fine-tuning the VAE still fails on my 32 GB V100 GPU...

Is there any way to reduce the GPU memory cost? It would be a bit expensive to use a 64 GB GPU server for fine-tuning.

I am currently using 10 UBC Fashion instance frames (extracted with ffmpeg from a video, resolution 720 × 940) in the demo/instance_data directory for tuning. https://vision.cs.ubc.ca/datasets/fashion/resources/fashion_dataset/fashion_train.txt

I am not sure whether I would get an equivalent result if I used a smaller resolution.

Waiting for your reply, thanks.

(screenshot attached)
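On the resolution question: activation memory in the UNet scales roughly with the number of pixels, so downscaling the extracted frames is a cheap experiment, although whether DreamPose's output quality holds at the lower resolution is exactly the open question here. A minimal sketch, assuming Pillow is installed, that the frames live in demo/instance_data as described above, and that ffmpeg wrote .png files; the output directory name is made up:

```python
from pathlib import Path
from PIL import Image

SRC = Path("demo/instance_data")        # frame directory mentioned above
DST = Path("demo/instance_data_small")  # hypothetical output directory
DST.mkdir(exist_ok=True)

def round_down_to_multiple_of_8(x: int) -> int:
    # The Stable Diffusion VAE downsamples by a factor of 8, so keep
    # both dimensions divisible by 8.
    return max(8, (x // 8) * 8)

# Halve each dimension: 720 x 940 becomes roughly a quarter of the pixels.
for frame in sorted(SRC.glob("*.png")):  # adjust the pattern if frames are .jpg
    img = Image.open(frame)
    new_size = (round_down_to_multiple_of_8(img.width // 2),
                round_down_to_multiple_of_8(img.height // 2))
    img.resize(new_size, Image.LANCZOS).save(DST / frame.name)
```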

weizmann commented 1 year ago

Setting the train_batch_size parameter to 2 avoids the CUDA out-of-memory crash.
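If the smaller batch hurts convergence, gradient accumulation keeps the effective batch size while only paying the activation memory of train_batch_size samples per step. A minimal sketch using Hugging Face accelerate's documented accumulation pattern; whether finetune-unet.py already exposes an equivalent option is an assumption to verify:

```python
from accelerate import Accelerator

def train_with_accumulation(unet, optimizer, dataloader, compute_loss, accum_steps=4):
    """Micro-batches of train_batch_size; effective batch = train_batch_size * accum_steps."""
    accelerator = Accelerator(gradient_accumulation_steps=accum_steps)
    unet, optimizer, dataloader = accelerator.prepare(unet, optimizer, dataloader)

    for batch in dataloader:
        # The prepared optimizer only applies a step and syncs gradients every
        # `accum_steps` iterations; peak memory stays at the micro-batch level.
        with accelerator.accumulate(unet):
            loss = compute_loss(unet, batch)  # placeholder for the script's loss
            accelerator.backward(loss)
            optimizer.step()
            optimizer.zero_grad()
```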