open-mmlab / StyleShot

StyleShot: A SnapShot on Any Style. A model that transfers any style onto any content, generating high-quality, personalized stylized images without per-image fine-tuning!
https://styleshot.github.io/
MIT License
209 stars · 12 forks

GPU Memory outage #17

Closed TasosTzaras closed 2 weeks ago

TasosTzaras commented 3 weeks ago

I'm just running styleshot_image_driven_demo.py on an RTX 2080 Ti with 11 GB of VRAM and I get this error:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 600.00 MiB. GPU 0 has a total capacty of 10.75 GiB of which 501.31 MiB is free. Including non-PyTorch memory, this process has 9.95 GiB memory in use. Of the allocated memory 9.64 GiB is allocated by PyTorch, and 141.83 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
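As the traceback itself suggests, allocator fragmentation can sometimes be reduced by setting max_split_size_mb through PYTORCH_CUDA_ALLOC_CONF before CUDA initializes (a hedged aside; the value below is illustrative, not tuned):

```python
import os

# Must be set before the first CUDA allocation; 128 MiB is only an example value.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # import torch only after setting the env var
```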

Is style transfer between just two images really such a heavy procedure?

Jeoyal commented 3 weeks ago

Hi @TasosTzaras, thank you for your interest in our work. StyleShot loads the style-aware encoder, the content fusion encoder (if needed), and SD v1.5 into GPU memory, typically requiring 8.36 to 10.5 GB of VRAM. You can opt not to load the style-aware encoder into GPU memory to free up additional VRAM (line 467 in ip_adapter.py). We also recommend trying our online demo on OpenXlab.
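To see how much headroom the card actually has before and after that change, a quick check (a minimal sketch, not part of StyleShot's code):

```python
import torch

# Free vs. total VRAM on GPU 0, reported in GiB.
free, total = torch.cuda.mem_get_info(0)
print(f"free: {free / 2**30:.2f} GiB / total: {total / 2**30:.2f} GiB")
```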

TasosTzaras commented 3 weeks ago

Commenting out that line (467) throws an error, because the rest of the code uses the style_aware_encoder. What should I do to fix it?

Jeoyal commented 3 weeks ago

Please do not comment out any lines. Instead, insert self.style_device = "cpu" at line 466, then modify lines 467 and 547 so that line 467 reads self.style_aware_encoder = Style_Aware_Encoder(CLIPVisionModelWithProjection.from_pretrained(transformer_patch)).to(self.style_device, dtype=torch.float32) and line 547 reads style_image = StyleProcessor(style_image, self.style_device).
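Put together, the edit looks roughly like this (a sketch of just the affected lines in ip_adapter.py; the surrounding class code is omitted):

```python
# line 466: keep the style-aware encoder on the CPU to free GPU VRAM
self.style_device = "cpu"

# line 467: load the encoder onto self.style_device in float32
self.style_aware_encoder = Style_Aware_Encoder(
    CLIPVisionModelWithProjection.from_pretrained(transformer_patch)
).to(self.style_device, dtype=torch.float32)

# line 547: preprocess the style image on the same device
style_image = StyleProcessor(style_image, self.style_device)
```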

TasosTzaras commented 3 weeks ago

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 374.00 MiB. GPU 0 has a total capacty of 10.75 GiB of which 344.19 MiB is free. Including non-PyTorch memory, this process has 9.96 GiB memory in use. Of the allocated memory 8.99 GiB is allocated by PyTorch, and 788.04 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

It still crashes.

Jeoyal commented 3 weeks ago

What is your batch_size value? You can also change every torch.float32 to torch.float16 to free up more VRAM.
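For the dtype change, the idea is to load the weights in half precision rather than float32, which roughly halves the model's VRAM footprint. A generic sketch assuming a diffusers-style SD v1.5 pipeline (StyleShot's actual loading code may differ; the model id is illustrative):

```python
import torch
from diffusers import StableDiffusionPipeline

# Load SD v1.5 in float16 instead of float32 to cut its VRAM use roughly in half.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
```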

Jeoyal commented 3 weeks ago

What is the resolution of your input images?

TasosTzaras commented 3 weeks ago

I can't change the batch size at inference, only in training. Target image: 1920x1276; style image: 1879x1500.

Jeoyal commented 3 weeks ago

Simply resizing your target image (content image) to a lower resolution, typically 512x512, should be fine.
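For example, with Pillow (file names are placeholders):

```python
from PIL import Image

# Downscale the content image to 512x512 before feeding it to the demo.
content = Image.open("content.png").convert("RGB")
content = content.resize((512, 512), Image.LANCZOS)
content.save("content_512.png")
```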

TasosTzaras commented 3 weeks ago

That was my last resort and I'd rather not do it, but I guess it's my only option. I didn't realize it was such a heavy procedure. Thank you anyway!

Jeoyal commented 3 weeks ago

You're welcome! (1024x1024 might be fine, I guess.) By the way, you can try higher resolutions on our free online demo on OpenXlab. (Changing the dtype from torch.float32 to a lower precision might also help.)

TasosTzaras commented 2 weeks ago

By the way, when I change everything to float16 I get this error: RuntimeError: "slow_conv2d_cpu" not implemented for 'Half' :D

Jeoyal commented 2 weeks ago

This is likely because the style-aware encoder was moved to the CPU in the previous steps, and CPU ops are not compatible with float16. You can either load all models and data onto the CPU and run in float32, or run only the style-aware encoder on the CPU in float32 and convert its output to float16.
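The second option could look something like this (a sketch with hypothetical variable names, not the repo's exact code):

```python
# The style-aware encoder stays on the CPU in float32, since CPU conv
# kernels do not support float16 ("slow_conv2d_cpu" not implemented for 'Half').
style_embeds = style_aware_encoder(style_image)  # runs on CPU in float32

# Cast the output and move it to the GPU for the float16 pipeline.
style_embeds = style_embeds.to("cuda", dtype=torch.float16)
```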

TasosTzaras commented 2 weeks ago

Thank you so much! I reverted everything to the original code, just changed float32 to float16, and it worked!

Jeoyal commented 2 weeks ago

You're welcome! Hope you achieve the results you desire. :)