Closed: TasosTzaras closed this issue 2 weeks ago
Hi @TasosTzaras, thank you for your interest in our work. StyleShot loads the style-aware encoder, the content fusion encoder (if needed), and SD v1.5 into GPU memory, typically requiring 8.36 to 10.5 GB of VRAM. You can opt not to load the style-aware encoder into GPU memory to free up additional VRAM (line 467 in ip_adapter.py). We also recommend trying our online demo on OpenXlab.
Commenting out that line (467) throws an error, because the whole codebase uses the style_aware_encoder. What should I do to fix it?
Please do not comment out any lines. Instead, insert
self.style_device = "cpu"
at line 466, change line 467 to
self.style_aware_encoder = Style_Aware_Encoder(CLIPVisionModelWithProjection.from_pretrained(transformer_patch)).to(self.style_device, dtype=torch.float32)
and change line 547 to
style_image = StyleProcessor(style_image, self.style_device)
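Put together, the edits above amount to the following offloading pattern. This is a minimal sketch, not the actual ip_adapter.py code: a small nn.Linear stands in for Style_Aware_Encoder(CLIPVisionModelWithProjection.from_pretrained(transformer_patch)).

```python
import torch
import torch.nn as nn

# Generic stand-in for the style-aware encoder; in ip_adapter.py this is
# the Style_Aware_Encoder built from CLIPVisionModelWithProjection.
encoder = nn.Linear(8, 4)

# Keep the encoder on the CPU in float32 so it never occupies GPU VRAM
# (the suggested edit at lines 466-467).
style_device = "cpu"
encoder = encoder.to(style_device, dtype=torch.float32)

# Inputs fed to the encoder must live on the same device
# (the suggested edit at line 547).
style_image = torch.randn(1, 8).to(style_device)
with torch.no_grad():
    style_embeds = encoder(style_image)
```

The rest of the pipeline (SD v1.5 and the content fusion encoder) stays on the GPU; only the style branch runs on the CPU.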
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 374.00 MiB. GPU 0 has a total capacty of 10.75 GiB of which 344.19 MiB is free. Including non-PyTorch memory, this process has 9.96 GiB memory in use. Of the allocated memory 8.99 GiB is allocated by PyTorch, and 788.04 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
it still crashes.
What is your batch_size value? You can also change all torch.float32 to torch.float16 to free more VRAM.
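The float16 suggestion works because half precision stores each weight in 2 bytes instead of 4, roughly halving model VRAM. A minimal sketch (a small nn.Linear stands in for one of StyleShot's submodules):

```python
import torch
import torch.nn as nn

# Stand-in for one of the float32 submodules StyleShot keeps in VRAM.
model = nn.Linear(1024, 1024)
fp32_bytes = sum(p.numel() * p.element_size() for p in model.parameters())

# torch.float32 -> torch.float16: every parameter shrinks from 4 bytes to 2.
model = model.half()
fp16_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
```

Activations cast to float16 shrink by the same factor, which is where most of the savings at inference come from.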
What is the resolution of the image you input?
I can't change the batch size at inference, only in training. Target image: 1920x1276, style image: 1879x1500.
Simply resizing your target image (content image) to a lower resolution, typically 512x512, should be fine.
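One way to pick the new size is to scale the longer side down to 512 while preserving the aspect ratio, snapping to a multiple of 8 since SD v1.5 expects dimensions divisible by 8. A small helper sketch (fit_within is a hypothetical name, not part of the StyleShot codebase):

```python
def fit_within(width, height, max_side=512, multiple=8):
    """Scale (width, height) so the longer side becomes max_side,
    keeping the aspect ratio and snapping down to a multiple of 8."""
    scale = max_side / max(width, height)
    w = round(width * scale) // multiple * multiple
    h = round(height * scale) // multiple * multiple
    return w, h

# The 1920x1276 content image from this thread:
print(fit_within(1920, 1276))  # -> (512, 336)
```

The resulting size can then be passed to any resize call (e.g. PIL's Image.resize) before running the demo.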
That was my last resort and I would have liked to avoid it, but I guess it's my only option. I didn't know it was such a heavy procedure. Thank you anyway!
You're welcome! (1024x1024 might be fine, I guess.) By the way, you can try higher resolutions on our free online demo at OpenXlab. (Changing the dtype from torch.float32 to a lower precision might also help.)
By the way, when I change everything to float16 I get this error: RuntimeError: "slow_conv2d_cpu" not implemented for 'Half' :D
This is likely because the style-aware encoder was moved to the CPU in the previous steps, and the CPU convolution kernels do not support float16. You can either load all models and data onto the CPU and run everything in float32, or run only the style-aware encoder on the CPU in float32 and then convert its output to float16.
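The second option looks roughly like this sketch (again using a small nn.Linear as a stand-in for the style-aware encoder):

```python
import torch
import torch.nn as nn

# Stand-in for the style-aware encoder kept on the CPU. CPU conv kernels
# (e.g. slow_conv2d_cpu) lack float16 support, so this branch stays float32.
encoder = nn.Linear(8, 4)

x = torch.randn(1, 8, dtype=torch.float32)
with torch.no_grad():
    emb = encoder(x)  # computed on the CPU in float32

# Cast the output before handing it to the float16 pipeline
# (append .to("cuda") when a GPU is available).
emb = emb.to(dtype=torch.float16)
```

This keeps the unsupported float16 ops off the CPU while letting the GPU side of the pipeline stay in half precision.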
Thank you so much! I changed everything back to the original code and just converted float32 to float16, and it worked!
You're welcome! Hope you achieve the results you desire. :)
I'm just running styleshot_image_driven_demo.py on an RTX 2080 Ti with 11 GB of memory and I get this error:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 600.00 MiB. GPU 0 has a total capacty of 10.75 GiB of which 501.31 MiB is free. Including non-PyTorch memory, this process has 9.95 GiB memory in use. Of the allocated memory 9.64 GiB is allocated by PyTorch, and 141.83 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Is it really such a heavy procedure to just run style transfer on two images?
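As a side note, the traceback above points to one mitigation worth trying before any code changes: setting the allocator's max_split_size_mb to reduce fragmentation. A sketch (128 is an illustrative starting value, not a maintainer recommendation):

```shell
# Cap the CUDA caching allocator's split size to reduce fragmentation,
# as suggested by the PyTorch OOM message itself.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
# then launch the demo as usual:
# python styleshot_image_driven_demo.py
```

This only helps when "reserved but unallocated" memory is large; it cannot recover VRAM that is genuinely in use by the models.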