I am trying to scale an image that is 1024x1024 up to 4096x4096.
I have a 4090 GPU with 24 GB of VRAM.
When I try to run this upscaler with a Scale Factor of 4, I get an OOM error reporting that the system is trying to allocate over 36 GB of VRAM to run the upscale.
Actual error:

```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 36.00 GiB. GPU 0 has a total capacty of 23.99 GiB of which 2.18 GiB is free. Of the allocated memory 18.77 GiB is allocated by PyTorch, and 348.81 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
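The error itself points at max_split_size_mb; as I understand it, that goes through the PYTORCH_CUDA_ALLOC_CONF environment variable and has to be set before torch initializes CUDA. A minimal sketch (512 is just a starting value I picked, not a documented recommendation):

```python
import os

# The allocator reads PYTORCH_CUDA_ALLOC_CONF when CUDA initializes,
# so set it before importing torch. 512 MB is an arbitrary cap on
# allocator block splitting, meant to reduce fragmentation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

import torch  # noqa: E402
```

That said, fragmentation would only explain a shortfall of a few GiB; a single 36 GiB request can never succeed on a 24 GiB card no matter how the allocator is tuned.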
The 18.77 GiB of memory allocated by PyTorch is what has already built up by the time the OOM is hit, so that should be roughly what is available to this process.
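For reference, torch exposes counters that correspond to the numbers in that message; a quick sanity check would look something like this:

```python
import torch

# Compare what CUDA reports as free/total against what PyTorch itself
# is holding, mirroring the figures quoted in the OOM message.
free, total = torch.cuda.mem_get_info()     # bytes free / total on the current GPU
allocated = torch.cuda.memory_allocated()   # bytes in live tensors
reserved = torch.cuda.memory_reserved()     # bytes held by the caching allocator
gib = 1024 ** 3
print(f"free {free / gib:.2f} GiB of {total / gib:.2f} GiB")
print(f"allocated {allocated / gib:.2f} GiB, reserved {reserved / gib:.2f} GiB")
```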
These are the settings I am running with; I am not using TiledVAE or anything else.
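For what it's worth, if I understand the TiledVAE option correctly, the equivalent in plain diffusers would look roughly like the sketch below (the model ID is just an example, not the pruned checkpoint I am actually using):

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Tiled VAE decoding splits the 4096x4096 decode into chunks so the
# VAE never needs one giant activation buffer. Example model only.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")
pipe.enable_vae_tiling()  # cap peak VRAM during the decode step
```

My understanding is that tiling only caps the VAE decode peak, so if the 36 GiB request is coming from somewhere else it would not be the whole story.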
I am using a pruned SDXL checkpoint as the model for this (not a Lightning model). I have tried a couple of different SDXL models just to make sure the problem isn't limited to one particular checkpoint.
Running with these settings but a Scale Factor of 2 works just fine. What am I doing wrong?
Issue #394 describes an OOM error when scale is set to 1. Maybe this is the same issue and the "Resize To" sliders need to be set to 4096, even though the scale factor and 4x model are selected.