17Reset opened this issue 3 months ago (Open)
I have multiple GPUs, but it only runs on one.
```
(deblur_venv) xlab@xlab:/mnt/DiffBIR$ python -u inference.py --version v2 --task sr --upscale 4 --cfg_scale 4.0 --input /mnt/image_demos/ --output /mnt/image_demos/ouput --device cuda
use sdp attention as default
keep default attention mode
using device cuda
[3, 3, 64, 23, 32, 4]
Downloading: "https://github.com/cszn/KAIR/releases/download/v1.0/BSRNet.pth" to /mnt/DiffBIR/weights/BSRNet.pth
100%|...| 63.9M/63.9M [00:07<00:00, 8.92MB/s]
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is None and using 5 heads.
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is 1024 and using 5 heads.
... (repeated for every attention layer: query dims 320/640/1280 with 5/10/20 heads) ...
building SDPAttnBlock (sdp) with 512 in_channels
building SDPAttnBlock (sdp) with 512 in_channels
... (the ControlNet's SDPCrossAttention layers are set up the same way) ...
Downloading: "https://huggingface.co/stabilityai/stable-diffusion-2-1-base/resolve/main/v2-1_512-ema-pruned.ckpt" to /mnt/DiffBIR/weights/v2-1_512-ema-pruned.ckpt
100%|...| 4.86G/4.86G [12:50<00:00, 6.77MB/s]
strictly load pretrained sd_v2.1, unused weights: {'posterior_mean_coef1', 'posterior_mean_coef2', 'posterior_variance', 'betas', 'model_ema.num_updates', 'alphas_cumprod', 'sqrt_recipm1_alphas_cumprod', 'log_one_minus_alphas_cumprod', 'sqrt_one_minus_alphas_cumprod', 'alphas_cumprod_prev', 'posterior_log_variance_clipped', 'model_ema.decay', 'sqrt_alphas_cumprod', 'sqrt_recip_alphas_cumprod'}
Downloading: "https://huggingface.co/lxq007/DiffBIR-v2/resolve/main/v2.pth" to /mnt/DiffBIR/weights/v2.pth
100%|...| 1.35G/1.35G [03:07<00:00, 7.75MB/s]
strictly load controlnet weight
load lq: /mnt/image_demos/01.png
Spaced Sampler: 100%|...| 50/50 [01:37<00:00, 1.96s/it]
save result to /mnt/image_demos/ouput/01.png
load lq: /mnt/image_demos/input2.jpg
Spaced Sampler: 100%|...| 50/50 [00:46<00:00, 1.07it/s]
save result to /mnt/image_demos/ouput/input2.png
load lq: /mnt/image_demos/oringnal.jpeg
Traceback (most recent call last):
  File "/mnt/DiffBIR/inference.py", line 86, in <module>
    main()
  File "/mnt/DiffBIR/inference.py", line 81, in main
    supported_tasks[args.task](args).run()
  File "/mnt/deblur_venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/DiffBIR/utils/inference.py", line 147, in run
    sample = self.pipeline.run(
  File "/mnt/deblur_venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/DiffBIR/utils/helpers.py", line 148, in run
    sample = self.run_stage2(
  File "/mnt/DiffBIR/utils/helpers.py", line 86, in run_stage2
    cond = self.cldm.prepare_condition(pad_clean, [pos_prompt] * bs)
  File "/mnt/DiffBIR/model/cldm.py", line 133, in prepare_condition
    c_img=self.vae_encode(clean * 2 - 1, sample=False)
  File "/mnt/DiffBIR/model/cldm.py", line 96, in vae_encode
    return self.vae.encode(image).mode() * self.scale_factor
  File "/mnt/DiffBIR/model/vae.py", line 550, in encode
    h = self.encoder(x)
  File "/mnt/deblur_venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/mnt/deblur_venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/DiffBIR/model/vae.py", line 414, in forward
    h = self.mid.attn_1(h)
  File "/mnt/deblur_venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/mnt/deblur_venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/DiffBIR/model/vae.py", line 295, in forward
    out = F.scaled_dot_product_attention(q, k, v)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 42.76 GiB. GPU 0 has a total capacity of 47.41 GiB of which 9.80 GiB is free. Including non-PyTorch memory, this process has 37.58 GiB memory in use. Of the allocated memory 24.53 GiB is allocated by PyTorch, and 12.55 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
```
Hello, did you solve the problem? I have the same question.
I have a 24 GB card but it's not enough, so I have another card with 12 GB; however it's not possible to use both to increase memory :( Could it at least be made possible to run the guidance on the other card?
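As far as the log shows, `inference.py` takes a single `--device` and places the whole pipeline (cleaner, ControlNet/UNet and VAE) on it, so a second card is never touched. Splitting the model by hand is possible in principle; a rough sketch, using the attribute names visible in the traceback (`cldm.vae`, `scale_factor`), would keep one sub-module on the other card and shuttle tensors across. This is only an illustration, the calling code in `utils/helpers.py` would need matching changes, and note that it would not have fixed the OOM above, since the single 42.76 GiB attention buffer does not fit on either card anyway.

```python
import torch

# Hypothetical sketch: keep the VAE on a second GPU while the rest of the
# model stays on the main GPU. Attribute names are taken from the traceback
# above (model/cldm.py: self.vae, self.scale_factor); this is not working
# DiffBIR code, just an outline of the device placement.
def vae_encode_on_other_gpu(cldm, image, vae_device="cuda:1", main_device="cuda:0"):
    cldm.vae.to(vae_device)  # in real code, move once at load time, not per call
    latent = cldm.vae.encode(image.to(vae_device)).mode() * cldm.scale_factor
    return latent.to(main_device)  # hand the latent back to the UNet's device
```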
Same issue here. I'm trying to modify the code to use PyTorch Lightning.
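Since every input image is restored independently, a simpler way to use all GPUs than Lightning-style model sharding is data parallelism over files: launch one `inference.py` process per GPU, each pinned with `CUDA_VISIBLE_DEVICES` and fed a disjoint subset of the inputs. A small launcher sketch follows; the flags and paths are copied from the command in the log above, while the sharding into temporary sub-directories is my own assumption about how to feed `--input`.

```python
import os
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical launcher: shard the input images across GPUs and run one
# inference.py process per GPU. Paths/flags mirror the command above.
INPUT_DIR = Path("/mnt/image_demos")
OUTPUT_DIR = Path("/mnt/image_demos/ouput")
NUM_GPUS = 2

images = sorted(p for p in INPUT_DIR.iterdir()
                if p.suffix.lower() in {".png", ".jpg", ".jpeg"})
shards = [images[i::NUM_GPUS] for i in range(NUM_GPUS)]

procs = []
with tempfile.TemporaryDirectory() as tmp:
    for gpu, shard in enumerate(shards):
        shard_dir = Path(tmp) / f"gpu{gpu}"
        shard_dir.mkdir()
        for img in shard:
            shutil.copy(img, shard_dir / img.name)  # or symlink to avoid copies
        cmd = [
            "python", "-u", "inference.py",
            "--version", "v2", "--task", "sr",
            "--upscale", "4", "--cfg_scale", "4.0",
            "--input", str(shard_dir), "--output", str(OUTPUT_DIR),
            "--device", "cuda",
        ]
        # CUDA_VISIBLE_DEVICES makes "cuda" resolve to a different physical GPU
        # in each child process.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        procs.append(subprocess.Popen(cmd, env=env))
    for p in procs:
        p.wait()
```

This roughly divides wall-clock time by the number of cards for batches of images, but each process still has to fit the full model, so a 12 GB card may only cope with smaller inputs.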