Open llstela opened 6 months ago
I used ~20 GB.
can you offer the complete output from the terminal?
This is the output I tried on 3090 (24GB):
(base) root@a83b401f11b6:/gdata/cold1/shengxuhan/codes/AIGC/PDM-Pure# python pdm_pure.py --image demo/advdm/original.png --save_path demo/advdm/ --device 1
FORCE_MEM_EFFICIENT_ATTN= 0 @UNET:QKVATTENTION
/opt/conda/lib/python3.9/site-packages/huggingface_hub/file_download.py:1125: FutureWarning: The `force_filename` parameter is deprecated as a new caching system, which keeps the filenames as they are on the Hub, is now in place.
warnings.warn(
/opt/conda/lib/python3.9/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Keyword arguments {'token': None} are not expected by StableDiffusionUpscalePipeline and will be ignored.
Begin to purify demo/advdm/original.png----------
100%|██████████| 50/50 [00:18<00:00, 2.65it/s]
100%|██████████| 50/50 [00:15<00:00, 3.30it/s]
Traceback (most recent call last):
  File "/gdata/cold1/shengxuhan/codes/AIGC/PDM-Pure/pdm_pure.py", line 65, in <module>
    main()
  File "/gdata/cold1/shengxuhan/codes/AIGC/PDM-Pure/pdm_pure.py", line 41, in main
    result = style_transfer(
  File "/opt/conda/lib/python3.9/site-packages/deepfloyd_if/pipelines/style_transfer.py", line 123, in style_transfer
    _stageIII_generations, _meta = if_III.embeddings_to_image(**if_III_kwargs)
  File "/opt/conda/lib/python3.9/site-packages/deepfloyd_if/modules/stage_III_sd_x4.py", line 80, in embeddings_to_image
    images = self.model(**metadata).images
  File "/opt/conda/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_upscale.py", line 727, in __call__
    image = self.vae.decode(latents).sample
  File "/opt/conda/lib/python3.9/site-packages/diffusers/models/autoencoder_kl.py", line 191, in decode
    decoded = self._decode(z).sample
  File "/opt/conda/lib/python3.9/site-packages/diffusers/models/autoencoder_kl.py", line 178, in _decode
    dec = self.decoder(z)
  File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/diffusers/models/vae.py", line 233, in forward
    sample = self.mid_block(sample)
  File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/diffusers/models/unet_2d_blocks.py", line 463, in forward
    hidden_states = attn(hidden_states)
  File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/diffusers/models/attention.py", line 168, in forward
    torch.empty(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 8.00 GiB (GPU 1; 23.69 GiB total capacity; 13.30 GiB already allocated; 7.71 GiB free; 15.27 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
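As the error message itself suggests, one low-effort mitigation is to cap the CUDA caching allocator's split size to reduce fragmentation. This is only a sketch: the 128 MB value below is a commonly tried starting point, not a verified fix for this pipeline, and the script arguments are copied from the log above.

```shell
# Limit the allocator's split size to fight fragmentation
# (suggested by the OOM message itself; 128 MB is an assumed, not tuned, value).
PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 \
  python pdm_pure.py --image demo/advdm/original.png --save_path demo/advdm/ --device 1
```

This only helps when reserved memory far exceeds allocated memory; it cannot shrink the 8 GiB attention tensor itself, so memory-efficient attention is still the more direct fix.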
Hi, thank you for your excellent work. I'm facing a similar issue—running out of memory with the A40 GPU, which has 46GB of memory, when executing pdm_pure.py at a 512x512 resolution. Do you have any suggestions?
Update: Solved by using xformers: FORCE_MEM_EFFICIENT_ATTN=1 python xxx
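For anyone hitting the same wall, the fix above can be sketched as two shell commands. The `pdm_pure.py` arguments are taken from the log earlier in this thread and may differ in your setup; the xformers version pin comes from a later comment citing DeepFloyd IF.

```shell
# Install the memory-efficient attention backend
# (0.0.16 is the pin reported to work with DeepFloyd IF).
pip install xformers==0.0.16

# Re-run with memory-efficient attention enabled via the env var
# that pdm_pure.py reads at startup (see FORCE_MEM_EFFICIENT_ATTN in the log).
FORCE_MEM_EFFICIENT_ATTN=1 \
  python pdm_pure.py --image demo/advdm/original.png --save_path demo/advdm/ --device 1
```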
still not solved. I have given up.
I think you need to use memory-efficient attention; otherwise it will be too costly.
You also need `pip install xformers==0.0.16`. You can find it in DeepFloyd IF.
I tried 32GB and 24GB GPUs to run your demo code, but both failed with CUDA out of memory.