YangLing0818 / RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
https://proceedings.mlr.press/v235/yang24ai.html
MIT License
1.7k stars, 99 forks

24GB VRAM not enough? #7

Open JosefKuchar opened 10 months ago

JosefKuchar commented 10 months ago

Ran into an OOM error when running the demo. Is 24 GB not enough?

(rpg) xkucha28@pcknot6:/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster$ python RPG.py --demo
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/pytorch_lightning/utilities/distributed.py:258: LightningDeprecationWarning: `pytorch_lightning.utilities.distributed.rank_zero_only` has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from `pytorch_lightning.utilities` instead.
  rank_zero_deprecation(
Style database not found: /mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/styles.csv
Script ScriptClassData(script_class=<class 'rp.py.Script'>, path='/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/scripts/rp.py', basedir='/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster', module=<module 'rp.py' from '/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/scripts/rp.py'>)
script <rp.py.Script object at 0x7fdd3df435e0>
script.filename /mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/scripts/rp.py
self.selectable [<rp.py.Script object at 0x7fdd3df435e0>]
txt2img_scripts [<rp.py.Script object at 0x7fdd3df435e0>]
select_checkpoint: albedobaseXL_v20.safetensors
Checkpoint albedobaseXL_v20.safetensors not found; loading fallback v1-5-pruned-emaonly.safetensors [6ce0161689]
Loading weights [6ce0161689] from /mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors
Creating model from config: /mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/configs/v1-inference.yaml
creating model quickly: OSError
Traceback (most recent call last):
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 270, in hf_raise_for_status
    response.raise_for_status()
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/None/resolve/main/config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/transformers/utils/hub.py", line 385, in cached_file
    resolved_file = hf_hub_download(
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1374, in hf_hub_download
    raise head_call_error
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1247, in hf_hub_download
    metadata = get_hf_file_metadata(
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1624, in get_hf_file_metadata
    r = _request_wrapper(
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 402, in _request_wrapper
    response = _request_wrapper(
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 426, in _request_wrapper
    hf_raise_for_status(response)
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 320, in hf_raise_for_status
    raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-65b1214b-7202ea3563c33c0b49b81028;4c9edc25-59d8-45c4-a3bb-0e9a61868d0e)

Repository Not Found for url: https://huggingface.co/None/resolve/main/config.json.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/RPG.py", line 224, in <module>
    initialize(model_name='albedobaseXL_v20.safetensors')
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/RPG.py", line 65, in initialize
    modules.sd_models.load_model(model_name=model_name)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/modules/sd_models.py", line 636, in load_model
    sd_model = instantiate_from_config(sd_config.model)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/stable-diffusion-stability-ai/ldm/util.py", line 89, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 563, in __init__
    self.instantiate_cond_stage(cond_stage_config)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 630, in instantiate_cond_stage
    model = instantiate_from_config(config)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/stable-diffusion-stability-ai/ldm/util.py", line 89, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/stable-diffusion-stability-ai/ldm/modules/encoders/modules.py", line 104, in __init__
    self.transformer = CLIPTextModel.from_pretrained(version)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/modules/sd_disable_initialization.py", line 68, in CLIPTextModel_from_pretrained
    res = self.CLIPTextModel_from_pretrained(None, *model_args, config=pretrained_model_name_or_path, state_dict={}, **kwargs)
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2928, in from_pretrained
    resolved_config_file = cached_file(
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/transformers/utils/hub.py", line 406, in cached_file
    raise EnvironmentError(
OSError: None is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`

Failed to create model quickly; will retry using slow method.
/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/pytorch_lightning/utilities/distributed.py:258: LightningDeprecationWarning: `pytorch_lightning.utilities.distributed.rank_zero_only` has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from `pytorch_lightning.utilities` instead.
  rank_zero_deprecation(
Model loaded in 3.9s (create model: 1.4s, apply weights to model: 2.0s, calculate empty prompt: 0.4s).
demo_ 0
demo
select_checkpoint: v1-5-pruned-emaonly.safetensors [6ce0161689]
process_script_args (True, False, 'Matrix', 'Columns', 'Mask', 'Prompt', '1,1,1; 1,1,1', 0.2, True, False, False, 'Attention', [False], 0, 0, 0.4, None, 0, 0, False)
fatal: No names found, cannot describe anything.
1,1,1; 1,1,1 0.2 Horizontal
Regional Prompter Active, Pos tokens : [8, 22, 25, 20, 24], Neg tokens : [0]
select_checkpoint: v1-5-pruned-emaonly.safetensors [6ce0161689]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:19<00:00,  1.03it/s]
Total progress: 20it [00:19,  1.03it/s]
demo_ 1
demo
select_checkpoint: v1-5-pruned-emaonly.safetensors [6ce0161689]
process_script_args (True, False, 'Matrix', 'Columns', 'Mask', 'Prompt', '1,1,1;2,1,1;4,3,2,3', 0.2, True, False, False, 'Attention', [False], 0, 0, 0.4, None, 0, 0, False)
1,1,1;2,1,1;4,3,2,3 0.2 Horizontal
Regional Prompter Active, Pos tokens : [70, 7, 7, 13, 13, 18, 15, 18], Neg tokens : [0]
select_checkpoint: v1-5-pruned-emaonly.safetensors [6ce0161689]
  0%|                                                                                                                                                       | 0/20 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/RPG.py", line 225, in <module>
    demo_version(demo_list)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/RPG.py", line 160, in demo_version
    image=RPG(user_prompt=user_prompt,
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/RPG.py", line 108, in RPG
    image, _, _, _ = modules.txt2img.txt2img(
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/modules/txt2img.py", line 73, in txt2img
    processed = process_images(p)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/modules/processing.py", line 734, in process_images
    res = process_images_inner(p)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/modules/processing.py", line 868, in process_images_inner
    samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/modules/processing.py", line 1142, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/modules/sd_samplers_kdiffusion.py", line 235, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/modules/sd_samplers_common.py", line 261, in launch_sampling
    return func()
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/modules/sd_samplers_kdiffusion.py", line 235, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/k-diffusion/k_diffusion/sampling.py", line 594, in sample_dpmpp_2m
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/modules/sd_samplers_cfg_denoiser.py", line 188, in forward
    x_out[a:b] = self.inner_model(x_in[a:b], sigma_in[a:b], cond=make_condition_dict(c_crossattn, image_cond_in[a:b]))
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/k-diffusion/k_diffusion/external.py", line 112, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/k-diffusion/k_diffusion/external.py", line 138, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/modules/sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/modules/sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 858, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 1335, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/modules/sd_unet.py", line 91, in UNetModel_forward
    return original_forward(self, x, timesteps, context, *args, **kwargs)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 797, in forward
    h = module(h, emb, context)
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
    x = layer(x, context)
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 334, in forward
    x = block(x, context=context[i])
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 269, in forward
    return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 121, in checkpoint
    return CheckpointFunction.apply(func, len(inputs), *args)
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 136, in forward
    output_tensors = ctx.run_function(*ctx.input_tensors)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 272, in _forward
    x = self.attn1(self.norm1(x), context=context if self.disable_self_attn else None) + x
  File "/tmp/xkucha28/miniconda3/envs/rpg/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/minerva1/nlp/projects/text2video/RPG-DiffusionMaster/modules/hypernetworks/hypernetwork.py", line 393, in attention_CrossAttention_forward
    sim = einsum('b i d, b j d -> b i j', q, k) * self.scale
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 16.00 GiB (GPU 0; 23.69 GiB total capacity; 18.19 GiB already allocated; 4.06 GiB free; 18.64 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
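The hint at the end of the traceback can be tried directly. A possible mitigation (assuming fragmentation is actually the issue here, which the error message only suggests) is to cap the allocator's split size before relaunching; 512 MB is just a common starting value to tune:

```shell
# PyTorch reads this variable when CUDA is first initialized,
# so export it before launching the script.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512
# then relaunch the demo:
# python RPG.py --demo
```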
YangLing0818 commented 10 months ago

Thanks for your comments. We recommend using an SDXL model as the diffusion backbone, which should cost no more than 24GB. Kindly note that memory cost is positively correlated with the complexity of the text prompt.
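The demo logs above hint at what "complexity" means in practice: the first split string (`1,1,1; 1,1,1`) sampled fine, while the larger one (`1,1,1;2,1,1;4,3,2,3`) hit the OOM. A rough sketch of counting regions in a Regional Prompter "Matrix" ratio string; this reading of the syntax (first number of each `;`-group is the row ratio, the rest are column ratios) is inferred from the logs, not taken from the extension's documentation:

```python
def count_regions(ratios: str) -> int:
    """Estimate region count for a Regional Prompter 'Matrix' split.

    Assumed reading (inferred from the demo logs above): each
    ';'-separated group is one row whose first number is the row ratio
    and whose remaining numbers are column ratios, so a group of n
    numbers yields n - 1 regions.
    """
    return sum(len(group.split(",")) - 1
               for group in ratios.split(";") if group.strip())

# The run that completed vs. the run that went OOM:
print(count_regions("1,1,1; 1,1,1"))         # 4 regions
print(count_regions("1,1,1;2,1,1;4,3,2,3"))  # 7 regions
```

This lines up with the 5- and 8-entry `Pos tokens` lists in the logs (apparently the base prompt plus one entry per region), so more regions means more conditioning to hold in memory at once.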

kowalgregy commented 10 months ago

I had this problem when running the demo (python RPG.py --demo) on 32GB VRAM; running a normal prompt worked (with API key etc.). However, I still don't understand how to use it; I got two people in the output (and that wasn't the prompt). Digging into it :)

bank010 commented 10 months ago

How can I load the model across multiple GPUs?

mrschmiklz commented 9 months ago

@YangLing0818 Is there a simple way to enable multi-gpu for this?

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 7.04 GiB. GPU 0 has a total capacty of 24.00 GiB of which 4.76 GiB is free. Including non-PyTorch memory, this process has 17179869184.00 GiB memory in use. Of the allocated memory 10.53 GiB is allocated by PyTorch, and 7.39 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I have two 3090s

Ubuntu 22.04.3 LTS (GNU/Linux 5.15.133.1-microsoft-standard-WSL2 x86_64)

Great work!
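Nothing in this thread shows a built-in multi-GPU mode, but with two cards you can at least pin the process to whichever 3090 is idle. This is a generic CUDA workaround, not an RPG feature:

```shell
# Make only the second 3090 visible to the process
# (indices are zero-based); inside the process it appears as cuda:0.
export CUDA_VISIBLE_DEVICES=1
# python RPG.py --demo
```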

adammenges commented 9 months ago

Same here: 24GB GPU (an A10), but it fails.

nvidia-smi shows 0 memory usage before running the demo.
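To confirm that figure programmatically right before launching, a small sketch that shells out to nvidia-smi (it returns an empty list where the tool is missing, e.g. on a CPU-only box):

```python
import subprocess


def free_vram_mib() -> list:
    """Free memory (MiB) per visible GPU via nvidia-smi; [] if unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.free",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return []
    return [int(line) for line in out.splitlines() if line.strip()]


print(free_vram_mib())
```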