andreasjansson / cog-stable-diffusion-inpainting

Inpainting using RunwayML's stable-diffusion-inpainting checkpoint
Apache License 2.0

Memory issue #2

Open matbmeijer opened 1 year ago

matbmeijer commented 1 year ago

When I call the Replicate API with my own images, I get a CUDA out-of-memory error. It seems to happen because the app does not resize the input image to the size the model expects, so images that are too large fail. Here are the logs:

Running predict()...
Using seed: 17664

  0%|          | 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/src/src/cog/python/cog/server/runner.py", line 288, in _run_prediction
    output = self.predictor.predict(**prediction_input)
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 12, in decorate_autocast
    return func(*args, **kwargs)
  File "/src/predict.py", line 81, in predict
    output = self.pipe(
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py", line 392, in __call__
    noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=text_embeddings).sample
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py", line 296, in forward
    sample, res_samples = downsample_block(
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/diffusers/models/unet_blocks.py", line 563, in forward
    hidden_states = attn(hidden_states, context=encoder_hidden_states)
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/diffusers/models/attention.py", line 162, in forward
    hidden_states = block(hidden_states, context=context)
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/diffusers/models/attention.py", line 211, in forward
    hidden_states = self.attn1(self.norm1(hidden_states)) + hidden_states
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/diffusers/models/attention.py", line 283, in forward
    hidden_states = self._attention(query, key, value)
  File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/diffusers/models/attention.py", line 291, in _attention
    attention_scores = torch.matmul(query, key.transpose(-1, -2)) * self.scale
RuntimeError: CUDA out of memory. Tried to allocate 71.54 GiB (GPU 0; 39.59 GiB total capacity; 3.09 GiB already allocated; 3.71 GiB free; 34.17 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
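
For anyone hitting the same error: the failing line builds the full self-attention score matrix, whose size grows with the square of the number of latent positions, so memory grows roughly with the fourth power of the image side length. The sketch below is a rough estimate, not the model's actual accounting. It assumes latents at 1/8 of the pixel resolution, a batch of 2, 8 attention heads, and 2-byte elements, none of which are confirmed by the logs. `attention_scores_bytes` and `fit_size` are hypothetical helper names, and the 512-pixel limit and multiple of 8 are assumed defaults.

```python
def attention_scores_bytes(width, height, batch=2, heads=8, bytes_per_elem=2):
    """Rough size in bytes of one self-attention score matrix.

    Assumptions (not confirmed by the logs): latents are 1/8 of the pixel
    resolution, batch=2, 8 attention heads, 2-byte (fp16) elements.
    """
    tokens = (width // 8) * (height // 8)  # latent positions attended over
    # torch.matmul(query, key.transpose(-1, -2)) yields tokens x tokens scores
    return batch * heads * tokens * tokens * bytes_per_elem


def fit_size(width, height, max_side=512, multiple=8):
    """Shrink (width, height) so the longer side is at most max_side.

    Keeps the aspect ratio approximately and rounds each side down to a
    multiple of `multiple`. Images already small enough are only rounded.
    """
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic avoids floating-point rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    new_w = max(multiple, width // multiple * multiple)
    new_h = max(multiple, height // multiple * multiple)
    return new_w, new_h


if __name__ == "__main__":
    gib = 1024 ** 3
    for w, h in [(512, 512), (2048, 2048)]:
        print(f"{w}x{h}: {attention_scores_bytes(w, h) / gib:.2f} GiB")
    print(fit_size(1920, 1080))
```

As a possible workaround until the app resizes inputs itself, the image and the mask could both be resized to the size returned by `fit_size` (for example with Pillow's `Image.resize`) before they are sent to the API.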

Cdingram commented 1 month ago

I get this almost every time :(