[Open] mvnowak opened this issue 2 years ago
**Solution:** In the `make_ddim_timesteps()` function in `ldm/modules/diffusionmodules/util.py`, change line 49, which has the following code:

```python
ddim_timesteps = np.asarray(list(range(0, num_ddpm_timesteps, c)))
```

to:

```python
ddim_timesteps = (np.arange(0, num_ddim_timesteps) * c).astype(int)
```
**Reasoning:**

The problem comes from the `make_ddim_timesteps()` function in `ldm/modules/diffusionmodules/util.py`. Line 49 has the following line of code:

```python
ddim_timesteps = np.asarray(list(range(0, num_ddpm_timesteps, c)))
```

The goal of this code is to sample `num_ddim_timesteps` evenly strided integers in the range from 0 to `num_ddpm_timesteps` at a stride of `c`, where `c = num_ddpm_timesteps // num_ddim_timesteps`. The problem is that the code often samples `num_ddim_timesteps + 1` timesteps. That is, `ddim_timesteps.shape[0]` does NOT equal `num_ddim_timesteps` (which is bad).

It just so happens that one of the bad cases (when `ddim_timesteps.shape[0] != num_ddim_timesteps`) is when `num_ddim_timesteps % 3 == 0`. Furthermore, when `num_ddim_timesteps` is a power of 3, the extra sampled integer in `ddim_timesteps` (the one that should not be there) happens to be the integer 999. Then in line 58 we compute `steps_out = ddim_timesteps + 1`, turning 999 into 1000, which results in your out-of-bounds error.
The solution to both problems (removing all bad cases where `ddim_timesteps.shape[0] != num_ddim_timesteps`, and therefore your out-of-bounds error) is to replace line 49 with the following code:

```python
ddim_timesteps = (np.arange(0, num_ddim_timesteps) * c).astype(int)
```

Important: This code is equivalent to the previous code in the working cases but excludes the extra timestep in the bad cases. Therefore, it should be a simple and general fix for both bugs. Note that I keep the same `astype(int)` notation as the `ddim_discr_method == "quad"` case on line 51.
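To illustrate the difference, here is a minimal standalone sketch, assuming the default `num_ddpm_timesteps = 1000` and the failing 27-step case (the variable names mirror those in `util.py`):

```python
import numpy as np

num_ddpm_timesteps = 1000  # total DDPM timesteps (the default)
num_ddim_timesteps = 27    # a "bad" case: a power of 3
c = num_ddpm_timesteps // num_ddim_timesteps  # 1000 // 27 == 37

# Original line 49: range() includes every multiple of 37 below 1000,
# and 27 * 37 == 999 < 1000, so it yields 28 values ending in 999.
old = np.asarray(list(range(0, num_ddpm_timesteps, c)))
print(old.shape[0], old[-1])  # -> 28 999 (and 999 + 1 == 1000 is out of bounds)

# Proposed replacement: exactly 27 values, ending at 26 * 37 == 962.
new = (np.arange(0, num_ddim_timesteps) * c).astype(int)
print(new.shape[0], new[-1])  # -> 27 962
```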
Funny enough, they had an assertion on line 56 to make sure that `ddim_timesteps.shape[0] == num_ddim_timesteps`, but it seems to be commented out. Maybe someone meant to go back and fix it but never did.
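Re-enabling that check would catch every bad case up front; a one-line sketch (the condition is the assertion from line 56 described above, the error message is just my own illustration):

```python
# fail fast whenever the sampled schedule has the wrong length
assert ddim_timesteps.shape[0] == num_ddim_timesteps, (
    f"expected {num_ddim_timesteps} DDIM timesteps, got {ddim_timesteps.shape[0]}"
)
```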
Another option would be to try `diffusers`:

```python
# make sure you're logged in with `huggingface-cli login`
from torch import autocast
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    use_auth_token=True
).to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
with autocast("cuda"):
    image = pipe(prompt, num_inference_steps=27)["sample"][0]

image.save("astronaut_rides_horse.png")
```
ryanirl's solution also solves the problem when you specify a `ddim_steps` larger than the number of DDPM timesteps: `c` in the original line will be 0, causing a `ValueError` from `range()`, whose third argument must not be zero.
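A quick sketch of that failure mode, again assuming the default 1000 DDPM timesteps:

```python
import numpy as np

num_ddpm_timesteps = 1000
num_ddim_timesteps = 2000  # more DDIM steps than DDPM steps
c = num_ddpm_timesteps // num_ddim_timesteps  # integer division gives 0

# Original line 49 crashes, since range() forbids a zero step:
try:
    np.asarray(list(range(0, num_ddpm_timesteps, c)))
except ValueError as e:
    print(e)  # range() arg 3 must not be zero

# The fixed line no longer raises (though every timestep is 0, so asking
# for more DDIM steps than DDPM steps is still questionable):
print((np.arange(0, num_ddim_timesteps) * c).astype(int)[:5])  # [0 0 0 0 0]
```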
There should be a merge request for this.
@patrickvonplaten unfortunately, your code sample results in the same bug:
```
Traceback (most recent call last):
  File "/home/vladimir/miniconda3/envs/SD/lib/python3.9/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 12, in <module>
  File "/home/vladimir/miniconda3/envs/SD/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/vladimir/miniconda3/envs/SD/lib/python3.9/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py", line 273, in __call__
    latents = self.scheduler.step(noise_pred, t, latents,
  File "/home/vladimir/miniconda3/envs/SD/lib/python3.9/site-packages/diffusers/schedulers/scheduling_pndm.py", line 202, in step
    return self.step_plms(model_output=model_output, timestep=timestep, sample=sample, return_dict=return_dict)
  File "/home/vladimir/miniconda3/envs/SD/lib/python3.9/site-packages/diffusers/schedulers/scheduling_pndm.py", line 317, in step_plms
    prev_sample = self._get_prev_sample(sample, timestep, prev_timestep, model_output)
  File "/home/vladimir/miniconda3/envs/SD/lib/python3.9/site-packages/diffusers/schedulers/scheduling_pndm.py", line 338, in _get_prev_sample
    alpha_prod_t = self.alphas_cumprod[timestep + 1 - self._offset]
IndexError: index 1000 is out of bounds for dimension 0 with size 1000
```
Hey @vvsotnikov, thanks for the message! Could you try updating your `diffusers` to `0.4.0.dev0`?
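(Since `0.4.0.dev0` is a development build, it presumably has to be installed from source; one typical way to do that, my assumption rather than something stated in this reply, is:)

```bash
# install the development version of diffusers from the main branch
pip install --upgrade git+https://github.com/huggingface/diffusers.git
```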
```python
# make sure you're logged in with `huggingface-cli login`
from torch import autocast
from diffusers import StableDiffusionPipeline
import diffusers

print("diffusers version", diffusers.__version__)

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    use_auth_token=True
).to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
with autocast("cuda"):
    image = pipe(prompt, num_inference_steps=27)["sample"][0]

image.save("astronaut_rides_horse.png")
```
This works for me with output:

```
diffusers version 0.4.0.dev0
```
There are certain values of the `ddim_steps` parameter for which the model crashes with the stack trace appended at the end. Example that leads to the crash:

```
python scripts/txt2img.py --prompt "tree" --ddim_steps 9 --n_samples 1
```
Stacktrace: