fastai / diffusion-nbs

Getting started with diffusion
Apache License 2.0
620 stars, 276 forks

Stable Diffusion Deep Dive fails at UNET and CFG with IndexError: index 51 is out of bounds for dimension 0 with size 51 #42

Closed aryamannaik closed 5 months ago

aryamannaik commented 11 months ago

Full error here:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In [63], line 19
     16 # Get the predicted x0:
     17 # latents_x0 = latents - sigma * noise_pred # Calculating ourselves
     18 print(noise_pred.shape, t, latents.shape)
---> 19 latents_x0 = scheduler.step(noise_pred, t, latents).pred_original_sample # Using the scheduler (Diffusers 0.4 and above)
     21 # compute the previous noisy sample x_t -> x_t-1
     22 latents = scheduler.step(noise_pred, t, latents).prev_sample

File /usr/local/lib/python3.9/dist-packages/diffusers/schedulers/scheduling_lms_discrete.py:405, in LMSDiscreteScheduler.step(self, model_output, timestep, sample, order, return_dict)
    403 # 3. Compute linear multistep coefficients
    404 order = min(self.step_index + 1, order)
--> 405 lms_coeffs = [self.get_lms_coefficient(order, self.step_index, curr_order) for curr_order in range(order)]
    407 # 4. Compute previous sample based on the derivatives path
    408 prev_sample = sample + sum(
    409     coeff * derivative for coeff, derivative in zip(lms_coeffs, reversed(self.derivatives))
    410 )

File /usr/local/lib/python3.9/dist-packages/diffusers/schedulers/scheduling_lms_discrete.py:405, in <listcomp>(.0)
    403 # 3. Compute linear multistep coefficients
    404 order = min(self.step_index + 1, order)
--> 405 lms_coeffs = [self.get_lms_coefficient(order, self.step_index, curr_order) for curr_order in range(order)]
    407 # 4. Compute previous sample based on the derivatives path
    408 prev_sample = sample + sum(
    409     coeff * derivative for coeff, derivative in zip(lms_coeffs, reversed(self.derivatives))
    410 )

File /usr/local/lib/python3.9/dist-packages/diffusers/schedulers/scheduling_lms_discrete.py:233, in LMSDiscreteScheduler.get_lms_coefficient(self, order, t, current_order)
    230         prod *= (tau - self.sigmas[t - k]) / (self.sigmas[t - current_order] - self.sigmas[t - k])
    231     return prod
--> 233 integrated_coeff = integrate.quad(lms_derivative, self.sigmas[t], self.sigmas[t + 1], epsrel=1e-4)[0]
    235 return integrated_coeff

IndexError: index 51 is out of bounds for dimension 0 with size 51

I tried playing around with the indices, but it seems to be a different issue. Moving to an older checkout doesn't fix it either.

johnowhitaker commented 11 months ago

Hmm, I can't reproduce this. If it's failing at the last step, you could stop a few steps early (if i == 45: break) and you'll still get to see the animation, even if it isn't ideal. Could you confirm that a fresh copy of the notebook fails here when you do 'Run all'?

venkyyuvy commented 11 months ago

Workaround: decrement the scheduler's step index after the update:

        latents = latents.detach() - cond_grad * sigma**2
        scheduler._step_index = scheduler._step_index - 1

Or uncomment the manual calculation and use it instead of the scheduler step:

# Get the predicted x0:
latents_x0 = latents - sigma * noise_pred
# latents_x0 = scheduler.step(noise_pred, t, latents).pred_original_sample

rohit901 commented 8 months ago

Yeah, I'm getting the same IndexError in the UNET and CFG sections, in the code that generates the video. I tried reducing num_inference_steps to 45, but the same error occurs, now saying index 46 is out of bounds.

@venkyyuvy could you explain exactly how to use the first part of your workaround? I don't think cond_grad is defined in the code [nvm, found it in the CFG section]. As for the uncommenting part, I feel that going through the API (step) would be better than computing it manually like that, right?

rohit901 commented 8 months ago

Okay, for those who are curious, I was able to fix it by making the change below. I think we were calling step twice in the same loop iteration, and it should be called once per iteration, right? Correct me if I'm wrong here @johnowhitaker.

Earlier code:

# Get the predicted x0:
latents_x0 = scheduler.step(noise_pred, t, latents).pred_original_sample # Using the scheduler (Diffusers 0.4 and above)

# compute the previous noisy sample x_t -> x_t-1
latents = scheduler.step(noise_pred, t, latents).prev_sample

Modified code (call step only once and use intermediate variable scheduler_output):

scheduler_output = scheduler.step(noise_pred, t, latents)

latents_x0 = scheduler_output.pred_original_sample # Using the scheduler (Diffusers 0.4 and above)

# compute the previous noisy sample x_t -> x_t-1
latents = scheduler_output.prev_sample
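
A toy sketch of why the single-call version avoids the error, assuming (as the `_step_index` workaround above suggests) that each `step` call advances an internal counter into a `sigmas` array with one more entry than there are inference steps. `ToyScheduler`, `ToyOutput`, and `run_loop` are hypothetical names for illustration only, not the diffusers API, and the update rule is a simplified Euler-style step rather than the LMS one.

```python
from dataclasses import dataclass


@dataclass
class ToyOutput:
    prev_sample: float
    pred_original_sample: float


class ToyScheduler:
    """Hypothetical stand-in for a stateful scheduler (not the diffusers API)."""

    def __init__(self, num_inference_steps: int):
        # One sigma per step plus a trailing 0.0, so len(sigmas) == steps + 1.
        self.sigmas = [float(num_inference_steps - i) for i in range(num_inference_steps)]
        self.sigmas.append(0.0)
        self.step_index = 0

    def step(self, noise_pred: float, sample: float) -> ToyOutput:
        sigma = self.sigmas[self.step_index]
        # Raises IndexError once the counter has run past the last step.
        sigma_next = self.sigmas[self.step_index + 1]
        pred_x0 = sample - sigma * noise_pred
        prev = sample + (sigma_next - sigma) * noise_pred
        self.step_index += 1  # every call advances the internal counter
        return ToyOutput(prev_sample=prev, pred_original_sample=pred_x0)


def run_loop(num_steps: int, call_twice: bool) -> int:
    """Run a sampling loop and return how many iterations completed."""
    scheduler = ToyScheduler(num_steps)
    latents = 1.0
    done = 0
    for _ in range(num_steps):
        noise_pred = 0.1  # placeholder for the UNet prediction
        try:
            if call_twice:
                # Pattern from the earlier code: two step() calls per iteration.
                _x0 = scheduler.step(noise_pred, latents).pred_original_sample
                latents = scheduler.step(noise_pred, latents).prev_sample
            else:
                # Pattern from the modified code: one call, reuse its output.
                out = scheduler.step(noise_pred, latents)
                _x0, latents = out.pred_original_sample, out.prev_sample
        except IndexError:
            break
        done += 1
    return done
```

In this toy model, the two-call loop with 50 steps stops after 25 iterations when it reaches index 51 of a 51-entry array, and with 45 steps it fails on index 46, which is consistent with the errors reported above. The one-call loop completes all iterations.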