mihirp1998 / AlignProp

AlignProp uses direct reward backpropagation to align large-scale text-to-image diffusion models. Our method is 25x more sample- and compute-efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion.
https://align-prop.github.io/
MIT License

selecting the time step for gradient truncation #15

Closed daewon88 closed 3 months ago

daewon88 commented 6 months ago

Hi! Thank you for sharing your valuable work.

In the code, the timestep for gradient truncation is selected at every denoising step (lines 483-485). However, if the intention is to sample the truncation timestep from U(0,50), this approach has a problem: because a fresh timestep is drawn at each denoising step, the effective truncation point is no longer uniformly distributed. I therefore suggest selecting the truncation timestep once before each sampling pass, rather than at each denoising step. Do you have a particular reason for this choice?

Thank you :)

if config.truncated_backprop:
    if config.truncated_backprop_rand:
        # NOTE: this runs inside the denoising loop, so a new truncation
        # timestep is drawn at every denoising step i
        timestep = random.randint(config.truncated_backprop_minmax[0],
                                  config.truncated_backprop_minmax[1])
        if i < timestep:
            noise_pred_uncond = noise_pred_uncond.detach()
            noise_pred_cond = noise_pred_cond.detach()
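To illustrate the concern, here is a small Monte Carlo sketch (not from the repo) that mirrors the quoted loop, assuming 50 denoising steps and the `(0, 50)` range. Since the gradient cannot flow back past the last step that detaches its predictions, the effective truncation point is the last detaching step, and resampling the timestep every step pushes that point far above the U(0,50) mean of ~25:

```python
import random

NUM_STEPS = 50    # assumed number of denoising steps, i = 0..49
MINMAX = (0, 50)  # mirrors config.truncated_backprop_minmax

def effective_truncation_point():
    """One sampling pass of the quoted loop: a fresh truncation timestep
    is drawn at every denoising step, and the gradient cannot flow back
    past the LAST step that detaches its predictions."""
    last_detach = -1  # -1 means no step detached (full backprop)
    for i in range(NUM_STEPS):
        timestep = random.randint(*MINMAX)  # new draw every step
        if i < timestep:
            last_detach = i
    return last_detach

random.seed(0)
samples = [effective_truncation_point() for _ in range(100_000)]
mean = sum(samples) / len(samples)
# If the timestep were drawn once per pass, the mean cut point would be
# ~25; with per-step redraws it concentrates near the end of the trajectory.
print(f"mean effective truncation step: {mean:.1f}")
```

Running this gives a mean around 41-42 rather than 25, which is the distribution shift being discussed in this thread.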
ajaysub110 commented 3 months ago

I had the same question while going through the code. @mihirp1998 do you have a particular reason for choosing this approach over what @daewon88 suggests? Thanks!

mihirp1998 commented 3 months ago

Thanks for catching this bug!

You're right: the current code is not sampling from U(0,50). Because a new timestep is drawn at every denoising step, the effective truncation point instead follows a roughly Gaussian-shaped distribution centered at 42. I haven't ablated this against U(0,50), but once I do I'll add it as an option in the code.
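For reference, a minimal sketch of the per-pass alternative daewon88 proposes, with the denoising computation stubbed out; `NUM_STEPS`, the range, and all names here are assumptions based on the snippet above, not the repo's actual code:

```python
import random

NUM_STEPS = 50    # assumed number of denoising steps
MINMAX = (0, 50)  # mirrors config.truncated_backprop_minmax

def per_pass_truncation_mask():
    """Draw the truncation timestep ONCE, before the denoising loop, so
    the cut point is genuinely uniform over the configured range.
    Returns a boolean list: True marks steps whose noise predictions
    would be detached in the real training loop."""
    timestep = random.randint(*MINMAX)  # single draw per sampling pass
    return [i < timestep for i in range(NUM_STEPS)]

random.seed(0)
mask = per_pass_truncation_mask()
# All detached steps precede all non-detached steps, and the number of
# detached steps equals the sampled timestep (capped at NUM_STEPS).
print(f"{sum(mask)} of {NUM_STEPS} steps detached")
```

The only structural change is hoisting the `random.randint` call out of the denoising loop; the detach logic at each step stays the same.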

[Attached image: Figure_1]