davda54 / sam

SAM: Sharpness-Aware Minimization (PyTorch)

Using the step function with closure #90

Open mathuryash5 opened 11 months ago

mathuryash5 commented 11 months ago

Hello,

I am trying to use the step function (with the transformers and accelerate libraries) while passing a closure.

The step function is decorated with @torch.no_grad(), so the closure has to be wrapped with enable_grad for gradients to be computed. How does the second call to closure() work? When I try it, I get the following error, which sort of makes sense given that gradients are not being computed: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
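For reference, my understanding of the relevant part of step (paraphrased from sam.py in this repo; the exact wording may differ):

@torch.no_grad()
def step(self, closure=None):
    assert closure is not None, "SAM requires a closure, but it was not provided"
    # wrapping re-enables gradients inside the closure,
    # despite the @torch.no_grad() decorator on step itself
    closure = torch.enable_grad()(closure)

    self.first_step(zero_grad=True)
    closure()  # second forward-backward pass at the perturbed weights
    self.second_step()

So the second call to closure() should compute gradients, because the wrapped closure runs under torch.enable_grad().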

Here is the closure function I use:

def closure():
    tmp_output = model(**batch)
    tmp_loss = tmp_output.loss
    # scale the loss for gradient accumulation
    tmp_loss = tmp_loss / args.gradient_accumulation_steps
    # let accelerate handle the backward pass
    accelerator.backward(tmp_loss)
    return tmp_loss
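
For comparison, here is the closure pattern from the README (plain PyTorch, no accelerate); criterion, inputs, and targets stand in for the actual pipeline. The forward-backward pass happens once outside the closure and once inside it, and the closure returns the loss:

def closure():
    # full forward-backward pass, re-run at the perturbed weights
    loss = criterion(model(inputs), targets)
    loss.backward()
    return loss

loss = criterion(model(inputs), targets)
loss.backward()
optimizer.step(closure)
optimizer.zero_grad()

My question is whether accelerator.backward() behaves the same way here as a plain loss.backward() when the closure is re-invoked inside step.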
stale[bot] commented 8 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.