pytorch / opacus

Training PyTorch models with differential privacy
https://opacus.ai
Apache License 2.0
1.65k stars, 328 forks

IndexError: pop from empty list #624

Closed: shanjin2014 closed this issue 1 month ago

shanjin2014 commented 5 months ago

🐛 Bug

I am trying to replicate the idea in a paper: first, set only the last fc layer of the model to requires_grad = True and freeze all other layers. Then, after computing the loss, call loss.backward() to obtain the (aggregated, noise-added) gradients for the fc layer only. From those gradients, build a system of linear equations to estimate the gradient of the loss with respect to the logits, and finally use that estimated gradient to update the whole model.
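Concretely (in my notation, matching the code below): if $X$ is the batch input to the fc layer and $\tilde{X} = [X \;\; \mathbf{1}]$ its bias-augmented version, then

$$Z = \tilde{X} W_{\mathrm{aug}}^{\top}, \qquad \frac{\partial L}{\partial W_{\mathrm{aug}}} = \left(\frac{\partial L}{\partial Z}\right)^{\top} \tilde{X} \quad\Longrightarrow\quad \tilde{X}^{\top} \frac{\partial L}{\partial Z} = \left(\frac{\partial L}{\partial W_{\mathrm{aug}}}\right)^{\top},$$

so a least-squares solve against the (noisy) aggregated fc gradient gives an estimate of $\partial L / \partial Z$.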

I have tested the same idea in plain PyTorch (solving the linear equations without adding noise to the fc layer), and it works. But when I use Opacus to add noise to the fc layer, I get the error above. Is this a bug or something else? Is it caused by the multiple backward passes?

The error is raised at the line logits.backward(dLdZ).

Please reproduce using our template Colab and post the link here

To Reproduce

:warning: We cannot help you without you sharing reproducible code. Do not ignore this part :)

Steps to reproduce the behavior:

inputs, labels = data
optimizer.zero_grad()

logits, inoutput = model(inputs)  # inoutput is the intermediate output before the last layer of the model
loss = criterion(logits, labels)

loss.backward(retain_graph=True)

fc_params = model.fc.parameters()
grads_fc = []
for param in fc_params:
    if param.grad is not None:
        if len(param.grad.shape) > 1:
            fc_grad = param.grad.view(param.grad.size(0), -1)  # weight grad: (out_features, in_features)
        else:
            fc_grad = param.grad.unsqueeze(1)                  # bias grad as a column
        grads_fc.append(fc_grad)

dLdW = torch.cat(grads_fc, dim=1)                    # (out_features, in_features + 1)
dZdB = torch.ones((inoutput.size(0), 1)).to(device)  # ones column for the bias term
dZdW = torch.cat((inoutput, dZdB), dim=1)            # bias-augmented input to the fc layer

A = dZdW.t()
B = dLdW.t()
dLdZ = torch.linalg.lstsq(A, B).solution  # estimate of dL/d(logits)

logits.backward(dLdZ)  # <- the IndexError is raised here
optimizer.step()
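For reference, here is a minimal self-contained sketch that seems to trigger the same error (this is not my training code; I use GradSampleModule directly as a stand-in for whatever PrivacyEngine.make_private wraps the model in). The second backward pass reaches Opacus' backward hooks without a fresh forward pass, so the stored activations have already been consumed:

import torch
import torch.nn as nn
from opacus import GradSampleModule

model = GradSampleModule(nn.Linear(4, 2))   # installs per-sample-gradient hooks
x = torch.randn(8, 4)
logits = model(x)                           # forward hook records the layer input
loss = logits.sum()
loss.backward(retain_graph=True)            # backward hook pops the recorded input
logits.backward(torch.ones_like(logits))    # no new forward: IndexError: pop from empty list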

Environment

Please copy and paste the output from our environment collection script (or fill out the checklist below manually).

You can get the script and run it with:

wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py

Additional context

HuanyuZhang commented 3 months ago

It is not very common for Opacus to deal with such complex operations. A quick question: what is the reason for using Opacus here, given that you do not need access to per_sample_gradient? How about directly adding noise to the aggregated gradient instead?
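Something along these lines would avoid the hooks entirely (a rough sketch, not a vetted DP mechanism: max_grad_norm and noise_multiplier are placeholder hyperparameters, and clipping the aggregated gradient is not equivalent to Opacus' per-sample clipping, so the privacy accounting differs):

import torch

def noise_fc_grads(fc, max_grad_norm=1.0, noise_multiplier=1.0):
    # Call after loss.backward() and before the second backward / optimizer.step(),
    # on an unwrapped model, so logits.backward(dLdZ) never hits per-sample hooks.
    for param in fc.parameters():
        if param.grad is None:
            continue
        clip_coef = min(1.0, max_grad_norm / (param.grad.norm(2).item() + 1e-6))
        param.grad.mul_(clip_coef)  # clip the aggregated gradient
        param.grad.add_(torch.randn_like(param.grad) * noise_multiplier * max_grad_norm)  # Gaussian noise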