pytorch / opacus

Training PyTorch models with differential privacy
https://opacus.ai
Apache License 2.0

Error for capture_activations_hook in grad_sample_module.py #608

Closed conjurer-Fan-Wu closed 7 months ago

conjurer-Fan-Wu commented 8 months ago

🐛 Bug

The error seems to be in the Opacus library:

File ~/.local/lib/python3.10/site-packages/opacus/grad_sample/grad_sample_module.py:288 in capture_activations_hook
    p._forward_counter += 1
AttributeError: 'Parameter' object has no attribute '_forward_counter'

Please reproduce using our template Colab and post here the link

I use Google Drive to store the files. federated_main.py is the main file, which I run with Spyder. All .py files are in the src_v3 folder.

https://drive.google.com/drive/folders/1inWFXO0fPoKygi8rJSzUcJLr-jFVoLxb?usp=sharing

To Reproduce

:warning: We cannot help you without you sharing reproducible code. Do not ignore this part :)

Steps to reproduce the behavior:

  1. Run federated_main directly

Traceback (most recent call last):
  File /usr/local/lib/python3.10/dist-packages/spyder_kernels/py3compat.py:356 in compat_exec
    exec(code, globals, locals)
  File ~/work/pyproject/basictest/FL_testmine/src_v3/federated_main.py:237
    model0, optimizer0, train_loader = privacy_engine.make_private(
TypeError: PrivacyEngine.make_private() missing 1 required keyword-only argument: 'data_loader'

runfile('/home/fanwu/work/pyproject/basictest/FL_testmine/src_v3/federated_main.py', wdir='/home/fanwu/work/pyproject/basictest/FL_testmine/src_v3')
Reloaded modules: options, update, models, sampling, utils

Experimental details:
    Model : cnn
    Optimizer : sgd
    Learning : 0.01
    Global Rounds : 2

Federated parameters:
IID
Fraction of users  : 0.9
Local Batch size   : 64
Local Epochs       : 5

global model: CNNMnist(
  (conv1): Conv2d(1, 16, kernel_size=(8, 8), stride=(2, 2), padding=(3, 3))
  (conv2): Conv2d(16, 32, kernel_size=(4, 4), stride=(2, 2))
  (fc1): Linear(in_features=512, out_features=32, bias=True)
  (fc2): Linear(in_features=32, out_features=10, bias=True)
)

| Global Training Round : 1 |

/home/fanwu/work/pyproject/basictest/FL_testmine/src_v3/update.py:25: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  return torch.tensor(image), torch.tensor(label)

Traceback (most recent call last):
  File /usr/local/lib/python3.10/dist-packages/spyder_kernels/py3compat.py:356 in compat_exec
    exec(code, globals, locals)
  File ~/work/pyproject/basictest/FL_testmine/src_v3/federated_main.py:245
    w, loss, epsilon_idx = local_model.update_weights(args=args,
  File ~/work/pyproject/basictest/FL_testmine/src_v3/update.py:79 in update_weights
    log_probs = model(images)
  File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1518 in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1568 in _call_impl
    result = forward_call(*args, **kwargs)
  File ~/.local/lib/python3.10/site-packages/opacus/grad_sample/grad_sample_module.py:148 in forward
    return self._module(*args, **kwargs)
  File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1518 in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1527 in _call_impl
    return forward_call(*args, **kwargs)
  File ~/work/pyproject/basictest/FL_testmine/src_v3/models.py:49 in forward
    x = F.relu(self.conv1(x))  # -> [B, 16, 14, 14]
  File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1518 in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1581 in _call_impl
    hook_result = hook(self, args, result)
  File ~/.local/lib/python3.10/site-packages/opacus/grad_sample/grad_sample_module.py:288 in capture_activations_hook
    p._forward_counter += 1
AttributeError: 'Parameter' object has no attribute '_forward_counter'

Expected behavior

The program should at least run normally.

Environment

Please copy and paste the output from our environment collection script (or fill out the checklist below manually).

You can get the script and run it with:


wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
By the way, I use Ubuntu 22.04 with Python 3.10.12 and Opacus 1.4.0.

[pip3] flake8==6.0.0
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.24.3
[pip3] numpydoc==1.5.0
[pip3] torch==2.1.0
[pip3] torchinfo==1.8.0
[pip3] torchvision==0.15.2
[pip3] triton==2.1.0
[conda] Could not collect

HuanyuZhang commented 8 months ago

It seems your model has not been successfully wrapped by make_private, so _forward_counter was never defined (https://github.com/pytorch/opacus/blob/95df0904ae5d2b3aaa26b708e5067e9271624036/opacus/grad_sample/gsm_base.py#L67). Furthermore, the earlier error message, "TypeError: PrivacyEngine.make_private() missing 1 required keyword-only argument: 'data_loader'", is likely the reason the wrapping failed. Could you please fix that part first? Thanks!
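
For reference, a minimal sketch of a complete make_private call, using the variable names from the traceback above (the noise_multiplier and max_grad_norm values are placeholders, not taken from the actual code):

```python
from opacus import PrivacyEngine

privacy_engine = PrivacyEngine()
# All make_private arguments are keyword-only, so data_loader must be
# passed by name; omitting it raises the TypeError seen above.
model0, optimizer0, train_loader = privacy_engine.make_private(
    module=model0,
    optimizer=optimizer0,
    data_loader=train_loader,
    noise_multiplier=1.1,  # placeholder value
    max_grad_norm=1.0,     # placeholder value
)
```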

conjurer-Fan-Wu commented 7 months ago

I have tested the code again and fixed the data_loader problem, but the error above still occurs.

##################################

runfile('/home/fanwu/work/pyproject/basictest/FL_testmine/src_v3/federated_main.py', wdir='/home/fanwu/work/pyproject/basictest/FL_testmine/src_v3')

Experimental details:
    Model : cnn
    Optimizer : sgd
    Learning : 0.01
    Global Rounds : 2

Federated parameters:
IID
Fraction of users  : 0.9
Local Batch size   : 64
Local Epochs       : 5

global model: CNNMnist(
  (conv1): Conv2d(1, 16, kernel_size=(8, 8), stride=(2, 2), padding=(3, 3))
  (conv2): Conv2d(16, 32, kernel_size=(4, 4), stride=(2, 2))
  (fc1): Linear(in_features=512, out_features=32, bias=True)
  (fc2): Linear(in_features=32, out_features=10, bias=True)
)

| Global Training Round : 2 |

/home/fanwu/.local/lib/python3.10/site-packages/opacus/privacy_engine.py:142: UserWarning: Secure RNG turned off. This is perfectly fine for experimentation as it allows for much faster training performance, but remember to turn it on and retrain one last time before production with secure_mode turned on.
  warnings.warn(
/home/fanwu/work/pyproject/basictest/FL_testmine/src_v3/update.py:25: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  return torch.tensor(image), torch.tensor(label)

Traceback (most recent call last):
  File ~/.local/lib/python3.10/site-packages/spyder_kernels/py3compat.py:356 in compat_exec
    exec(code, globals, locals)
  File ~/work/pyproject/basictest/FL_testmine/src_v3/federated_main.py:244
    w, loss, epsilon_idx = local_model.update_weights(args=args,
  File ~/work/pyproject/basictest/FL_testmine/src_v3/update.py:79 in update_weights
    log_probs = model(images)
  File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1518 in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1568 in _call_impl
    result = forward_call(*args, **kwargs)
  File ~/.local/lib/python3.10/site-packages/opacus/grad_sample/grad_sample_module.py:148 in forward
    return self._module(*args, **kwargs)
  File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1518 in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1527 in _call_impl
    return forward_call(*args, **kwargs)
  File ~/work/pyproject/basictest/FL_testmine/src_v3/models.py:49 in forward
    x = F.relu(self.conv1(x))  # -> [B, 16, 14, 14]
  File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1518 in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1581 in _call_impl
    hook_result = hook(self, args, result)
  File ~/.local/lib/python3.10/site-packages/opacus/grad_sample/grad_sample_module.py:288 in capture_activations_hook
    p._forward_counter += 1
AttributeError: 'Parameter' object has no attribute '_forward_counter'

HuanyuZhang commented 7 months ago

This thread (https://discuss.pytorch.org/t/error-when-trying-federated-learning-with-opacus/153049/2) should solve this issue. Please let me know whether it works :)

conjurer-Fan-Wu commented 7 months ago

No, that does not work.

Traceback (most recent call last):
  File ~/.local/lib/python3.10/site-packages/spyder_kernels/py3compat.py:356 in compat_exec
    exec(code, globals, locals)
  File ~/work/pyproject/basictest/FL_testmine/src_v3/federated_main.py:246
    w, loss, epsilon_idx = local_model.update_weights(args=args,
  File ~/work/pyproject/basictest/FL_testmine/src_v3/update.py:76 in update_weights
    model = GradSampleModule(model)
  File ~/.local/lib/python3.10/site-packages/opacus/grad_sample/grad_sample_module.py:141 in __init__
    self.add_hooks(
  File ~/.local/lib/python3.10/site-packages/opacus/grad_sample/grad_sample_module.py:191 in add_hooks
    raise ValueError("Trying to add hooks twice to the same model")
ValueError: Trying to add hooks twice to the same model

HuanyuZhang commented 7 months ago

Could you link the latest code? (I did not find it in your drive.) From the error, it seems you are trying to privatize a model that has already been privatized. Possibly you forgot to un-privatize the model at the end of client training (self.model = model.to_standard_module()).

conjurer-Fan-Wu commented 7 months ago

Sorry for the late file update. The files are now in Google Drive: https://drive.google.com/drive/folders/1hxmZZzZtKZ78ohYmHx41OC_DugFm0Zv1

I changed update.py, adding model = GradSampleModule(model) in the update_weights function before training begins, and the error above happens; at the very least that change should have made some difference. To tell the truth, my code is based on FedAvg (https://github.com/AshwinRJ/Federated-Learning-PyTorch), and I feel its structure is very different from the Opacus examples. I have tried for several days and all of my changes failed.

HuanyuZhang commented 7 months ago

Any reason not to call model.to_standard_module(), as suggested by https://discuss.pytorch.org/t/error-when-trying-federated-learning-with-opacus/153049/2? Note that this call reverts the privatized model to a non-private model, avoiding privatizing the same model twice.

As I mentioned, the reason you see this hook error is that you are privatizing the same model twice, thus adding the same hooks twice.

My suggestion is as follows:

  1. Remove privacy_engine.make_private from federated_main.py and move it to update.py.
  2. Remove GradSampleModule from update.py.
  3. In update.py, instead of "return model.state_dict()", use "return model.to_standard_module().state_dict()".

Generally speaking, what you need to do is:

  1. On the server side, keep only non-private models. This gives you the freedom to change model parameters by aggregation.
  2. On the client side, the client first receives the non-private model, then calls the privacy engine to privatize it and runs DP-SGD. Finally, the client returns the model parameters of the non-private model (see the sketch below).
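
A minimal sketch of this client-side flow, assuming FedAvg-style names (update_weights, args.delta, and args match the tracebacks in this thread; args.local_ep, args.lr, and the noise/clipping values are assumptions, not taken from the actual code):

```python
import torch
from opacus import PrivacyEngine

def update_weights(model, train_loader, args):
    # The client receives the plain (non-private) global model from the server.
    criterion = torch.nn.NLLLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=args.lr, momentum=0)

    # Privatize on the client, once per round, never on the server.
    privacy_engine = PrivacyEngine()
    model, optimizer, train_loader = privacy_engine.make_private(
        module=model,
        optimizer=optimizer,
        data_loader=train_loader,
        noise_multiplier=1.1,  # assumed value
        max_grad_norm=1.0,     # assumed value
    )

    for _ in range(args.local_ep):
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

    epsilon = privacy_engine.accountant.get_epsilon(delta=args.delta)
    # Revert before returning, so the server only ever sees plain weights.
    return model.to_standard_module().state_dict(), loss.item(), epsilon
```
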
conjurer-Fan-Wu commented 7 months ago

Thanks for your kind response. I think I understand the architecture a little better now. I modified the code according to your help (https://drive.google.com/drive/folders/1hxmZZzZtKZ78ohYmHx41OC_DugFm0Zv1). However, a new problem happens. I cannot find any difference from the example on GitHub: the model and optimizer construction is the same as in the example, but the error still occurs.

Traceback (most recent call last):
  File ~/.local/lib/python3.10/site-packages/spyder_kernels/py3compat.py:356 in compat_exec
    exec(code, globals, locals)
  File ~/work/pyproject/basictest/FL_testmine/src_v3/federated_main.py:235
    w, loss, epsilon_idx = local_model.update_weights(args=args,
  File ~/work/pyproject/basictest/FL_testmine/src_v3/update.py:67 in update_weights
    model, optimizer, train_loader = privacy_engine.make_private(
  File ~/.local/lib/python3.10/site-packages/opacus/privacy_engine.py:393 in make_private
    raise ValueError(
ValueError: Module parameters are different than optimizer Parameters

HuanyuZhang commented 7 months ago

Maybe you can define a new optimizer in update.py instead of re-using the existing one. One example is optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0, weight_decay=0) in FederatedLearningClient.py from https://discuss.pytorch.org/t/error-when-trying-federated-learning-with-opacus/153049
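
Continuing the earlier client-side sketch, this is the ordering that avoids the ValueError (global_weights is a hypothetical name for the aggregated state dict received from the server):

```python
# Load the server's weights into the local model first, then build a fresh
# optimizer over exactly those parameters: make_private checks that the
# optimizer's parameters are the same objects as the module's parameters.
model.load_state_dict(global_weights)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0, weight_decay=0)
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=1.1,  # assumed value
    max_grad_norm=1.0,     # assumed value
)
```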

conjurer-Fan-Wu commented 7 months ago

Thanks for your response. I have changed the code as you said. However, there is still a problem coming from the Opacus library:

Traceback (most recent call last):
  File ~/.local/lib/python3.10/site-packages/spyder_kernels/py3compat.py:356 in compat_exec
    exec(code, globals, locals)
  File ~/work/pyproject/basictest/FL_testmine/src_v3/federated_main.py:235
    w, loss, epsilon_idx = local_model.update_weights(args=args,
  File ~/work/pyproject/basictest/FL_testmine/src_v3/update.py:80 in update_weights
    epsilon = privacy_engine.accountant.get_epsilon(delta=args.delta)
  File ~/.local/lib/python3.10/site-packages/opacus/accountants/prv.py:97 in get_epsilon
    dprv = self._get_dprv(eps_error=eps_error, delta_error=delta_error)
  File ~/.local/lib/python3.10/site-packages/opacus/accountants/prv.py:114 in _get_dprv
    domain = self._get_domain(
  File ~/.local/lib/python3.10/site-packages/opacus/accountants/prv.py:150 in _get_domain
    return Domain.create_aligned(-L, L, mesh_size)
  File ~/.local/lib/python3.10/site-packages/opacus/accountants/analysis/prv/domain.py:31 in create_aligned
    size = int(np.round((t_max - t_min) / dt)) + 1
ValueError: cannot convert float NaN to integer

HuanyuZhang commented 7 months ago

What delta value are you using? It is possible the delta is too small; for the PRV accountant, we only support delta > 1e-6.

Another potential fix is to move privacy_engine.accountant.get_epsilon to the end of the loop. This avoids the case where, in the first iteration, the accountant fetches epsilon before the model has been updated.
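
A fragment of that second fix, continuing the client-side sketch above (the loop structure and the delta value are assumptions for illustration):

```python
for epoch in range(local_epochs):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

# Query the accountant only after at least one optimizer step has been
# recorded, and keep delta above 1e-6 when using the PRV accountant.
epsilon = privacy_engine.accountant.get_epsilon(delta=1e-5)
```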

conjurer-Fan-Wu commented 7 months ago

Thanks for your patient help. I modified the code according to your suggestion and moved privacy_engine.accountant.get_epsilon to the end of the loop. (https://drive.google.com/drive/folders/1hxmZZzZtKZ78ohYmHx41OC_DugFm0Zv1)

All the parameter values are the same as in the Opacus MNIST example, but when the program runs, the loss in each epoch quickly becomes negative, without any convergence. I have checked the whole process again, but I do not know why this happens. I tried changing lr to 0.05 or 0.01, but neither helped.

HuanyuZhang commented 7 months ago

There are many possibilities for a loss to be negative. For example, the input of NLLLoss should be log-probabilities, i.e. the output of log_softmax, which is always <= 0 (https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html); feeding it raw probabilities in (0, 1) makes the loss negative. That said, from reading your model setup, this might not be your case.
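
For instance (a fragment, with model, images, and labels as in the earlier sketch):

```python
import torch.nn.functional as F

logits = model(images)                    # raw scores from the last Linear layer
log_probs = F.log_softmax(logits, dim=1)  # log-probabilities, all <= 0
loss = F.nll_loss(log_probs, labels)      # -log_probs[target], always >= 0

# Feeding probabilities instead of log-probabilities makes the loss
# -probs[target], which is negative and will not converge sensibly:
probs = F.softmax(logits, dim=1)
bad_loss = F.nll_loss(probs, labels)      # negative
```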

Since the original error was not a bug in Opacus, and we are moving away from Opacus-specific topics, I will close the issue.