JedMills / MTFL-For-Personalised-DNNs

Code for 'Multi-Task Federated Learning for Personalised Deep Neural Networks in Edge Computing', published in IEEE TPDS.

Pytorch version #8

Closed MayarAlfares closed 2 years ago

MayarAlfares commented 2 years ago

Hi Jed,

When using the current torch version, the following error occurs, but only with the algorithms that use Adam.

File ".../fl_algs.py", line 213, in run_fedavg round_opt_agg = round_opt_agg + (client_opt.get_params() * w) File ".../models.py", line 498, in add return self._op(other, operator.add) File ".../models.py", line 475, in _op new_params = [f(p, o) for (p, o) in zip(self.params, other.params)] File ".../models.py", line 475, in new_params = [f(p, o) for (p, o) in zip(self.params, other.params)] TypeError: Concatenation operation is not implemented for NumPy arrays, use np.concatenate() instead. Please do not rely on this error; it may not be given on all Python implementations.

Any idea?

MayarAlfares commented 2 years ago

Solution: round_opt_agg = (client_opt.get_params() * w) + round_opt_agg
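
For anyone else hitting this: the swap seems to matter because a + b and b + a can dispatch to different __add__/__radd__ methods when the operands have different types, so putting the params object on the left keeps its own __add__ in charge of the addition instead of falling through to NumPy's concatenation path. A rough sketch of the idea (ParamList below is a hypothetical stand-in, not the actual class in models.py):

import numpy as np

class ParamList:
    # Hypothetical container for a list of NumPy parameter arrays.
    def __init__(self, params):
        self.params = params

    def __mul__(self, w):
        return ParamList([p * w for p in self.params])

    def __add__(self, other):
        # Knows how to add another ParamList; returns itself unchanged
        # when the running aggregate is still the initial 0.
        if isinstance(other, ParamList):
            return ParamList([p + o for p, o in zip(self.params, other.params)])
        return self

round_agg = 0
for w in (0.5, 0.5):
    client_params = ParamList([np.ones(3)])
    # Wrapper on the left: ParamList.__add__ runs first,
    # regardless of what round_agg currently is.
    round_agg = (client_params * w) + round_agg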

JedMills commented 2 years ago

Hi Mayar,

Thanks for raising the issue. Please could you post the argument string that you used to generate the error?

What version of PyTorch are you using? The code was developed with 1.7.0, and I'm currently using 1.8.2, also with no problems.
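
If it helps for checking, the installed version can be printed with:

python -c "import torch; print(torch.__version__)"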

Jed

MayarAlfares commented 2 years ago

I'm using the current latest version, 1.12.

The argument string is the fedavg-adam example from the README:

python main.py -dset cifar10 -alg fedavg-adam -C 0.5 -B 20 -T 500 -E 1 -device gpu -W 400 -seed 0 -lr 0.003 -noisy_frac 0.0 -bn_private usyb -beta1 0.9 -beta2 0.999 -epsilon 1e-7

Another issue arises, though:

File ".../models.py", line 191, in train_step
    self.optim.step()
File ".../optimizer.py", line 113, in wrapper
    return func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/torch/optim/adam.py", line 171, in step
    capturable=group['capturable'])
File "/usr/local/lib/python3.7/dist-packages/torch/optim/adam.py", line 199, in adam
    raise RuntimeError("API has changed, state_steps argument must contain a list of singleton tensors")
RuntimeError: API has changed, state_steps argument must contain a list of singleton tensors
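
From the traceback, this looks like the torch >= 1.12 change where the functional Adam implementation expects each per-parameter 'step' state entry to be a singleton tensor rather than a Python number. If anyone prefers patching over downgrading, the kind of change needed is roughly the following (hypothetical sketch; the actual state layout in this repo's optimizer.py may differ):

import torch

state = {}                         # per-parameter optimiser state (hypothetical layout)
state['step'] = torch.tensor(0.)   # torch >= 1.12 expects a singleton tensor here, not 0
state['step'] += 1                 # incrementing a tensor works the same as an int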

JedMills commented 2 years ago

Hi Mayar,

I've tried running the code with the arguments you posted and haven't had any problems. Could you try downgrading to pytorch 1.7.0 or 1.8.2 and see if the problem persists?
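
For reference, pinning to one of those versions should just be something like the following (CPU build from PyPI; a CUDA build may need the matching wheel from pytorch.org instead):

pip install torch==1.7.0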

Jed

MayarAlfares commented 2 years ago

Hi Jed,

This problem only appears with the later PyTorch versions.

JedMills commented 2 years ago

Hi Mayar,

Thanks for letting me know. In which case, my advice would be to continue using the older version.

Warm regards, Jed