woodenchild95 / FL-Simulator

Pytorch implementations of some general federated optimization methods.

about FedProx #12

Closed KevinM1ao closed 7 months ago

KevinM1ao commented 7 months ago

Hi, I have a question about the implementation of your FedProx algorithm; could you please clarify it?


In the paper, the loss function includes an L2 regularization term between the current local model and the global model, but it seems that this L2 term is not what your code computes: `torch.sum(param_list * delta_list, -1)`.

woodenchild95 commented 7 months ago


Hi @KevinM1ao, the local problem is defined as $\min_w h(w) = f(w) + \frac{\mu}{2}\|w - w^t\|^2$. Expanding the square and dropping the constant $\frac{\mu}{2}\|w^t\|^2$ (which does not depend on $w$), this is equivalent to $\min_w f(w) + \frac{\mu}{2}\|w\|^2 - \mu\langle w, w^t\rangle$. After this reconstruction, $\frac{\mu}{2}\|w\|^2$ is an ordinary L2 regularization loss and $-\mu\langle w, w^t\rangle$ is the loss constructed in the code you quoted. The first term is handled by the weight decay of the local optimizer, and the second term is added explicitly. Alternatively, you could implement a custom optimizer class that performs the update $w \leftarrow w - \eta\,(g + \mu(w - w^t))$ directly. The two formulations are equivalent.
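To make the equivalence concrete, here is a minimal, self-contained sketch (not the repository's actual code; `f` is a toy stand-in for the local empirical loss) checking that the reconstructed loss $f(w) + \frac{\mu}{2}\|w\|^2 - \mu\langle w, w^t\rangle$ yields the same SGD step as the direct proximal update $w \leftarrow w - \eta\,(g + \mu(w - w^t))$:

```python
import torch

torch.manual_seed(0)
mu, eta = 0.1, 0.01
w_t = torch.randn(5)                      # global model w^t, fixed during local training
w = torch.randn(5, requires_grad=True)    # current local model w

def f(x):
    # Toy stand-in for the local empirical loss f(w); any differentiable loss works.
    return (x ** 2).sum()

# Variant 1: reconstructed loss f(w) + (mu/2)||w||^2 - mu * <w, w^t>.
# In practice the quadratic term would come from the optimizer's weight_decay=mu.
loss = f(w) + 0.5 * mu * (w ** 2).sum() - mu * torch.dot(w, w_t)
g1 = torch.autograd.grad(loss, w)[0]
w1 = w.detach() - eta * g1

# Variant 2: direct proximal update w - eta * (g + mu * (w - w^t)),
# using only the gradient g of the plain local loss f.
g = torch.autograd.grad(f(w), w)[0]
w2 = w.detach() - eta * (g + mu * (w.detach() - w_t))

print(torch.allclose(w1, w2))
```

Both variants apply the same gradient $g + \mu(w - w^t)$, since differentiating the reconstructed loss gives $g + \mu w - \mu w^t$; the dropped constant $\frac{\mu}{2}\|w^t\|^2$ has zero gradient and does not affect the update.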

KevinM1ao commented 7 months ago

Thanks so much!