woodenchild95 / FL-Simulator

PyTorch implementations of some general federated optimization methods.

Confused by the aggregation method #11

Closed: xxdznl closed this issue 5 months ago

xxdznl commented 5 months ago

Thanks for your open-source work, @woodenchild95! But I am very confused by the aggregation method. To my knowledge, the most commonly used aggregation is either "aggregate only the updates", like `self.server_model_params_list + self.args.global_learning_rate * Averaged_update`, or "aggregate only the parameters", i.e. just `Averaged_model`. I don't understand why the code does both in `Averaged_model + torch.mean(self.h_params_list, dim=0)`. Is there a reasonable explanation for this? Or is it the ADMM method of aggregation? (Sorry, I am not familiar with ADMM.)
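For concreteness, the two standard forms I mean look roughly like this (a minimal sketch; every name is an illustrative placeholder, not this repository's actual interface):

```python
import torch

def aggregate_parameters(client_models):
    # "Aggregate parameters only": the new global model is the plain
    # average of the received client models.
    return torch.stack(client_models).mean(dim=0)

def aggregate_updates(server_model, client_updates, global_lr):
    # "Aggregate updates only": keep the server model and apply the
    # averaged client update, scaled by a global learning rate.
    return server_model + global_lr * torch.stack(client_updates).mean(dim=0)
```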

woodenchild95 commented 5 months ago

> I don't understand why the code does both in `Averaged_model + torch.mean(self.h_params_list, dim=0)`. Or is it the ADMM method of aggregation?

@xxdznl Hi, the highlighted code performs the aggregation of the dual variables in the ADMM method, a classic optimization algorithm. The dual variable is an auxiliary variable used to construct a specific local Lagrangian objective, which lets the method solve constrained optimization problems effectively. More details can be found in [1,2,3,4,5,6]. If you are not familiar with ADMM itself, you can start by studying primal-dual optimization and the alternating direction method of multipliers. The aggregation rule for the dual variables is derived from the consistency constraints in the optimization process; a short sketch of the round structure follows the references below.

[1] FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data

[2] FedADMM: A Federated Primal-Dual Algorithm Allowing Partial Participation

[3] FedADMM: A Robust Federated Deep Learning Framework with Adaptivity to System Heterogeneity

[4] Federated Learning Based on Dynamic Regularization

[5] FedADMM-InSa: An Inexact and Self-Adaptive ADMM for Federated Learning

[6] Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape
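For anyone landing here later, here is a minimal self-contained sketch of the ADMM-style round described in [1,2,3,4,5,6]. The names `local_train` and `rho`, the flat-tensor layout, and the sign convention are illustrative assumptions, not this repository's exact code; the cited papers also differ slightly in scaling and signs.

```python
import torch

def client_step(theta_server, h_i, rho, local_train):
    """One client round: approximately solve the local augmented
    Lagrangian, then take a dual-ascent step (names are placeholders)."""
    # Approximately minimize over w:
    #   f_i(w) + rho * <h_i, w - theta_server> + (rho / 2) * ||w - theta_server||^2
    theta_i = local_train(theta_server, h_i, rho)
    # Dual ascent on the consistency constraint theta_i = theta_server;
    # h_i stores the scaled dual variable lambda_i / rho.
    h_i = h_i + (theta_i - theta_server)
    return theta_i, h_i

def server_step(theta_list, h_list):
    """Server round: the aggregation this issue asks about."""
    averaged_model = torch.stack(theta_list).mean(dim=0)
    # Minimizing the global ADMM subproblem over the server model gives
    # the primal average PLUS the mean of the (scaled) dual variables:
    return averaged_model + torch.stack(h_list).mean(dim=0)
```

Under this convention, the `Averaged_model + torch.mean(self.h_params_list, dim=0)` line in the question is exactly the stationarity condition of the global ADMM subproblem: the dual mean shifts the primal average to account for the consistency constraints, so it is not a second, redundant model aggregation.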

xxdznl commented 5 months ago

It's an honor to get such a quick and concrete reply. I will read the articles above carefully. Thanks!