FL-Simulator

PyTorch implementations of some general optimization methods in the federated learning community.

Basic Methods

FedAvg: Communication-Efficient Learning of Deep Networks from Decentralized Data

FedProx: Federated Optimization in Heterogeneous Networks

FedAdam: Adaptive Federated Optimization

SCAFFOLD: SCAFFOLD: Stochastic Controlled Averaging for Federated Learning

FedDyn: Federated Learning Based on Dynamic Regularization

FedCM: FedCM: Federated Learning with Client-level Momentum

FedSAM/MoFedSAM: Generalized Federated Learning via Sharpness Aware Minimization

FedGamma: Fedgamma: Federated learning with global sharpness-aware minimization

FedSpeed: FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy

FedSMOO: Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape

Usage

Training

FL-Simulator works on a single CPU/GPU to simulate the training process of federated learning (FL) with the PyTorch framework. If you want to train centralized FL with the FedAvg method on ResNet-18 and the CIFAR-10 dataset (10% of 100 total clients active per round, with a heterogeneous Dirichlet-0.6 data split), you can use:

python train.py --non-iid --dataset CIFAR-10 --model ResNet18 --split-rule Dirichlet --split-coef 0.6 --active-ratio 0.1 --total-client 100

Other hyperparameters are introduced in the train.py file.
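For intuition about the Dirichlet split rule: the coefficient controls how heterogeneous the label distribution is across clients, with smaller values producing more skewed per-client label proportions. The snippet below is only a minimal, self-contained illustration of that idea under this assumption; it is not the repository's actual partitioning code, and the function name and arguments are placeholders.

```python
import numpy as np

def dirichlet_label_proportions(num_clients=100, num_classes=10, coef=0.6, seed=0):
    """Illustration only: sample per-client label proportions from a
    Dirichlet(coef) prior, as in the common Dirichlet non-IID split.
    Smaller `coef` -> more heterogeneous (skewed) client label distributions."""
    rng = np.random.default_rng(seed)
    # One Dirichlet draw per class gives the share of that class assigned to each client.
    proportions = rng.dirichlet(alpha=[coef] * num_clients, size=num_classes)
    return proportions  # shape: (num_classes, num_clients)

# Example: inspect how skewed class 0 is across the first five clients.
props = dirichlet_label_proportions(coef=0.6)
print(props[0, :5])
```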

How to implement your own method?

FL-Simulator pre-defines the basic Server and Client classes, which are executed according to the vanilla $FedAvg$ algorithm. If you want to define a new method, you can first define a new server file, as sketched below.
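The following is only a rough sketch of that pattern, not the repository's exact API: the import path, constructor signature, and the `aggregate` hook name are placeholder assumptions for illustration.

```python
import torch
from server import Server  # assumption: the pre-defined base Server class lives in server.py

class FedNewMethod(Server):
    """Hypothetical custom server: reuse the base training loop and only
    change the server-side aggregation performed at the end of each round."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Extra server-side state required by your method, e.g. a global momentum buffer.
        self.global_momentum = None

    def aggregate(self, client_params, client_weights):
        # Placeholder hook name: FedAvg-style weighted average of the received
        # client parameter vectors, to be replaced by your own global update rule.
        stacked = torch.stack(client_params, dim=0)                 # (n_active_clients, dim)
        weights = torch.tensor(client_weights, dtype=stacked.dtype)
        weights = weights / weights.sum()
        return (weights.unsqueeze(1) * stacked).sum(dim=0)
```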

Then you can define a new client file or a new local optimizer for your own method to perform the local training; a purely illustrative proximal-style optimizer is sketched below. Similarly, you can directly define a new server class to rebuild the inner operations.
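As one concrete (and purely illustrative) example of such a local optimizer, the sketch below adds a FedProx-style proximal pull toward the round's global model on top of plain SGD. The class name and the `step(global_params)` signature are assumptions for this sketch, not the repository's own interface.

```python
import torch

class ProxSGD(torch.optim.Optimizer):
    """Illustrative local optimizer: plain SGD plus a proximal pull toward
    the global model received at the start of the round (FedProx-style)."""

    def __init__(self, params, lr=0.1, mu=0.1):
        super().__init__(params, dict(lr=lr, mu=mu))

    @torch.no_grad()
    def step(self, global_params):
        # global_params: tensors holding this round's global model, in the
        # same order as the locally optimized parameters.
        it = iter(global_params)
        for group in self.param_groups:
            lr, mu = group['lr'], group['mu']
            for p in group['params']:
                g = next(it)
                if p.grad is None:
                    continue
                # SGD step on the gradient plus the proximal term mu * (w - w_global).
                p.add_(p.grad + mu * (p - g), alpha=-lr)
```

In the local training loop, after `loss.backward()` you would call `optimizer.step(global_params)`, where `global_params` is the list of parameter tensors broadcast by the server at the start of the round.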

Some Experiments

We show some results of the ResNet-18-GN model on the CIFAR-10 dataset. The corresponding hyperparameters are stated below. The time costs are measured on an NVIDIA® Tesla® V100 Tensor Core GPU.

CIFAR-10 (ResNet-18-GN), T = 1000 communication rounds.

10%-100 setting (100 clients, 10% active per round, batch size 50, 5 local epochs):

| Method | IID | Dir-0.6 | Dir-0.3 | Dir-0.1 | Time / round |
|---|---|---|---|---|---|
| SGD basis | | | | | |
| FedAvg | 82.52 | 80.65 | 79.75 | 77.31 | 15.86s |
| FedProx | 82.54 | 81.05 | 79.52 | 76.86 | 19.78s |
| FedAdam | 84.32 | 82.56 | 82.12 | 77.58 | 15.91s |
| SCAFFOLD | 84.88 | 83.53 | 82.75 | 79.92 | 20.09s |
| FedDyn | 85.46 | 84.22 | 83.22 | 78.96 | 20.82s |
| FedCM | 85.74 | 83.81 | 83.44 | 78.92 | 20.74s |
| SAM basis | | | | | |
| FedGamma | 85.74 | 84.80 | 83.81 | 80.72 | 30.13s |
| MoFedSAM | 87.24 | 85.74 | 85.14 | 81.58 | 29.06s |
| FedSpeed | 87.31 | 86.33 | 85.39 | 82.26 | 29.48s |
| FedSMOO | 87.70 | 86.87 | 86.04 | 83.30 | 30.43s |

5%-200 setting (200 clients, 5% active per round, batch size 25, 5 local epochs):

| Method | IID | Dir-0.6 | Dir-0.3 | Dir-0.1 | Time / round |
|---|---|---|---|---|---|
| SGD basis | | | | | |
| FedAvg | 81.09 | 79.93 | 78.66 | 75.21 | 17.03s |
| FedProx | 81.56 | 79.49 | 78.76 | 75.84 | 20.97s |
| FedAdam | 83.29 | 81.22 | 80.22 | 75.83 | 17.67s |
| SCAFFOLD | 84.24 | 83.01 | 82.04 | 78.23 | 22.21s |
| FedDyn | 81.11 | 80.25 | 79.43 | 75.43 | 22.68s |
| FedCM | 83.77 | 82.01 | 80.77 | 75.91 | 21.24s |
| SAM basis | | | | | |
| FedGamma | 84.99 | 84.02 | 83.03 | 80.09 | 33.63s |
| MoFedSAM | 86.27 | 84.71 | 83.44 | 79.02 | 32.45s |
| FedSpeed | 86.87 | 85.07 | 83.94 | 79.66 | 33.69s |
| FedSMOO | 87.40 | 85.97 | 85.14 | 81.35 | 34.80s |

The blank parts are awaiting updates.

Some key hyperparameter selections

| Method | Local lr | Global lr | lr decay | SAM lr | Proxy coefficient | Client-momentum coefficient |
|---|---|---|---|---|---|---|
| FedAvg | 0.1 | 1.0 | 0.998 | - | - | - |
| FedProx | 0.1 | 1.0 | 0.998 | - | 0.1 / 0.01 | - |
| FedAdam | 0.1 | 0.1 / 0.05 | 0.998 | - | - | - |
| SCAFFOLD | 0.1 | 1.0 | 0.998 | - | - | - |
| FedDyn | 0.1 | 1.0 | 0.9995 / 1.0 | - | 0.1 | - |
| FedCM | 0.1 | 1.0 | 0.998 | - | - | 0.1 |
| FedGamma | 0.1 | 1.0 | 0.998 | 0.01 | - | - |
| MoFedSAM | 0.1 | 1.0 | 0.998 | 0.1 | - | 0.05 / 0.1 |
| FedSpeed | 0.1 | 1.0 | 0.998 | 0.1 | 0.1 | - |
| FedSMOO | 0.1 | 1.0 | 0.998 | 0.1 | 0.1 | - |

The hyperparameter selections above are for reference only. Each algorithm has its own properties that call for matching hyperparameters. To facilitate a relatively fair comparison, we report a set of selections with which each method performs well in general cases. Please adjust the hyperparameters when changing the model backbone or dataset.
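As a reading aid for the table, and assuming (as is common for these methods) that the decay factor is applied to the local learning rate once per communication round, the effective local learning rate follows an exponential schedule $\eta_t = \eta_0 \cdot \gamma^{t}$; for example, with $\eta_0 = 0.1$ and $\gamma = 0.998$, after $T = 1000$ rounds it is $0.1 \times 0.998^{1000} \approx 0.0135$.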

ToDo

Citation

If this codebase helps you, please cite our papers:

FedSpeed (ICLR 2023):

@article{sun2023fedspeed,
  title={Fedspeed: Larger local interval, less communication round, and higher generalization accuracy},
  author={Sun, Yan and Shen, Li and Huang, Tiansheng and Ding, Liang and Tao, Dacheng},
  journal={arXiv preprint arXiv:2302.10429},
  year={2023}
}

FedSMOO (ICML 2023 Oral):

@inproceedings{sun2023dynamic,
  title={Dynamic regularized sharpness aware minimization in federated learning: Approaching global consistency and smooth landscape},
  author={Sun, Yan and Shen, Li and Chen, Shixiang and Ding, Liang and Tao, Dacheng},
  booktitle={International Conference on Machine Learning},
  pages={32991--33013},
  year={2023},
  organization={PMLR}
}