Closed by Arnou1 2 months ago
Actually, it confuses me too. I think the poor performance of FedAvgCNN comes from the lack of normalization, which really helps model training. But in the original paper, I don't see any normalization in the FedAvgCNN model.
So if full reproduction is not your goal, maybe you can try adding normalization layers to the model, or just switch to another model.
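For reference, here is a minimal PyTorch sketch of what "adding normalization layers" could look like. The class name `NormalizedCNN`, the layer sizes, and the choice of `GroupNorm` are all assumptions for illustration, not the repository's FedAvgCNN definition. GroupNorm is picked here only because it does not depend on per-batch statistics, which can differ between clients; `nn.BatchNorm2d` would be the other obvious thing to try.

```python
import torch
import torch.nn as nn


class NormalizedCNN(nn.Module):
    """Hypothetical two-conv CNN with a normalization layer after each conv."""

    def __init__(self, in_channels=3, num_classes=10, groups=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=5),  # 32x32 -> 28x28
            nn.GroupNorm(groups, 32),                   # normalize over channel groups
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                            # 28x28 -> 14x14
            nn.Conv2d(32, 64, kernel_size=5),           # 14x14 -> 10x10
            nn.GroupNorm(groups, 64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                            # 10x10 -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 5 * 5, 512),  # assumes 32x32 inputs such as CIFAR10
            nn.ReLU(inplace=True),
            nn.Linear(512, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


if __name__ == "__main__":
    model = NormalizedCNN()
    logits = model(torch.randn(4, 3, 32, 32))
    print(logits.shape)
```

Note that this changes the architecture, so results from it are no longer a reproduction of the paper's setup.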
I just added some data augmentation to the CIFAR10 dataset and it improved the test accuracy by around 15%. I will now experiment with normalization layers. Thanks!
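The comment above does not say which augmentations were used. As a rough sketch only, random cropping with zero padding plus a random horizontal flip is a common choice for CIFAR10; the function name `augment_batch`, the `pad=4` default, and the NHWC array layout below are assumptions, not the poster's actual configuration.

```python
import numpy as np


def augment_batch(images, pad=4, rng=None):
    """Random crop (after zero padding) and random horizontal flip.

    images: array of shape (N, H, W, C). Returns an array of the same
    shape and dtype. This is an illustrative helper, not library code.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, h, w, _ = images.shape
    # Zero-pad height and width so crops can be shifted by up to `pad` pixels.
    padded = np.pad(images, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode="constant")
    out = np.empty_like(images)
    for i in range(n):
        top = rng.integers(0, 2 * pad + 1)
        left = rng.integers(0, 2 * pad + 1)
        crop = padded[i, top:top + h, left:left + w]
        if rng.random() < 0.5:
            crop = crop[:, ::-1]  # flip along the width axis
        out[i] = crop
    return out


if __name__ == "__main__":
    batch = np.random.default_rng(0).integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
    print(augment_batch(batch).shape)
```

Augmentation like this should be applied to the training split on each client only, with the test set left untouched.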
As in a similar earlier question, I am struggling to reproduce the CIFAR10 results reported in the original FedAvg paper.
Although I have tried many hyperparameter combinations, the best result I have gotten so far is around 60% accuracy (IID data on 100 clients with FedAvg), which is far worse than what the paper reports. I only trained the model for 300 rounds. However, given the trend shown in the figure below, I doubt there would be any significant improvement even if the model were trained longer.
I have attached my configurations. Any suggestions on improving the test results would be greatly appreciated. Thanks.