pliang279 / LG-FedAvg

[NeurIPS 2019 FL workshop] Federated Learning with Local and Global Representations
MIT License

Reproducing the CIFAR-10 non-IID Results #1

Closed wnma3mz closed 4 years ago

wnma3mz commented 4 years ago

Thank you for the open-source project. I think this is a very important step in federated learning, improving model performance while reducing the number of communicated parameters.

But when I run the command from `readme.md`:

python main_lg.py --dataset cifar10 --model CNN --num_classes 10 --epochs 2000 --lr 0.1 --num_users 100 --frac 0.1 --local_ep 1 --local_bs 50 --num_layers_keep 2

I can't seem to reach the accuracy reported in the paper. (I haven't completed all 2000 rounds yet.)
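To make sure I'm running the right thing, here is my understanding of what `--num_layers_keep` controls, as a toy sketch. `split_state_dict`, the count-from-the-bottom convention, and the toy model are my guesses for illustration, not the repo's actual code:

```python
import torch.nn as nn

def split_state_dict(model: nn.Module, num_layers_keep: int):
    # Hypothetical helper (my guess, not the repo's code): keep the
    # first `num_layers_keep` layers' tensors private to the device and
    # treat the remaining top layers as the global part that gets
    # communicated. Assumes one weight + one bias tensor per layer.
    sd = model.state_dict()
    keys = list(sd.keys())
    cut = 2 * num_layers_keep
    local = {k: sd[k] for k in keys[:cut]}
    glob = {k: sd[k] for k in keys[cut:]}
    return local, glob

# Toy model just to show the split; not the repo's CNN.
toy = nn.Sequential(nn.Linear(784, 512), nn.ReLU(),
                    nn.Linear(512, 256), nn.ReLU(),
                    nn.Linear(256, 10))
local, glob = split_state_dict(toy, num_layers_keep=2)
n_local = sum(t.numel() for t in local.values())
n_glob = sum(t.numel() for t in glob.values())
print(f"local: {n_local}, global: {n_glob}, "
      f"communicated share: {100 * n_glob / (n_local + n_glob):.2f}%")
```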

Here are some of the results. They do not seem to approach the ~89.66 accuracy for CIFAR-10 reported in Table 1, and the New Test Acc has a similar problem.

Round 297, Avg Loss 0.322, Loss (local): 0.338, Acc (local): 84.78, Loss (Avg): 2.29, Acc (Avg): 11.21, Loss (ens) 2.151, Acc: (ens) 24.88,
Round 298, Avg Loss 0.295, Loss (local): 0.338, Acc (local): 84.80, Loss (Avg): 2.29, Acc (Avg): 10.87, Loss (ens) 2.156, Acc: (ens) 24.39,
Round 299, Avg Loss 0.363, Loss (local): 0.348, Acc (local): 84.51, Loss (Avg): 2.29, Acc (Avg): 10.42, Loss (ens) 2.170, Acc: (ens) 23.85,
Round 300, Avg Loss 0.324, Loss (local): 0.336, Acc (local): 84.95, Loss (Avg): 2.29, Acc (Avg): 10.53, Loss (ens) 2.157, Acc: (ens) 24.84,
Round 301, Avg Loss 0.403, Loss (local): 0.338, Acc (local): 85.08, Loss (Avg): 2.29, Acc (Avg): 10.80, Loss (ens) 2.154, Acc: (ens) 24.28,
Round 302, Avg Loss 0.345, Loss (local): 0.338, Acc (local): 85.25, Loss (Avg): 2.29, Acc (Avg): 11.06, Loss (ens) 2.154, Acc: (ens) 24.38,
Round 303, Avg Loss 0.395, Loss (local): 0.340, Acc (local): 85.22, Loss (Avg): 2.29, Acc (Avg): 10.48, Loss (ens) 2.166, Acc: (ens) 23.43,
Round 304, Avg Loss 0.404, Loss (local): 0.337, Acc (local): 85.24, Loss (Avg): 2.29, Acc (Avg): 10.19, Loss (ens) 2.165, Acc: (ens) 23.34,
Round 305, Avg Loss 0.343, Loss (local): 0.332, Acc (local): 85.44, Loss (Avg): 2.29, Acc (Avg): 10.79, Loss (ens) 2.161, Acc: (ens) 23.24,
Round 306, Avg Loss 0.281, Loss (local): 0.331, Acc (local): 85.54, Loss (Avg): 2.29, Acc (Avg): 10.93, Loss (ens) 2.157, Acc: (ens) 23.86,
Round 307, Avg Loss 0.253, Loss (local): 0.332, Acc (local): 85.57, Loss (Avg): 2.29, Acc (Avg): 10.88, Loss (ens) 2.147, Acc: (ens) 24.66,
Round 308, Avg Loss 0.413, Loss (local): 0.330, Acc (local): 85.61, Loss (Avg): 2.29, Acc (Avg): 11.07, Loss (ens) 2.148, Acc: (ens) 24.40,
Round 309, Avg Loss 0.287, Loss (local): 0.333, Acc (local): 85.44, Loss (Avg): 2.29, Acc (Avg): 10.67, Loss (ens) 2.151, Acc: (ens) 24.78,
Round 310, Avg Loss 0.343, Loss (local): 0.332, Acc (local): 85.44, Loss (Avg): 2.29, Acc (Avg): 10.79, Loss (ens) 2.146, Acc: (ens) 23.75,
Round 311, Avg Loss 0.355, Loss (local): 0.331, Acc (local): 85.44, Loss (Avg): 2.29, Acc (Avg): 10.70, Loss (ens) 2.154, Acc: (ens) 23.51,
Round 312, Avg Loss 0.326, Loss (local): 0.333, Acc (local): 85.34, Loss (Avg): 2.29, Acc (Avg): 10.56, Loss (ens) 2.151, Acc: (ens) 24.29,
Round 313, Avg Loss 0.264, Loss (local): 0.333, Acc (local): 85.37, Loss (Avg): 2.29, Acc (Avg): 10.57, Loss (ens) 2.155, Acc: (ens) 23.30,
Round 314, Avg Loss 0.349, Loss (local): 0.334, Acc (local): 85.36, Loss (Avg): 2.29, Acc (Avg): 10.43, Loss (ens) 2.158, Acc: (ens) 22.82,
Round 315, Avg Loss 0.327, Loss (local): 0.333, Acc (local): 85.20, Loss (Avg): 2.29, Acc (Avg): 10.83, Loss (ens) 2.155, Acc: (ens) 23.32,

Is this because the number of rounds is not enough, or is there a problem with my understanding? I hope someone can help.

Thank you again for your outstanding contribution!

terranceliu commented 4 years ago

Hi, thank you for pointing this out. There is a mistake in the README for LG-FedAvg: you must use `main_lg.py` to fine-tune a model trained with `main_fed.py`.
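Roughly, the intended two-stage pipeline looks like the sketch below. The helpers are placeholders, not the repo's API: `local_update(model, client)` stands in for one round of local SGD and is assumed to return the updated state_dict, and the sketch assumes float tensors throughout.

```python
import copy
import torch

def fedavg_round(global_model, clients, local_update):
    """Stage 1 (main_fed.py): average ALL parameters across clients."""
    states = [local_update(copy.deepcopy(global_model), c) for c in clients]
    avg = copy.deepcopy(states[0])
    for k in avg:
        avg[k] = torch.stack([s[k] for s in states]).mean(dim=0)
    global_model.load_state_dict(avg)

def lg_round(client_models, global_keys, clients, local_update):
    """Stage 2 (main_lg.py): each client keeps its own lower layers;
    only the tensors named in `global_keys` are averaged and shared."""
    states = [local_update(m, c) for m, c in zip(client_models, clients)]
    for k in global_keys:
        avg = torch.stack([s[k] for s in states]).mean(dim=0)
        for s in states:
            s[k] = avg
    for m, s in zip(client_models, states):
        m.load_state_dict(s)
```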

I've been reorganizing the code to make it more straightforward to run everything, including loading a federated model for LG-FedAvg. Please check back at the end of next week, when I plan to push the new changes along with the new run command.

wnma3mz commented 4 years ago

Thank you for your reply. But I have a new question, about Params Communicated in Table 1.

According to the MNIST model output:

# Params: 633226 (local), 99978 (global); Percentage 15.79 (99978/633226)

Then FedAvg communicates 633226 × 750 × 10 ≈ 4.75e9 parameters, while LG-FedAvg communicates (99978 × 50 + 633226 × 400) × 10 ≈ 2.58e9.
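To make the arithmetic explicit, here is the back-of-the-envelope calculation. The round counts (750, and 50 + 400) are the ones I assumed above, not something I found in the code:

```python
# Back-of-the-envelope check of the totals above.
full_model = 633226    # full MNIST model ("local" count in the output)
global_part = 99978    # shared global layers
devices_per_round = 10  # 100 users * frac 0.1

fedavg = full_model * 750 * devices_per_round
lg_fedavg = (global_part * 50 + full_model * 400) * devices_per_round

print(f"FedAvg:    {fedavg:.3e}")     # 4.749e+09
print(f"LG-FedAvg: {lg_fedavg:.3e}")  # 2.583e+09
print(f"global/full: {100 * global_part / full_model:.2f}%")  # 15.79%
# If upload and download are both counted, both totals would double.
```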

The above results are similar to the values in Table 1. But when communicating, aren't the parameters both sent and received, i.e., two steps? Based on that, I think the result should be multiplied by 2. Is there a problem with my understanding? I'd appreciate any suggestions. Thank you very much!