pliang279 / LG-FedAvg

[NeurIPS 2019 FL workshop] Federated Learning with Local and Global Representations
MIT License
230 stars 54 forks

Hi, I can't see local representation learning in main_lg.py. It just records the best local accuracy and the best local model, then updates the global model. #6

Open wardseptember opened 3 years ago

wardseptember commented 3 years ago

Hi, I can't see local representation learning in main_lg.py. It just records the best local accuracy and the best local model, then updates the global model. Did I understand it wrong? Also, what does main_mtl.py do? I'm confused by it; looking forward to your reply, thank you.

wardseptember commented 3 years ago

I think this code can't reduce the number of parameters. The code does not implement the ideas in your paper. Is there an error in my understanding?

terranceliu commented 3 years ago

Hi, just to clarify: the algorithm is designed to reduce the number of parameters communicated each round, and we then evaluate the local and global models on various test sets (local and new). I'm not entirely sure what you are trying to achieve, but in our experiments each local model ends up with different parameters (since only some layers receive global updates) and thus learns different local representations.

main_mtl.py contains an implementation of a separate method (federated multi-task learning) and is unrelated to our method.
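To make that concrete, here is a minimal sketch of the client-side update, assuming PyTorch state dicts; load_global_layers and its argument names are mine, not from the repo:

import copy

# the server's weights overwrite only the globally shared keys, so the
# remaining (local-representation) layers diverge across clients
def load_global_layers(net_local, w_server, w_glob_keys):
    w = copy.deepcopy(net_local.state_dict())
    for k in w_glob_keys:
        w[k] = w_server[k]
    net_local.load_state_dict(w)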

wardseptember commented 3 years ago

Thank you for your reply. I understand it now; in fact, only some layers receive global updates:

import itertools  # needed for flattening the per-layer key lists
# take the key groups of the last num_layers_keep layers; only these are shared globally
w_glob_keys = net_glob.weight_keys[total_num_layers - args.num_layers_keep:]
w_glob_keys = list(itertools.chain.from_iterable(w_glob_keys))

I didn't look at the code carefully before. Thank you again.
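To spell that out, a rough sketch of what server-side aggregation restricted to those keys could look like (the helper name fed_avg_global is mine, not from the repo; it assumes a list of PyTorch state dicts with float tensors):

import copy

# average only the globally shared layers; everything outside w_glob_keys
# stays per-client, which is where the communication saving comes from
def fed_avg_global(w_locals, w_glob_keys):
    w_avg = copy.deepcopy(w_locals[0])
    for k in w_glob_keys:
        for w in w_locals[1:]:
            w_avg[k] += w[k]
        w_avg[k] = w_avg[k] / len(w_locals)
    return {k: w_avg[k] for k in w_glob_keys}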

TsingZ0 commented 3 years ago

> I think this code can't reduce the number of parameters. The code does not implement the ideas in your paper. Is there an error in my understanding?

@wardseptember After carefully checking the code and the reply above, I still have the same doubt. Have you figured it out yet?

wardseptember commented 3 years ago

@TsingZ0 Take a look at the two lines of code above: only the first few layers are updated, not the parameters of every layer, which is how the number of communicated parameters is reduced.
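A quick way to check the saving, as a sketch (num_params is a hypothetical helper; net_glob and w_glob_keys as in the code above):

# compare how many parameters the shared keys cover versus the full model
def num_params(state_dict, keys=None):
    keys = list(state_dict.keys()) if keys is None else keys
    return sum(state_dict[k].numel() for k in keys)

w = net_glob.state_dict()
print(num_params(w, w_glob_keys) / num_params(w))  # fraction communicated per round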

josebummer commented 3 years ago

I still have the same doubt as @wardseptember. In your algorithm you define the procedure with two different models, the local model (the local representations) and the global model (the classifier), which are trained independently in each client. However, in your code you simply train the complete model and then separate the layers corresponding to the global model from those of the local model. Which of the two ways is the correct one? Best regards.
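For reference, this is the "train the whole model, then split by key" reading of the code, as a minimal sketch (all names except w_glob_keys are mine, not the repo's):

# one network, two logical parts: parameters whose keys are in w_glob_keys
# form the "global model"; the rest are the local representation.
# the model is trained end to end, then its state dict is partitioned.
w_full = net.state_dict()  # locally trained, all layers
w_to_server = {k: v for k, v in w_full.items() if k in w_glob_keys}        # sent to the server
w_stays_local = {k: v for k, v in w_full.items() if k not in w_glob_keys}  # never communicated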