Status: Closed (cuiboyuan closed this issue 2 years ago)
Thanks for pointing out the bug and providing the details. The self.smart_weighting
is always a vector of ten 0.1 values at the beginning of training because the agent adopts the FedAvg aggregation policy in FEI's design, and the parameter algorithm:start_steps
in the YAML config file determines how many steps the agent follows this policy before switching to the RL policy. In the given config file, the number of data samples is the same across clients, so the weights are evenly divided from 1 according to the FedAvg algorithm.
As for the bug, after reproducing it, I suspect the error is related to the config file fei_FashionMNIST_lenet5.yml
you used, which may not be compatible with the latest version of the framework due to certain parameter settings. This is my fault for failing to keep things updated on GitHub and to maintain clear documentation for using FEI. I have updated some example config files I am currently using under the directory examples/fei/
(for training FEI from scratch). The index error should not occur again with these config files. Also, I set data:variable_partition
to true there, so even at the beginning of training with the FedAvg aggregation policy, you should see more uneven values of self.smart_weighting.
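For reference, the FedAvg weighting described above can be sketched as follows. This is a minimal illustration, not Plato's actual aggregation code: each client's weight is its sample count divided by the total, so equal counts give ten weights of 0.1, while a variable partition gives uneven weights.

```python
def fedavg_weights(num_samples):
    """Return FedAvg aggregation weights proportional to client sample counts."""
    total = sum(num_samples)
    return [n / total for n in num_samples]

# Ten clients with equal data: every weight is 0.1,
# matching the self.smart_weighting values observed here.
print(fedavg_weights([600] * 10))

# With a variable partition, the weights become uneven.
print(fedavg_weights([100, 300, 600]))
```

This also explains why the RL agent's weighting looks "hard-coded" during the first algorithm:start_steps rounds: until the RL policy takes over, the weights are fully determined by the per-client sample counts.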
Thank you for the detailed explanation. The error is now resolved. Thanks!
Describe the bug
When running examples/fei.py
with config fei_FashionMNIST_lenet5.yml
, an IndexError was raised in rl_server.py
when the server tried to perform federated averaging in the first round.

To Reproduce
Steps to reproduce the behavior:
python examples/fei/fei.py -c fei_FashionMNIST_lenet5.yml
Expected behavior
No error should be raised.
Screenshots
The following snippet is the traceback of the error:
OS environment (please complete the following information):
Additional context
I tried removing the
[0]
at the end of line 84 in rl_server.py
, and the program seems to proceed normally without errors, but I checked the value of self.smart_weighting
and it is always a vector of ten 0.1
values in each round. I am unsure whether that is the expected behavior.