Closed by bangawayoo 2 years ago
Hi! We at FedML have launched a new platform for FedNLP where this issue should no longer occur. Could you please check whether you still see the same issue there? Here is the new FedNLP platform: https://github.com/FedML-AI/FedML/tree/master/python/app/fednlp
Hi, thanks for the great work.
When running
sh run_text_classification.sh FedOPT "niid_label_clients=100_alpha=100.0" 1e-3 0.1 1 4
the process does not terminate automatically after the last round of training, regardless of the number of communication rounds. The log stops after displaying the last eval metric:
18521 2021-12-29,21:14:53.265 - {tc_transformer_trainer.py (180)} - eval_model(): best_accuracy = 0.000000
18521 2021-12-29,21:14:53.266 - {tc_transformer_trainer.py (188)} - eval_model(): {'mcc': 0.0, 'tp': 0, 'tn': 0, 'fp': 0, 'fn': 0, 'acc': 0.0, 'evalloss': 3.01809245740279}
Commenting out
post_complete_message_to_sweep_process(self.args)
in ClientManager and ServerManager does make the program terminate, so the FIFO seems to be the problem. Will commenting out the function cause any problems? This is possibly related to an issue from FedML.
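For anyone else hitting this, here is a minimal sketch of what I suspect is going on. It assumes (this thread does not show the function body) that post_complete_message_to_sweep_process opens a named pipe write-only with a blocking open, which waits forever when no sweep process has the pipe open for reading. The helper name notify_sweep_nonblocking and the pipe path argument are hypothetical, not part of FedML or FedNLP. With O_NONBLOCK the open fails with ENXIO instead of hanging, so the notification can be skipped when nobody is listening:

```python
import errno
import os
import tempfile


def notify_sweep_nonblocking(pipe_path: str, message: str) -> bool:
    """Write message to a FIFO if a reader exists; return False instead of hanging."""
    # Hypothetical helper: create the FIFO on first use (POSIX only).
    if not os.path.exists(pipe_path):
        os.mkfifo(pipe_path)
    try:
        # A plain O_WRONLY open blocks until some process opens the FIFO for reading.
        # Adding O_NONBLOCK makes the open fail with ENXIO when there is no reader.
        fd = os.open(pipe_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        if exc.errno == errno.ENXIO:
            return False  # nobody is listening, so skip the notification
        raise
    # A reader is attached: send the completion message and close the pipe.
    with os.fdopen(fd, "w") as pipe:
        pipe.write(message)
    return True
```

Calling it with no reader attached returns False right away, so the training process can exit normally. If a sweep script really is reading the pipe, the message is still delivered, which is why this looks safer to me than simply commenting the call out.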