FedML-AI / FedGraphNN

FedGraphNN: A Federated Learning Platform for Graph Neural Networks with MLOps Support. The previous research version was accepted to the ICLR'2021 - DPML and MLSys'21 - GNNSys workshops.
https://arxiv.org/abs/2104.07145

The distributed experiment was stuck after creating model done #12

Open csshali opened 2 years ago

csshali commented 2 years ago

I ran run_fedavg_distributed_pytorch, but the experiment got stuck right after the model was created (the last log line is "create_model(): done"). What's wrong?

2022-04-10,23:37:38.903 - {data_loader.py (453)} - load_partition_data(): Client idx = 0, local sample number = 191
2022-04-10,23:37:38.903 - {data_loader.py (453)} - load_partition_data(): Client idx = 1, local sample number = 190
2022-04-10,23:37:38.903 - {data_loader.py (453)} - load_partition_data(): Client idx = 2, local sample number = 190
2022-04-10,23:37:38.903 - {data_loader.py (453)} - load_partition_data(): Client idx = 3, local sample number = 190
2022-04-10,23:37:38.903 - {data_loader.py (453)} - load_partition_data(): Client idx = 4, local sample number = 190
2022-04-10,23:37:38.903 - {data_loader.py (453)} - load_partition_data(): Client idx = 5, local sample number = 190
2022-04-10,23:37:38.904 - {main_fedavg.py (139)} - create_model(): create_model. model_name = graphsage, output_dim = None
2022-04-10,23:37:38.929 - {main_fedavg.py (180)} - create_model(): done

Lin-repository commented 2 years ago

same question

chaoyanghe commented 2 years ago

@csshali Thank you for the feedback. I've fixed this issue. Please update to the latest source code:

https://github.com/FedML-AI/FedGraphNN/commit/5b3c766baa48efdde7aec658eaabb6d0eb19386f

Lin-repository commented 2 years ago

I finally found the reason: the run gets stuck because wandb is not configured. Thanks to the questioner and the author.
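
For anyone hitting the same hang: if it does come from wandb not being configured, as reported above, one possible workaround is to set wandb's environment variables before the training processes start, so that wandb has no reason to wait for an interactive login. The sketch below is only an illustration. `wandb_env` is a hypothetical helper, not part of FedGraphNN or wandb, and it is an assumption that the stall is a login prompt blocking a non-interactive run. `WANDB_MODE` and `WANDB_API_KEY` are environment variables documented by wandb; the helper deliberately accepts only the three commonly used mode values.

```python
import os


def wandb_env(base_env=None, mode="offline", api_key=None):
    """Return a copy of an environment mapping with wandb settings added.

    Hypothetical helper for illustration only.
    mode:    value for WANDB_MODE ("online", "offline", or "disabled").
    api_key: optional key exported as WANDB_API_KEY, so that an online
             run does not need an interactive `wandb login`.
    """
    if mode not in ("online", "offline", "disabled"):
        raise ValueError(f"unsupported WANDB_MODE: {mode!r}")
    # Copy so the caller's mapping (or os.environ) is not modified in place.
    env = dict(os.environ if base_env is None else base_env)
    env["WANDB_MODE"] = mode
    if api_key is not None:
        env["WANDB_API_KEY"] = api_key
    return env


# Example use (commented out so the snippet has no side effects):
# apply the settings to the current process before wandb.init() is called,
#     os.environ.update(wandb_env(mode="offline"))
# or pass the mapping as env=... to subprocess.run() when starting the launcher.
```

With `mode="offline"` the metrics are written locally and can be uploaded later; with `mode="disabled"` wandb calls become no-ops. Running `wandb login` once on each machine is the alternative if online logging is wanted.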