jhcknzzm / Federated-Learning-Backdoor

ICML 2022 code for "Neurotoxin: Durable Backdoors in Federated Learning" https://arxiv.org/abs/2206.10341

Training always gets stuck at epoch 1464 #8

Closed imomoe233 closed 2 years ago

imomoe233 commented 2 years ago

When I train to epoch 1464, it always gets stuck like this... but I don't know why. How can I resolve it? Have you run into this before?

[screenshot of the stuck training output]

imomoe233 commented 2 years ago

What should I do? Please help :(

jhcknzzm commented 2 years ago

I haven't had this problem before. However, I suspect it happens because when users are randomly sampled, a sampled user's dataset can be very small, so the condition on line 277 of train_funcs.py is never satisfied and the loss is never computed. You could try commenting out the if statement on lines 277-278 of train_funcs.py.
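A minimal sketch of the failure mode described above, using toy stand-ins (the names, shapes, and exact guard condition are hypothetical, not the actual train_funcs.py code): the loss is only assigned inside a batch-size guard, so a user with too little data never produces a full batch and `loss` is never defined.

```python
import torch

# Toy stand-ins for a sampled user whose local dataset is too small.
batch_size = 20
user_data = torch.randn(7, 5)              # only 7 examples -> no full batch
user_targets = torch.randint(0, 2, (7,))
model = torch.nn.Linear(5, 2)

for start in range(0, len(user_data), batch_size):        # loop (around line 274)
    data = user_data[start:start + batch_size]
    if data.size(0) == batch_size:                         # guard (around lines 277-278)
        loss = torch.nn.functional.cross_entropy(
            model(data), user_targets[start:start + batch_size])

# With only 7 examples the guard never holds, so `loss` was never assigned and
# the next line would fail (or the training step would stall):
# loss.backward()
```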

jhcknzzm commented 2 years ago

You can also initialize loss = 0.0 before the loop on line 274 of train_funcs.py.
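Continuing the toy sketch above (still hypothetical names, not the repository's actual code), initializing the loss before the loop guarantees the name exists even when the guard never fires:

```python
loss = 0.0                                 # initialize before the loop (before line 274)
for start in range(0, len(user_data), batch_size):
    data = user_data[start:start + batch_size]
    if data.size(0) == batch_size:
        loss = torch.nn.functional.cross_entropy(
            model(data), user_targets[start:start + batch_size])

# `loss` now always exists; if this user produced no full batch it stays 0.0,
# and the training code can simply skip the backward pass for that user.
if torch.is_tensor(loss):
    loss.backward()
```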

imomoe233 commented 2 years ago

Thank you for your reply. I changed batch_size from 20 to 15 and participant_population from 8000 to 6000 in words_reddit_lstm.yaml, and the issue was resolved.
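For reference, the two fields the reporter changed in words_reddit_lstm.yaml (the rest of the file is unchanged; its exact layout may differ from this sketch):

```yaml
# words_reddit_lstm.yaml
batch_size: 15                # was 20
participant_population: 6000  # was 8000
```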