alibaba / FederatedScope

An easy-to-use federated learning platform
https://www.federatedscope.io
Apache License 2.0

add retry when loss is NaN in train and finetune #672

Closed: rayrayraykk closed this 1 year ago

rayrayraykk commented 1 year ago

@HarliWu Please have a look at this change. To keep the experiments consistent, we will not merge this PR for the time being.
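
For readers skimming the thread, the general idea of retrying when the loss is NaN can be sketched roughly as below. This is not the code from this PR and does not use FederatedScope's API: `train_with_retry`, `run_round`, and `max_retries` are hypothetical names chosen for illustration, and the sketch assumes a round can simply be re-run.

```python
import math
from typing import Callable


def train_with_retry(run_round: Callable[[], float], max_retries: int = 3) -> float:
    """Run one training (or finetuning) round, retrying if the loss is NaN.

    `run_round` is a hypothetical callable that performs one round and
    returns its scalar loss. Raises RuntimeError if every attempt is NaN.
    """
    attempts = 1 + max_retries
    for _ in range(attempts):
        loss = run_round()
        if not math.isnan(loss):
            return loss
        # NaN loss: a real trainer would likely restore the model and
        # optimizer state here before trying again (assumption).
    raise RuntimeError(f"loss was NaN after {attempts} attempts")


# Usage: a stand-in round that yields NaN twice, then a finite loss.
losses = iter([float("nan"), float("nan"), 0.42])
result = train_with_retry(lambda: next(losses))
```

Because a retry changes how many rounds actually run, it can make results differ from earlier runs, which fits the concern about experiment consistency raised above.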

rayrayraykk commented 1 year ago

https://github.com/alibaba/FederatedScope/pull/697