Open sotirisnik opened 2 years ago
Yes, you're right: we always use the same \mu for all clients, and the loss is the global loss. In principle, nothing prevents us from using client-specific \mu's, though they may be hard to determine in practice because each client has too little data. More broadly, this can be viewed as a hyperparameter tuning problem, and we discuss some new approaches here.
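For anyone who wants to try the heuristic outside the repository, here is a minimal sketch of how a single global \mu could be adapted from the global loss. It is not taken from this codebase: the class name `AdaptiveMu` and its parameters are hypothetical, and the defaults (a step of 0.1, and a decrease only after 5 consecutive rounds of falling loss) reflect my reading of appendix C.3.3, so please check them against the paper.

```python
class AdaptiveMu:
    """Hypothetical helper: adapts one global mu from the global training loss."""

    def __init__(self, mu=0.0, step=0.1, patience=5):
        self.mu = mu                # proximal coefficient shared by all clients
        self.step = step            # assumed increment/decrement size
        self.patience = patience    # assumed number of consecutive decreasing rounds
        self.prev_loss = None       # global loss seen in the previous round
        self.decreasing_rounds = 0  # current streak of rounds with falling loss

    def update(self, global_loss):
        """Record this round's global loss and return the mu for the next round."""
        if self.prev_loss is not None:
            if global_loss > self.prev_loss:
                # Loss went up: strengthen the proximal term and reset the streak.
                self.mu += self.step
                self.decreasing_rounds = 0
            elif global_loss < self.prev_loss:
                # Loss went down: relax mu only after a full streak of decreases.
                self.decreasing_rounds += 1
                if self.decreasing_rounds >= self.patience:
                    self.mu = max(0.0, self.mu - self.step)  # never go below zero
                    self.decreasing_rounds = 0
        self.prev_loss = global_loss
        return self.mu


# Usage: call update() once per communication round with the global loss.
scheduler = AdaptiveMu(mu=1.0)
for loss in [1.0, 1.2, 1.1, 1.0, 0.9, 0.8, 0.7]:
    mu = scheduler.update(loss)
```

The server would pass the returned `mu` to every client for the next round, which matches the "same \mu for all clients" setup described above. A client-specific variant would keep one such object per client and feed it that client's local loss, with the caveat already mentioned that local losses may be too noisy to drive it.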
Does the current implementation provide the option for heuristic μ as discussed in "C.3.3 Adaptively setting μ" from https://arxiv.org/pdf/1812.06127.pdf?
I assume you mean that the same μ is used for all clients and that the loss in question is the global loss, right?
Thank you