LabeliaLabs / distributed-learning-contributivity

Simulate collaborative ML scenarios, experiment with multi-partner learning approaches, and measure the respective contributions of different datasets to model performance.
https://www.labelia.org
Apache License 2.0

New mpl method : gradients fusion #271

Closed arthurPignet closed 3 years ago

arthurPignet commented 3 years ago

Add a new federated learning method: we compute the gradients for each partner, aggregate them, and then use the fused gradient to optimize the main model.
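A minimal sketch of what one such training step could look like, assuming tf.keras 2.x with eager execution; `model`, `optimizer`, `loss_fn` and `partner_batches` are illustrative names, not identifiers from this PR:

```python
import tensorflow as tf

def gradient_fusion_step(model, optimizer, loss_fn, partner_batches):
    """One collaborative step: per-partner gradients -> fused gradient -> single update."""
    per_partner_grads = []
    for x, y in partner_batches:  # one minibatch per partner
        with tf.GradientTape() as tape:
            loss = loss_fn(y, model(x, training=True))
        per_partner_grads.append(tape.gradient(loss, model.trainable_weights))

    # Average the gradients variable by variable (uniform weighting here;
    # weighting by each partner's dataset size would be another option).
    fused_grads = [
        tf.reduce_mean(tf.stack(grads, axis=0), axis=0)
        for grads in zip(*per_partner_grads)
    ]

    # Single optimization step on the main model with the fused gradient.
    optimizer.apply_gradients(zip(fused_grads, model.trainable_weights))
```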

EDIT: The point of aggregating the gradients instead of the weights is to perform the optimization step with the aggregated gradient, i.e. after the aggregation step, unlike fedavg, which performs the optimization step first and then the aggregation step. With a standard optimizer such as SGD or batch gradient descent, the aggregation and optimization steps commute. But with more complex algorithms involving momentum or an adaptive learning rate, like BFGS or Adam, the order of these two steps matters.
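A tiny numeric illustration of this commutation argument (not code from this PR): with plain SGD, averaging the gradients and then stepping gives the same weights as stepping per partner and then averaging the weights, while an Adam-style adaptive step does not.

```python
import numpy as np

w, lr, eps = 1.0, 0.1, 1e-8
partner_grads = np.array([0.5, -2.0])          # gradients from two partners

# SGD: "aggregate then step" equals "step then average the weights".
sgd_fused  = w - lr * partner_grads.mean()
sgd_fedavg = np.mean([w - lr * g for g in partner_grads])
print(np.isclose(sgd_fused, sgd_fedavg))       # True

# The first bias-corrected Adam step reduces to w - lr * g / (|g| + eps);
# this per-gradient normalization makes the two orders differ.
adam_step   = lambda g: w - lr * g / (np.sqrt(g**2) + eps)
adam_fused  = adam_step(partner_grads.mean())
adam_fedavg = np.mean([adam_step(g) for g in partner_grads])
print(np.isclose(adam_fused, adam_fedavg))     # False
```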

EDIT: This version of tensorflow/keras no longer allows using modified gradients for the optimization step; see the error: `self.optimizer.apply_gradients(zip(fusion_gradients, model.trainable_weights))` raises `AttributeError: 'Adam' object has no attribute 'apply_gradients'`.
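One possible cause (an assumption, not confirmed in this thread): `apply_gradients` exists on `tf.keras` optimizers but not on the optimizers of the standalone multi-backend Keras package, so the error is consistent with `self.optimizer` coming from the latter.

```python
# Quick check of the suspected API difference (an assumption, not verified here).
import tensorflow as tf

print(hasattr(tf.keras.optimizers.Adam(), "apply_gradients"))   # True for tf.keras

# With the standalone multi-backend package (Keras <= 2.3.x) the same check fails:
# import keras
# print(hasattr(keras.optimizers.Adam(), "apply_gradients"))    # False
```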

This is really strange, as it is the implementation suggested by the documentation. To bypass this problem, I will try to fuse the losses and call the .get_updates() method on the aggregated loss (it should directly compute the aggregated gradient).
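A rough sketch of that fallback, assuming a multi-backend Keras optimizer that only exposes `get_updates()`; `model`, `partner_xs`, `partner_ys` are illustrative symbolic tensors (e.g. placeholders), not names from this PR:

```python
import keras.backend as K
from keras.losses import categorical_crossentropy

def fused_loss_updates(model, optimizer, partner_xs, partner_ys):
    # Average the per-partner losses; for a shared model, the gradient of this
    # aggregated loss is the average of the per-partner gradients.
    per_partner_losses = [
        K.mean(categorical_crossentropy(y, model(x)))
        for x, y in zip(partner_xs, partner_ys)
    ]
    aggregated_loss = sum(per_partner_losses) / len(per_partner_losses)

    # get_updates builds the symbolic ops performing one optimization step
    # on the aggregated loss.
    return optimizer.get_updates(loss=aggregated_loss,
                                 params=model.trainable_weights)
```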

EDIT: I failed to make the get_updates() method work; I am pretty sure that we need the apply_gradients function... So I am pausing this work until Keras is updated.

Signed-off-by: arthurPignet arthur.pignet@mines-paristech.fr

arthurPignet commented 3 years ago

As this branch is 131 commits behind master and contains only 3 obsolete commits, this work will be carried out in PR #299.