LabeliaLabs / distributed-learning-contributivity

Simulate collaborative ML scenarios, experiment with multi-partner learning approaches, and measure the respective contributions of different datasets to model performance.
https://www.labelia.org
Apache License 2.0

Federated averaging with local and global optimizations #345

Closed by arthurPignet 2 years ago

arthurPignet commented 3 years ago

This PR follows draft 235, which had fallen too far behind master. This is a rebased version.

New mpl methods:

FedGDO stands for Federated Gradient Double Optimization.

This method is inspired by federated gradient, but modifies the local computation of the gradient. In this version, each partner uses a local (partner-specific) optimizer to perform several minimization steps on its local loss during a minibatch. The sum of these weight updates is used as the gradient sent to the global optimizer. The global optimizer aggregates the gradient-like updates sent by the partners and performs one optimization step with the aggregated gradient. Three variations of this mpl method are tested here.

Whether the optimizers are reset can be controlled via a parameter of the mpl methods. Likewise, a global optimizer different from the local ones can be passed via the mpl arguments.
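To make the idea concrete, here is a minimal pure-Python sketch of one possible reading of the description above. It is not the code from this PR: the names `SGD`, `fedgdo_round`, `reset_local`, and `local_steps` are hypothetical, the aggregation is assumed to be an unweighted mean, and the pseudo-gradient is assumed to be the negative of the summed local updates so that a descent-style global optimizer moves in the right direction.

```python
import random


class SGD:
    """Plain SGD with optional momentum; stands in for any optimizer."""

    def __init__(self, lr, momentum=0.0):
        self.lr, self.momentum, self.velocity = lr, momentum, None

    def reset(self):
        # Forget the optimizer state (here, only the momentum buffer).
        self.velocity = None

    def step(self, weights, grad):
        if self.velocity is None:
            self.velocity = [0.0] * len(weights)
        self.velocity = [self.momentum * v - self.lr * g
                         for v, g in zip(self.velocity, grad)]
        return [w + v for w, v in zip(weights, self.velocity)]


def mse_grad(weights, xs, ys):
    """Gradient of the mean squared error of a linear model."""
    grad = [0.0] * len(weights)
    for x, y in zip(xs, ys):
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for j, xi in enumerate(x):
            grad[j] += 2.0 * err * xi / len(ys)
    return grad


def fedgdo_round(global_w, partners, local_opts, global_opt,
                 local_steps=5, reset_local=True):
    """One round: local optimization per partner, then one global step."""
    pseudo_grads = []
    for (xs, ys), opt in zip(partners, local_opts):
        if reset_local:  # hypothetical flag mirroring the "reset" parameter
            opt.reset()
        w = list(global_w)
        for _ in range(local_steps):
            w = opt.step(w, mse_grad(w, xs, ys))
        # Assumption: pseudo-gradient = -(sum of local weight updates).
        pseudo_grads.append([g - l for g, l in zip(global_w, w)])
    # Assumption: unweighted mean over partners.
    n = len(pseudo_grads)
    aggregated = [sum(col) / n for col in zip(*pseudo_grads)]
    return global_opt.step(global_w, aggregated)


def demo(rounds=200, seed=0):
    """Three partners fitting the same noiseless linear model."""
    rng = random.Random(seed)
    true_w = [2.0, -1.0]
    partners = []
    for _ in range(3):
        xs = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(32)]
        ys = [sum(w * xi for w, xi in zip(true_w, x)) for x in xs]
        partners.append((xs, ys))
    local_opts = [SGD(lr=0.05) for _ in partners]
    global_opt = SGD(lr=1.0)  # may differ from the local optimizers
    w = [0.0, 0.0]
    for _ in range(rounds):
        w = fedgdo_round(w, partners, local_opts, global_opt)
    return w, true_w
```

With these sign and aggregation assumptions, a global plain-SGD optimizer with a learning rate of 1 reduces each round to averaging the partners' locally optimized weights. Passing a different `global_opt` (for example one with momentum) is where the method departs from plain averaging.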

These methods are tested in this notebook: https://colab.research.google.com/drive/1CcQpWRpLGldj3iNR7v7Hv2brdwBP3z7D?usp=sharing

Please note that I am still working on the notebook, so it may change and may not be fully readable. Notebook access is currently limited to substra.org, but don't hesitate to ask me for access.

codecov-commenter commented 3 years ago

Codecov Report

Merging #345 (40379dc) into master (ecc3ea8) will decrease coverage by 0.38%. The diff coverage is 55.31%.


@@            Coverage Diff             @@
##           master     #345      +/-   ##
==========================================
- Coverage   80.68%   80.30%   -0.39%     
==========================================
  Files          15       15              
  Lines        3045     3092      +47     
==========================================
+ Hits         2457     2483      +26     
- Misses        588      609      +21     

Impacted Files                             Coverage   Δ
mplc/multi_partner_learning/__init__.py    100.00%    <ø> (ø)
mplc/multi_partner_learning/fast_mpl.py    61.09%     <55.31%> (-0.81%)


bowni commented 3 years ago

@RomainGoussault has offered to review it!

bowni commented 2 years ago

@arthurPignet let us know when you can tackle @RomainGoussault's questions and comments! 😃

bowni commented 2 years ago

@HeytemBou beyond your comment above and the associated thread, was your review conclusive? Are you comfortable with validating the PR or not?

HeytemBou commented 2 years ago

> @HeytemBou beyond your above comment and the associated thread, was your review conclusive? Are you comfortable with validating the PR or not?

Yes

arthurPignet commented 2 years ago

> @arthurPignet let us know when you can tackle @RomainGoussault's questions and comments! 😃

I have addressed @RomainGoussault's comments. It is ready for a final review (at least, I hope it will be the final one)!