jeandut commented 2 years ago

One should add (if it does not exist yet) a folder called strategies: FLamby/flamby/strategies
Inside this folder you should have a file called strategy_name.py (e.g. fed_avg.py, fed_yogi.py, scaffold.py). This file should contain a class StrategyName with the following constructor and methods:
```python
import copy

import torch
from torch.utils.data import DataLoader
from tqdm import tqdm


class StrategyName:
    def __init__(
        self,
        training_dataloaders: list[DataLoader],
        model: torch.nn.Module,
        loss: torch.nn.modules.loss._Loss,
        learning_rate: float,
        nrounds: int,
        additional_strategy_specific_parameters,  # placeholder: replace with your strategy's own arguments
    ):
        self.training_dataloaders = training_dataloaders
        # One independent copy of the initial model per client.
        self.models_list = [copy.deepcopy(model) for _ in range(len(training_dataloaders))]
        self.loss = loss
        self.lr = learning_rate
        self.nrounds = nrounds

    def perform_round(self):
        # do stuff and update models
        pass

    def run(self):
        for _ in tqdm(range(self.nrounds)):
            self.perform_round()
        return self.models_list[0]
```

What matters are the method signatures and the attribute names. To make it possible to monitor the bits exchanged, the clients' outputs at each round should be clearly visible, so that we can easily open PRs adding bit monitoring later.
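As a minimal sketch of what "clearly visible" client outputs could look like, the round below collects every client's update in one explicit list before aggregating, so traffic could later be measured at that single point. The class name `ToyFedAvg`, the attribute `bits_sent`, the helper `count_bits`, the one-batch local step and the unweighted mean are all assumptions made for illustration; this is not FLamby's actual FedAvg.

```python
import copy

import torch
from torch.utils.data import DataLoader, TensorDataset


def count_bits(tensors):
    """Hypothetical helper: raw size in bits of a list of tensors."""
    return sum(t.numel() * t.element_size() * 8 for t in tensors)


class ToyFedAvg:
    """Illustrative strategy with the attributes described above."""

    def __init__(self, training_dataloaders, model, loss, learning_rate, nrounds):
        self.training_dataloaders = training_dataloaders
        self.models_list = [copy.deepcopy(model) for _ in range(len(training_dataloaders))]
        self.loss = loss
        self.lr = learning_rate
        self.nrounds = nrounds
        self.bits_sent = 0  # assumption: running total of client-to-server traffic

    def perform_round(self):
        # 1) Local step: each client takes one SGD step on one batch.
        clients_outputs = []
        for model, dataloader in zip(self.models_list, self.training_dataloaders):
            optimizer = torch.optim.SGD(model.parameters(), lr=self.lr)
            X, y = next(iter(dataloader))
            optimizer.zero_grad()
            self.loss(model(X), y).backward()
            optimizer.step()
            # The client's output is explicit: a list of parameter tensors.
            clients_outputs.append([p.detach().clone() for p in model.parameters()])

        # 2) The single place where exchanged bits can be monitored.
        self.bits_sent += sum(count_bits(out) for out in clients_outputs)

        # 3) Aggregation: unweighted mean of the clients' parameters,
        #    written back into every client model.
        averaged = [torch.stack(ps).mean(dim=0) for ps in zip(*clients_outputs)]
        with torch.no_grad():
            for model in self.models_list:
                for p, new_p in zip(model.parameters(), averaged):
                    p.copy_(new_p)

    def run(self):
        for _ in range(self.nrounds):
            self.perform_round()
        return self.models_list[0]


# Toy usage on random data: 2 clients, a 4-parameter linear model, 3 rounds.
torch.manual_seed(0)
dataloaders = [
    DataLoader(TensorDataset(torch.randn(8, 3), torch.randn(8, 1)), batch_size=4)
    for _ in range(2)
]
strategy = ToyFedAvg(dataloaders, torch.nn.Linear(3, 1), torch.nn.MSELoss(), 0.1, nrounds=3)
final_model = strategy.run()
print(strategy.bits_sent)
```

Because `clients_outputs` is built in one place, swapping the raw parameters for compressed updates or model deltas would only require changing what is appended to that list.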
This class should be tested on one of the datasets (either ISIC or Camelyon16, as LIDC is somewhat computationally intensive), for example with something like:
```python
from torch.utils.data import DataLoader as dl

from flamby.datasets.fed_isic2019 import (
    BATCH_SIZE, LR, NUM_CLIENTS, Baseline, BaselineLoss, FedIsic2019,
    get_nb_max_rounds, metric,
)
from flamby.utils import evaluate_model_on_tests

# StrategyName is the strategy class under test; import it from its module in flamby/strategies.

# One dataloader per client (center), for training and for testing.
training_dls = [
    dl(FedIsic2019(train=True, center=i), batch_size=BATCH_SIZE, shuffle=True, num_workers=10)
    for i in range(NUM_CLIENTS)
]
test_dls = [
    dl(FedIsic2019(train=False, center=i), batch_size=BATCH_SIZE, shuffle=False, num_workers=10)
    for i in range(NUM_CLIENTS)
]
loss = BaselineLoss()
m = Baseline()
NUM_UPDATES = 50
nrounds = get_nb_max_rounds(NUM_UPDATES)
s = StrategyName(training_dls, m, loss, LR, nrounds)
m = s.run()
print(evaluate_model_on_tests(m, test_dls, metric))
```

omarfoq commented 2 years ago

As some parts of the code are common to all strategies (e.g., training/evaluation loops, computing the average of models), maybe we should factor those elements out. I suggest that we have a file utils.py in FLamby/flamby/strategies with all the training utilities. What do you think?
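To make the suggestion concrete, here is a rough sketch of the kind of helpers such a utils.py might hold: a local training loop and a model-averaging function. The names `local_train` and `average_models` and their signatures are hypothetical and are not taken from the FLamby code base.

```python
import copy

import torch
from torch.utils.data import DataLoader, TensorDataset


def local_train(model, dataloader, loss_fn, lr, num_updates):
    """Hypothetical shared training loop: run `num_updates` SGD steps in place."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    batches = iter(dataloader)
    for _ in range(num_updates):
        try:
            X, y = next(batches)
        except StopIteration:
            # Restart the dataloader once it is exhausted.
            batches = iter(dataloader)
            X, y = next(batches)
        optimizer.zero_grad()
        loss_fn(model(X), y).backward()
        optimizer.step()


def average_models(models, weights=None):
    """Hypothetical shared aggregation: weighted average of the parameters.

    Only parameters are averaged; buffers (e.g. BatchNorm statistics) are
    copied from the first model.
    """
    if weights is None:
        weights = [1.0] * len(models)
    total = float(sum(weights))
    averaged = copy.deepcopy(models[0])
    with torch.no_grad():
        for params in zip(averaged.parameters(), *(m.parameters() for m in models)):
            target, sources = params[0], params[1:]
            target.copy_(sum(w / total * p for w, p in zip(weights, sources)))
    return averaged


# Toy usage on random data: train two clients, then average them 1:3.
torch.manual_seed(0)
data = TensorDataset(torch.randn(8, 3), torch.randn(8, 1))
models = [torch.nn.Linear(3, 1) for _ in range(2)]
for m in models:
    local_train(m, DataLoader(data, batch_size=4), torch.nn.MSELoss(), lr=0.1, num_updates=3)
avg = average_models(models, weights=[1, 3])
```

With helpers like these, each strategy's `perform_round` would mostly differ in what it does between the local training and the aggregation.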

jeandut commented 2 years ago

Thanks @omarfoq! You're not wrong in theory. In practice, since many people are working on this asynchronously, it might be hard to coordinate perfectly. Let's see what people come up with for the first strategies, and then we'll see how much of the code we can factor out. Note that there is already a utility for evaluation.

jeandut commented 2 years ago

FedAvg is now implemented in the repository. There is a utils.py file in the strategies folder where we define two abstractions that may or may not be useful for your strategies, as @omarfoq suggested.

owkin / FLamby

[Newcomers] Guide on adding a new strategy #44
