LabeliaLabs / distributed-learning-contributivity

Simulate collaborative ML scenarios, experiment with multi-partner learning approaches, and measure the respective contributions of different datasets to model performance.
https://www.labelia.org
Apache License 2.0

Explore federated learning for Cifar dataset #216

Open jeromechambost opened 4 years ago

jeromechambost commented 4 years ago

We are used to getting good performance results on the MNIST dataset (often reaching >80% accuracy) independently of the scenario configuration, which allows for a good comparison of the contributivity methods implemented.

For the CIFAR dataset, results are more uncertain and variable, and we need to find a few sets of configurations that give acceptable accuracy so that we can compare contributivity methods.

Example of a config leading to poor performance (early stopping after 5 epochs, max accuracy 34%):

```yaml
dataset_name:
```

Another possibility would be to change the early stopping conditions, for example by increasing PATIENCE?
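
A minimal sketch of what a per-dataset PATIENCE could look like, using the standard Keras `EarlyStopping` callback (the dictionary name and values below are made up for illustration, not the project's actual constants):

```python
from tensorflow.keras.callbacks import EarlyStopping

# Hypothetical per-dataset patience values, for illustration only;
# the real PATIENCE constant is whatever the project already defines.
PATIENCE_BY_DATASET = {"mnist": 3, "cifar10": 10}

def build_early_stopping(dataset_name):
    """Early-stopping callback whose patience depends on the dataset."""
    return EarlyStopping(
        monitor="val_loss",
        patience=PATIENCE_BY_DATASET.get(dataset_name, 3),
        restore_best_weights=True,
    )
```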

aygalic commented 4 years ago

I like the idea of changing the value of patience according to the dataset.

I believe we might as well go a step further and opt for customizable callbacks, as I proposed in #193.
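
Roughly what I have in mind (purely illustrative, not the project's actual API nor the exact #193 proposal): the caller supplies a list of Keras callbacks and training simply forwards them to `model.fit()`.

```python
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

def fit_with_callbacks(model, x_train, y_train, x_val, y_val,
                       epochs=40, callbacks=None):
    """Train with whatever callbacks the caller configured for this dataset."""
    if callbacks is None:
        # Fallback roughly equivalent to a hard-coded early stopping.
        callbacks = [EarlyStopping(monitor="val_loss", patience=5)]
    return model.fit(
        x_train, y_train,
        validation_data=(x_val, y_val),
        epochs=epochs,
        callbacks=callbacks,
    )

# Example: CIFAR-specific callbacks combining longer patience with an LR schedule.
cifar_callbacks = [
    EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True),
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3),
]
```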

RomainGoussault commented 4 years ago

Results for seqavg from an old notebook: https://github.com/SubstraFoundation/distributed-learning-contributivity/blob/master/saved_experiments/mnist_cifar10_distributed_learning/mnist_cifar10_distributed_learning.ipynb

*(screenshot of the seqavg results from the notebook)*

The random split used to work very well, but performance with the stratified split was very poor.
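
For readers of the thread, a toy illustration of the difference (assumed semantics, not the project's actual split code): a random split mixes all classes into every partner's subset, while a stratified split groups samples by label, so each partner only sees a few classes, which makes the collaborative CIFAR runs much harder.

```python
import numpy as np

def split_among_partners(x, y, partners_count, mode="random", seed=0):
    """Toy split of (x, y) among partners; y is assumed to hold integer labels."""
    rng = np.random.default_rng(seed)
    if mode == "random":
        order = rng.permutation(len(x))       # classes mixed for every partner
    else:  # "stratified"
        order = np.argsort(y, kind="stable")  # samples grouped by class
    return [(x[idx], y[idx]) for idx in np.array_split(order, partners_count)]
```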

arthurPignet commented 3 years ago

PR #283 set a new optimizer, which showed better results. However, I didn't test it with the Sequential approach, nor with stratified data.
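
For context, the kind of change this amounts to is swapping the optimizer used when compiling the CIFAR model (illustrative sketch only; Adam and the learning rate below are guesses, not necessarily what the PR actually introduced):

```python
from tensorflow.keras.optimizers import Adam

def compile_cifar_model(model):
    # Hypothetical optimizer choice for illustration, not the PR's exact change.
    model.compile(
        optimizer=Adam(learning_rate=1e-3),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```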