LabeliaLabs / distributed-learning-contributivity

Simulate collaborative ML scenarios, experiment with multi-partner learning approaches, and measure the respective contributions of different datasets to model performance.
https://www.labelia.org
Apache License 2.0

Explore federated learning for Cifar dataset #216

Open jeromechambost opened 4 years ago

jeromechambost commented 4 years ago

We are used to getting good performance results on the MNIST dataset (often reaching >80% accuracy) independently of the scenario configuration, which allows for a good comparison of the contributivity methods implemented.

For the CIFAR dataset, results are more uncertain and variable, and we need to find a few sets of configurations that give acceptable accuracy so that we can compare contributivity methods.

Example of a config leading to poor performance (early stopping after 5 epochs, max accuracy 34%):

```yaml
dataset_name:
```

Another possibility would be to change the early stopping conditions, for example by increasing PATIENCE?
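
A minimal sketch of what a per-dataset PATIENCE could look like, using the standard Keras `EarlyStopping` callback (the dictionary name and values below are made up for illustration, not the project's actual constants):

```python
from tensorflow.keras.callbacks import EarlyStopping

# Hypothetical per-dataset patience values, for illustration only;
# the real PATIENCE constant is whatever the project already defines.
PATIENCE_BY_DATASET = {"mnist": 3, "cifar10": 10}

def build_early_stopping(dataset_name):
    """Early-stopping callback whose patience depends on the dataset."""
    return EarlyStopping(
        monitor="val_loss",
        patience=PATIENCE_BY_DATASET.get(dataset_name, 3),
        restore_best_weights=True,
    )
```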

aygalic commented 4 years ago

I like the idea of changing the value of patience according to the dataset.

I believe we might as well go a step further and opt for customizable callbacks, as I proposed in #193.
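
Roughly what I have in mind (purely illustrative, not the project's actual API nor the exact #193 proposal): the caller supplies a list of Keras callbacks and training simply forwards them to `model.fit()`.

```python
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

def fit_with_callbacks(model, x_train, y_train, x_val, y_val,
                       epochs=40, callbacks=None):
    """Train with whatever callbacks the caller configured for this dataset."""
    if callbacks is None:
        # Fallback roughly equivalent to a hard-coded early stopping.
        callbacks = [EarlyStopping(monitor="val_loss", patience=5)]
    return model.fit(
        x_train, y_train,
        validation_data=(x_val, y_val),
        epochs=epochs,
        callbacks=callbacks,
    )

# Example: CIFAR-specific callbacks combining longer patience with an LR schedule.
cifar_callbacks = [
    EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True),
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3),
]
```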

RomainGoussault commented 4 years ago

Results for seqavg from an old notebook: https://github.com/SubstraFoundation/distributed-learning-contributivity/blob/master/saved_experiments/mnist_cifar10_distributed_learning/mnist_cifar10_distributed_learning.ipynb

*(screenshot of the seqavg results from the notebook)*

The random split used to work very well, but performance with the stratified split was very poor.
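
For readers of the thread, a toy illustration of the difference (assumed semantics, not the project's actual split code): a random split mixes all classes into every partner's subset, while a stratified split groups samples by label, so each partner only sees a few classes, which makes the collaborative CIFAR runs much harder.

```python
import numpy as np

def split_among_partners(x, y, partners_count, mode="random", seed=0):
    """Toy split of (x, y) among partners; y is assumed to hold integer labels."""
    rng = np.random.default_rng(seed)
    if mode == "random":
        order = rng.permutation(len(x))       # classes mixed for every partner
    else:  # "stratified"
        order = np.argsort(y, kind="stable")  # samples grouped by class
    return [(x[idx], y[idx]) for idx in np.array_split(order, partners_count)]
```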

arthurPignet commented 3 years ago

PR #283 set a new optimizer, which showed better results. However, I didn't test it with the Sequential approach, nor with stratified data.
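
For context, the kind of change this amounts to is swapping the optimizer used when compiling the CIFAR model (illustrative sketch only; Adam and the learning rate below are guesses, not necessarily what the PR actually introduced):

```python
from tensorflow.keras.optimizers import Adam

def compile_cifar_model(model):
    # Hypothetical optimizer choice for illustration, not the PR's exact change.
    model.compile(
        optimizer=Adam(learning_rate=1e-3),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```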