Closed Elhamnazari1372 closed 5 months ago
Format: (before local fine-tuning) -> (after local fine-tuning). So if finetune_epoch = 0, x.xx% -> 0.00% is normal.
☝ finetune_epoch is set to 0 in template.yml:
https://github.com/KarhouTam/FL-bench/blob/b19d9350dc73496e7b85372061fea4be91505e8d/config/template.yml#L24
This issue was closed due to a long time with no response.
I changed it as you recommended but got the same results. It seems the fine-tuning is still not running.
Sorry for my late response. What's your run command? If you set finetune_epoch, you need to specify the config file in the command, like python main.py fedavg your_config.yml.
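A minimal sketch of such a config (the file name my_config.yml and the value 5 are only placeholders; everything else can stay as in config/template.yml):

# my_config.yml (copied from config/template.yml; only this key changed)
common:
  finetune_epoch: 5  # must be > 0, otherwise the "after fine-tuning" numbers stay at 0.00%

Then run it with python main.py fedavg my_config.yml so the custom file is actually picked up.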
I use the same command as you mentioned. My config is:
mode: parallel # [serial, parallel]
parallel: # It's fine to keep these configs.
  # Go check doc of `https://docs.ray.io/en/latest/ray-core/api/doc/ray.init.html` for more details.
  ray_cluster_addr: null # [null, auto, local]
  # `null` implies that all cpus/gpus are included.
  num_cpus: null
  num_gpus: null
  # should be set larger than 1, or the training mode falls back to `serial`
  # Setting a larger `num_workers` can further boost efficiency, but also leaves each worker with fewer computational resources.
  num_workers: 2
common:
  dataset: mnist
  seed: 42
  model: lenet5
  join_ratio: 0.1
  global_epoch: 100
  local_epoch: 5
  finetune_epoch: 20
  batch_size: 32
  test_interval: 100
  straggler_ratio: 0
  straggler_min_local_epoch: 0
  external_model_params_file: ""
  optimizer:
    name: sgd # [sgd, adam, adamw, rmsprop, adagrad]
    lr: 0.01
    dampening: 0 # SGD
    weight_decay: 0
    momentum: 0 # [SGD, RMSprop]
    alpha: 0.99 # RMSprop
    nesterov: false # SGD
    betas: [0.9, 0.999] # [Adam, AdamW]
    amsgrad: false # [Adam, AdamW]
  lr_scheduler:
    name: step # null for deactivating
    step_size: 10
  eval_test: true
  eval_val: false
  eval_train: false
  verbose_gap: 10
  visible: false
  use_cuda: true
  save_log: true
  save_model: false
  save_fig: true
  save_metrics: true
  check_convergence: true
fedprox:
  mu: 0.01
pfedsim:
  warmup_round: 0.7
# NOTE: For unmentioned arguments, the default values are set in `get_<method>_args()` in `src/server/<method>.py`
I tested it on my workspace and everything is fine.
Here are the results, config, and commands to reproduce it:
==================== FedAvg Experiment Results: ====================
Format: (before local fine-tuning) -> (after local fine-tuning) So if finetune_epoch = 0, x.xx% -> 0.00% is normal.
{100: {'all_clients': {'test': {'loss': '0.3364 -> 0.3116', 'accuracy': '91.44% -> 92.18%'}}}}
========== FedAvg Convergence on train clients ==========
test (before local training):
10.0%(11.65%) at epoch: 0
20.0%(27.31%) at epoch: 3
30.0%(35.33%) at epoch: 4
40.0%(47.46%) at epoch: 5
60.0%(63.21%) at epoch: 7
70.0%(75.43%) at epoch: 9
80.0%(86.50%) at epoch: 18
90.0%(90.34%) at epoch: 37
test (after local training):
80.0%(82.13%) at epoch: 0
90.0%(91.06%) at epoch: 1
==================== FedAvg Max Accuracy ====================
all_clients:
(test) before fine-tuning: 91.44% at epoch 100
(test) after fine-tuning: 92.18% at epoch 100
# cfg.yml
mode: parallel # [serial, parallel]
parallel: # It's fine to keep these configs.
# Go check doc of `https://docs.ray.io/en/latest/ray-core/api/doc/ray.init.html` for more details.
ray_cluster_addr: null # [null, auto, local]
# `null` implies that all cpus/gpus are included.
num_cpus: null
num_gpus: null
# should be set larger than 1, or the training mode falls back to `serial`
# Setting a larger `num_workers` can further boost efficiency, but also leaves each worker with fewer computational resources.
num_workers: 2
common:
dataset: mnist
seed: 42
model: lenet5
join_ratio: 0.1
global_epoch: 100
local_epoch: 5
finetune_epoch: 5
batch_size: 32
test_interval: 100
straggler_ratio: 0
straggler_min_local_epoch: 0
external_model_params_file: ""
buffers: local # [local, global, drop]
optimizer:
name: sgd # [sgd, adam, adamw, rmsprop, adagrad]
lr: 0.01
dampening: 0 # SGD
weight_decay: 0
momentum: 0 # [SGD, RMSprop]
alpha: 0.99 # RMSprop
nesterov: false # SGD
betas: [0.9, 0.999] # [Adam, AdamW]
amsgrad: false # [Adam, AdamW]
lr_scheduler:
name: step # null for deactivating
step_size: 10
eval_test: true
eval_val: false
eval_train: false
verbose_gap: 10
visible: false
use_cuda: true
save_log: true
save_model: false
save_fig: true
save_metrics: true
check_convergence: true
# You can set specific arguments for FL methods also
# FL-bench uses FL method arguments by args.<method>.<arg>
# e.g.
fedprox:
mu: 0.01
pfedsim:
warmup_round: 0.7
# ...
# NOTE: For those unmentioned arguments, the default values are set in `get_<method>_args()` in `src/server/<method>.py`
python generate_data.py -d mnist -a 0.1 -cn 100
python main.py fedavg cfg.yml
Thanks for your response. Could I ask what config I should use for resnet18 and cifar10 to get the best accuracy?
There are tons of variables that can affect the final accuracy. Sorry, I can't tell you the optimal config.
Is there a config that you used and got reasonable results with? Thanks.
Just try it yourself.
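As a rough, untuned starting point (not an optimal config), you could reuse cfg.yml above and swap only the dataset and model; the model identifier below is an assumption, so check FL-bench's supported model names:

# untuned starting-point sketch, adapted from cfg.yml above; only these keys change
common:
  dataset: cifar10
  model: res18  # assumed identifier for ResNet-18 in FL-bench; verify against the supported model list
  # keep the remaining keys (lr, local_epoch, global_epoch, ...) as in cfg.yml and tune from there

The data partition is generated the same way as for mnist, e.g. python generate_data.py -d cifar10 -a 0.1 -cn 100 (assuming generate_data.py accepts cifar10 like it does mnist).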
I'm running the default run (python main.py fedavg config/template.yml) and getting the following report:
client [79] (test) loss: 0.3858 -> 0.3872 accuracy: 88.50% -> 88.00%
client [28] (test) loss: 0.1150 -> 0.1162 accuracy: 97.62% -> 97.62%
client [99] (test) loss: 0.2672 -> 0.2528 accuracy: 94.20% -> 94.72%
Training... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:09:36
FedAvg's average time taken by each global epoch: 0 min 5.73 sec.
FedAvg's total running time: 0 h 9 m 36 s.
==================== FedAvg Experiment Results: ====================
Format: (before local fine-tuning) -> (after local fine-tuning) So if finetune_epoch = 0, x.xx% -> 0.00% is normal.
{100: {'all_clients': {'test': {'loss': '0.3384 -> 0.0000', 'accuracy': '91.31% -> 0.00%'}}}}
========== FedAvg Convergence on train clients ==========
test (before local training):
10.0%(13.14%) at epoch: 1
20.0%(24.33%) at epoch: 3
60.0%(63.00%) at epoch: 7
70.0%(74.64%) at epoch: 9
80.0%(82.61%) at epoch: 16
90.0%(91.24%) at epoch: 40
test (after local training):
80.0%(81.93%) at epoch: 0
90.0%(90.20%) at epoch: 1
==================== FedAvg Max Accuracy ====================
all_clients:
(test) before fine-tuning: 91.31% at epoch 100
(test) after fine-tuning: 0.00% at epoch 100
Why is the accuracy after fine-tuning showing 0%?
Thanks for your help.