TsingZ0 / PFLlib

37 traditional FL (tFL) or personalized FL (pFL) algorithms, 3 scenarios, and 20 datasets.
GNU General Public License v2.0

How to accelerate the training? #127

Closed: FibonacciZ closed this issue 1 year ago

FibonacciZ commented 1 year ago

I used the following command for single-GPU training on a 3090 Ti: "nohup python -u main.py -t 3 -lr 0.1 -lbs 10 -ls 1 -nc 20 -nb 200 -data Tiny-imagenet -m resnet -algo FedCP -did 0 > ../result/imagenet_FedCP_res.out 2>&1 &".

It took 10 hours to complete only 160 rounds, and only 4895 MB of the single GPU's memory was in use. Is it possible to train the algorithm on multiple GPUs, or how can I fully utilize a single GPU for training? Would multithreading help here?

Could you share any strategies for accelerating training that you have used?
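
For reference, when simulating FL clients in PyTorch, one common way to spread clients across several GPUs (independent of anything PFLlib itself provides) is round-robin device assignment. A minimal sketch, assuming each client object exposes a `model` attribute and a `train()` method (illustrative names, not PFLlib's API):

```python
import torch

def assign_devices(clients):
    """Assign client i to GPU (i % num_gpus) so local training spreads across cards.

    Illustrative sketch only: `clients`, `.model`, and `.device` follow the
    conventions of a typical FL simulation loop, not PFLlib's actual classes.
    """
    num_gpus = torch.cuda.device_count()
    for i, client in enumerate(clients):
        device = torch.device(f"cuda:{i % num_gpus}" if num_gpus > 0 else "cpu")
        client.device = device
        client.model.to(device)
```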

TsingZ0 commented 1 year ago

Taking 10 hours for 160 rounds of ResNet-18 with 20 clients on Tiny-ImageNet is normal speed. You can consider using the built-in multithreading support, but be aware of the potential management overhead.
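
For reference, thread-based parallelism of the kind mentioned here can be sketched with the standard `concurrent.futures` module; this is a generic illustration, not PFLlib's built-in mechanism, and the management cost shows up as GPU memory contention and GIL contention when clients are compute-bound:

```python
from concurrent.futures import ThreadPoolExecutor

def train_clients_parallel(selected_clients, max_workers=4):
    """Run each selected client's local training in its own thread.

    Assumes each client exposes a `train()` method, as in typical FL
    simulation code; illustrative sketch, not PFLlib's actual implementation.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(client.train) for client in selected_clients]
        for f in futures:
            f.result()  # re-raise any exception from a worker thread
```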

FibonacciZ commented 1 year ago

> Taking 10 hours for 160 rounds of ResNet-18 with 20 clients on Tiny-ImageNet is normal speed. You can consider using the built-in multithreading support, but be aware of the potential management overhead.

In your paper you train for 2000 rounds; at this rate, doesn't that mean it would take about 7 days to complete?
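
For reference, a back-of-the-envelope projection from the figures quoted in this thread (160 rounds in 10 hours, 2000 global rounds per run):

```python
# Purely illustrative arithmetic based on the numbers quoted above.
hours_per_round = 10 / 160                  # observed rate: ~0.0625 h per round
total_hours = 2000 * hours_per_round
print(f"{total_hours:.0f} hours ~= {total_hours / 24:.1f} days per run")
# -> 125 hours, about 5.2 days per run; evaluation overhead or the -t 3
#    repeats in the command above push the total toward (or past) a week.
```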

TsingZ0 commented 1 year ago

Indeed, running experiments is time-consuming.