Training Time
Hi, I was wondering which setting the 2-hour training time reported in the paper refers to:
Is this the time to fine-tune lora-shepherd-7b for a single client only? Or what is the setting in this case?
With the provided implementation, running 5 clients per round (0.05 participation out of 100 clients in total, on Databricks-dolly-15k) for 20 communication rounds on a single NVIDIA A100 GPU takes around 13.5 hours on my side.
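For reference, here is a minimal sketch of the configuration I am running and the back-of-the-envelope per-round timing it implies; the parameter names are illustrative, not the repository's actual CLI flags:

```python
# Sketch of the run behind the ~13.5 h figure above.
# Names are illustrative placeholders, not the repo's real arguments.
config = {
    "model": "lora-shepherd-7b",
    "dataset": "Databricks-dolly-15k",
    "num_clients": 100,
    "client_selection_frac": 0.05,   # -> 5 clients sampled per round
    "num_communication_rounds": 20,
    "hardware": "1x NVIDIA A100",
}

clients_per_round = int(config["num_clients"] * config["client_selection_frac"])  # 5
total_hours = 13.5
hours_per_round = total_hours / config["num_communication_rounds"]                # ~0.68 h
hours_per_client_update = hours_per_round / clients_per_round                     # ~0.14 h

print(f"{clients_per_round} clients/round, "
      f"{hours_per_round * 60:.0f} min/round, "
      f"{hours_per_client_update * 60:.0f} min per client update")
```

So with this setup I see roughly 40 minutes per communication round, i.e. about 8 minutes per client update, which is why I am unsure how the 2-hour figure in the paper was obtained.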