rui-ye / OpenFedLLM

Apache License 2.0
334 stars 51 forks source link

OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning

OpenFedLLM is an open-source research-use codebase for training Large Language Models (LLM) via federated learning. Please check our paper for details and the corresponding empirical study.

OpenFedLLM includes the following key features:

intro

News🔥

Setup

Clone the repo, submodules and install the required packages.

git clone --recursive --shallow-submodules https://github.com/rui-ye/OpenFedLLM.git
cd OpenFedLLM
conda create -n fedllm python=3.10
conda activate fedllm
pip install -r requirements.txt
source setup.sh

Training

We provide training scripts under training_scripts/. Try them out from the top-level directory of this repository.

Federated Instruction Tuning

The training script is in training_scripts/run_sft.sh.

CUDA_VISIBLE_DEVICES=1 python main_sft.py \
 --model_name_or_path "meta-llama/Llama-2-7b-hf" \
 --dataset_name "vicgalle/alpaca-gpt4" \
 --dataset_sample 20000 \
 --fed_alg "fedavg" \
 --num_clients 20 \
 --sample_clients 2 \
 --max_steps 10 \
 --num_rounds 200 \
 --batch_size 16 \
 --gradient_accumulation_steps 1 \
 --seq_length 512 \
 --peft_lora_r 32 \
 --peft_lora_alpha 64 \
 --use_peft \
 --load_in_8bit \
 --output_dir "./output" \
 --template "alpaca" \

Key arguments:

Federated Value Alignment

The training script is in training_scripts/run_dpo.sh.

python main_dpo.py --template "vicuna_v1.1"

Note that the main difference between the usage of main_sft.py and main_dpo.py lies in the template argument. We plan to make them consistent in the future.

Evaluation

Evaluation codes are put in evaluation/ directory. Most of our evaluations follow existing high-incluence open-source repos. Please refer to each sub-directory for the corresponding detailed README and running script.

For example, evaluation/open_ended/ include open-ended evaluations on three benchmarks, covering MT-Bench, Vicuna Bench, and AdvBench; see README.md.

Citation

Please cite our paper if you find the repository helpful.

@article{ye2024openfedllm,
  title={OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning},
  author={Ye, Rui and Wang, Wenhao and Chai, Jingyi and Li, Dihan and Li, Zexi and Xu, Yinda and Du, Yaxin and Wang, Yanfeng and Chen, Siheng},
  journal={arXiv preprint arXiv:2402.06954},
  year={2024}
}