AnswerDotAI / fsdp_qlora

Training LLMs with QLoRA + FSDP

DoRA #51

Closed: KeremTurgutlu closed this issue 6 months ago

KeremTurgutlu commented 6 months ago

The DoRA implementation can be tested against the existing QLoRA train types with the following four runs, which use identical configs apart from `--train_type` and `--output_dir`: custom BnB QLoRA, the HF BnB QLoRA baseline, BnB QDoRA, and HQQ QDoRA. A sketch of what the DoRA train types compute follows the commands.

```bash
python train.py \
--model_name meta-llama/Llama-2-7b-hf \
--dataset orca_math \
--dataset_samples 1000 \
--batch_size 8 \
--context_length 1024 \
--gradient_accumulation_steps 2 \
--train_type custom_qlora \
--sharding_strategy full_shard \
--use_gradient_checkpointing true \
--reentrant_checkpointing true \
--use_cpu_offload false \
--use_activation_cpu_offload false \
--log_to wandb \
--verbose true \
--project_name "fsdp-dora-tests" \
--save_model true \
--output_dir "/mnt/vol_b/models/llama-7b-orca-math-1k-bnb-qlora"
```

```bash
python train.py \
--model_name meta-llama/Llama-2-7b-hf \
--dataset orca_math \
--dataset_samples 1000 \
--batch_size 8 \
--context_length 1024 \
--gradient_accumulation_steps 2 \
--train_type qlora \
--sharding_strategy full_shard \
--use_gradient_checkpointing true \
--reentrant_checkpointing true \
--use_cpu_offload false \
--use_activation_cpu_offload false \
--log_to wandb \
--verbose true \
--project_name "fsdp-dora-tests" \
--save_model true \
--output_dir "/mnt/vol_b/models/llama-7b-orca-math-1k-bnb-hf-qlora"
```

```bash
python train.py \
--model_name meta-llama/Llama-2-7b-hf \
--dataset orca_math \
--dataset_samples 1000 \
--batch_size 8 \
--context_length 1024 \
--gradient_accumulation_steps 2 \
--train_type bnb_dora \
--sharding_strategy full_shard \
--use_gradient_checkpointing true \
--reentrant_checkpointing true \
--use_cpu_offload false \
--use_activation_cpu_offload false \
--log_to wandb \
--verbose true \
--project_name "fsdp-dora-tests" \
--save_model true \
--output_dir "/mnt/vol_b/models/llama-7b-orca-math-1k-bnb-qdora"
```

```bash
python train.py \
--model_name meta-llama/Llama-2-7b-hf \
--dataset orca_math \
--dataset_samples 1000 \
--batch_size 8 \
--context_length 1024 \
--gradient_accumulation_steps 2 \
--train_type hqq_dora \
--sharding_strategy full_shard \
--use_gradient_checkpointing true \
--reentrant_checkpointing true \
--use_cpu_offload false \
--use_activation_cpu_offload false \
--log_to wandb \
--verbose true \
--project_name "fsdp-dora-tests" \
--save_model true \
--output_dir "/mnt/vol_b/models/llama-7b-orca-math-1k-hqq-qdora"
```
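
For anyone following along, the `bnb_dora` and `hqq_dora` train types implement DoRA, which reparameterizes each adapted weight as a learnable per-output-row magnitude times the unit-normalized direction of the frozen base weight plus the LoRA update. A minimal PyTorch sketch of that idea (the `DoRALinear` name, rank/alpha defaults, and init choices below are illustrative assumptions, not this repo's actual implementation):

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class DoRALinear(nn.Module):
    """Toy DoRA layer: direction = frozen base weight + LoRA update,
    renormalized per output row and rescaled by a learned magnitude."""

    def __init__(self, base: nn.Linear, rank: int = 64, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # frozen (4-bit quantized in the repo)
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        out_f, in_f = base.weight.shape
        self.lora_A = nn.Parameter(torch.empty(rank, in_f))
        nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5))
        self.lora_B = nn.Parameter(torch.zeros(out_f, rank))  # zero init: W' == W0 at step 0
        self.scaling = alpha / rank
        # learnable magnitude vector, initialized to the base weight's row norms
        self.magnitude = nn.Parameter(base.weight.norm(p=2, dim=1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # direction: base weight plus the low-rank update
        w = self.base.weight + self.scaling * (self.lora_B @ self.lora_A)
        # normalize each output row to unit norm, then rescale by the magnitude
        w = w * (self.magnitude / w.norm(p=2, dim=1)).unsqueeze(1)
        return F.linear(x, w, self.base.bias)
```

The two QDoRA variants differ only in which 4-bit backend holds the frozen base weight (bitsandbytes for `bnb_dora`, HQQ for `hqq_dora`); the magnitude/direction math is the same.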

Results: https://wandb.ai/answerdotai/fsdp-dora-tests
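
One aside on `--reentrant_checkpointing`: PyTorch ships two activation-checkpointing implementations, and this flag presumably selects between them when gradient checkpointing is applied to the transformer blocks (the exact wiring inside train.py is an assumption here). A toy illustration of the underlying PyTorch API:

```python
import torch
from torch.utils.checkpoint import checkpoint

def block(x):
    # stand-in for a transformer layer whose activations we avoid storing
    return torch.relu(x @ x.T)

x = torch.randn(8, 8, requires_grad=True)
y_old = checkpoint(block, x, use_reentrant=True)   # reentrant (legacy) path
y_new = checkpoint(block, x, use_reentrant=False)  # non-reentrant path
y_old.sum().backward()
```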