eosphoros-ai / DB-GPT-Hub

A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
MIT License

Hyperparameter settings #113

Closed JBoRu closed 1 year ago

JBoRu commented 1 year ago

What are the exact hyperparameter settings for the version of codellama-13b that was trained directly on the Spider train set?

wangzaistone commented 1 year ago

CUDA_VISIBLE_DEVICES=0 python dbgpt_hub/train/sft_train.py \
    --model_name_or_path Your_download_CodeLlama-13b-Instruct-hf_path \
    --do_train \
    --dataset example_text2sql_train \
    --max_source_length 2048 \
    --max_target_length 512 \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --template llama2 \
    --lora_rank 64 \
    --lora_alpha 32 \
    --output_dir dbgpt_hub/output/adapter/code_llama-13b-2048_epoch8_lora \
    --overwrite_cache \
    --overwrite_output_dir \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --lr_scheduler_type cosine_with_restarts \
    --logging_steps 50 \
    --save_steps 2000 \
    --learning_rate 2e-4 \
    --num_train_epochs 8 \
    --plot_loss \
    --bf16

Hardware: a server with one A100 (40 GB) GPU
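
For readers adapting these settings to other hardware, here is a rough sketch of two quantities the flags above imply: the effective batch size per optimizer step and the LoRA scaling factor. The variable names are illustrative only and are not read by `sft_train.py`. The alpha / rank scaling assumes the standard LoRA formulation, which this thread does not confirm for this training script.

```shell
#!/usr/bin/env bash
# Values copied from the training command in this thread.
PER_DEVICE_BATCH=1   # --per_device_train_batch_size
GRAD_ACCUM=16        # --gradient_accumulation_steps
NUM_GPUS=1           # CUDA_VISIBLE_DEVICES=0 exposes a single GPU
LORA_RANK=64         # --lora_rank
LORA_ALPHA=32        # --lora_alpha

# Samples contributing to each optimizer update.
EFFECTIVE_BATCH=$((PER_DEVICE_BATCH * GRAD_ACCUM * NUM_GPUS))
echo "effective batch size: ${EFFECTIVE_BATCH}"

# Assumption: the adapter update is scaled by alpha / rank (standard LoRA).
# Shell arithmetic is integer-only, so awk does the division.
LORA_SCALE=$(awk -v a="$LORA_ALPHA" -v r="$LORA_RANK" 'BEGIN { printf "%.2f", a / r }')
echo "LoRA scaling (alpha / rank): ${LORA_SCALE}"
```

With the values above this prints an effective batch size of 16 and a scaling of 0.50. When moving to more GPUs, one option is to lower `--gradient_accumulation_steps` so the product stays at 16.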