jianzhnie / LLamaTuner

Easy and efficient finetuning of LLMs. (Supports LLama, LLama2, LLama3, Qwen, Baichuan, GLM, Falcon.) Efficient quantized training and deployment of large models.
https://jianzhnie.github.io/llmtech/
Apache License 2.0

ValueError: Undefined dataset tatsu-lab/alpaca #81

Closed: SeekPoint closed this issue 11 months ago

SeekPoint commented 11 months ago

```
(yk_py39) amd00@MZ32-00:~/llm_dev/Efficient-Tuning-LLMs$ python train_qlora.py \
    --model_name_or_path /home/amd00/hf_model/llama-7b \
    --output_dir ./out-llama-7b \
    --dataset_name tatsu-lab/alpaca
[2023-08-03 17:54:26,303] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
  File "/home/amd00/llm_dev/Efficient-Tuning-LLMs/train_qlora.py", line 102, in <module>
    main()
  File "/home/amd00/llm_dev/Efficient-Tuning-LLMs/train_qlora.py", line 31, in main
    data_args.init_for_training()
  File "/home/amd00/llm_dev/Efficient-Tuning-LLMs/chatllms/configs/data_args.py", line 95, in init_for_training
    raise ValueError('Undefined dataset {} in {}'.format(
ValueError: Undefined dataset tatsu-lab/alpaca in /home/amd00/llm_dev/Efficient-Tuning-LLMs/chatllms/configs/../../data/dataset_info.yaml
```
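The traceback points at `chatllms/configs/data_args.py`: `init_for_training()` apparently validates `--dataset_name` against the top-level keys of `data/dataset_info.yaml`, and `tatsu-lab/alpaca` is a Hugging Face hub id, not one of those keys (the key is `alpaca`, as the diff below shows). A minimal sketch of that lookup, with hypothetical names, not the repo's exact code:

```python
import yaml  # PyYAML


def resolve_dataset(dataset_name: str, info_path: str) -> dict:
    """Hypothetical reconstruction of the check in
    chatllms/configs/data_args.py::init_for_training."""
    with open(info_path, encoding='utf-8') as f:
        dataset_info = yaml.safe_load(f)
    if dataset_name not in dataset_info:
        # This branch fires for 'tatsu-lab/alpaca': the hub id is a
        # *value* (hf_hub_url) in the YAML, not a top-level key.
        raise ValueError('Undefined dataset {} in {}'.format(
            dataset_name, info_path))
    return dataset_info[dataset_name]


# resolve_dataset('alpaca', 'data/dataset_info.yaml')            # ok
# resolve_dataset('tatsu-lab/alpaca', 'data/dataset_info.yaml')  # ValueError
```

So the fix is to pass the registered key (`--dataset_name alpaca`), not the hub path.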

SeekPoint commented 11 months ago

```
(yk_py39) amd00@MZ32-00:~/llm_dev/Efficient-Tuning-LLMs$ git diff
diff --git a/data/dataset_info.yaml b/data/dataset_info.yaml
index 47dc433..b5ff767 100644
--- a/data/dataset_info.yaml
+++ b/data/dataset_info.yaml
@@ -1,7 +1,7 @@
 The dataset_info.yaml file contains the information of the datasets used in the experiments.

 alpaca:
   hf_hub_url: tatsu-lab/alpaca
```

```
(yk_py39) amd00@MZ32-00:~/llm_dev/Efficient-Tuning-LLMs$ python train_qlora.py \
    --model_name_or_path /home/amd00/hf_model/llama-7b \
    --output_dir ./out-llama-7b \
    --dataset_name alpaca \
    --num_train_epochs 4 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy steps \
    --eval_steps 50 \
    --save_strategy steps \
    --save_total_limit 5 \
    --save_steps 100 \
    --logging_strategy steps \
    --logging_steps 1 \
    --learning_rate 0.0002 \
    --warmup_ratio 0.03 \
    --weight_decay 0.0 \
    --lr_scheduler_type constant \
    --adam_beta2 0.999 \
    --max_grad_norm 0.3 \
    --max_new_tokens 32 \
    --lora_r 64 \
    --lora_alpha 16 \
    --lora_dropout 0.1 \
    --double_quant \
    --quant_type nf4 \
    --fp16 \
    --bits 4 \
    --gradient_checkpointing \
    --trust_remote_code \
    --do_train \
    --do_eval \
    --sample_generate \
    --data_seed 42 \
    --seed 0
[2023-08-03 18:09:46,641] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
  File "/home/amd00/llm_dev/Efficient-Tuning-LLMs/train_qlora.py", line 102, in <module>
    main()
  File "/home/amd00/llm_dev/Efficient-Tuning-LLMs/train_qlora.py", line 31, in main
    data_args.init_for_training()
  File "/home/amd00/llm_dev/Efficient-Tuning-LLMs/chatllms/configs/data_args.py", line 114, in init_for_training
    raise Warning(
Warning: You have set local_path for alpaca but it does not exist! Will load the data from tatsu-lab/alpaca
```
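This second failure is different: the `alpaca` entry evidently also carries a `local_path` that does not exist on this machine, and the traceback shows `init_for_training()` raising a `Warning` exception rather than emitting one, so the advertised fallback to `tatsu-lab/alpaca` never runs. A hedged sketch of what that branch presumably intends (assumed logic, not the repo's code):

```python
import os
import warnings


def pick_data_source(name: str, entry: dict) -> str:
    """Assumed fallback logic for one dataset_info.yaml entry:
    prefer local_path when it exists on disk, else hf_hub_url."""
    local_path = entry.get('local_path')
    if local_path and not os.path.exists(local_path):
        # The traceback above shows `raise Warning(...)` at this point,
        # which aborts the run; warnings.warn() logs and continues, so
        # the promised fallback to the hub would actually happen.
        warnings.warn(
            'You have set local_path for {} but it does not exist! '
            'Will load the data from {}'.format(name, entry['hf_hub_url']))
        return entry['hf_hub_url']
    return local_path or entry['hf_hub_url']
```

In the meantime, pointing `local_path` at a file that exists, or removing it from the entry so only `hf_hub_url` remains, should avoid the crash.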

jianzhnie commented 11 months ago

> Warning: You have set local_path for alpaca but it does not exist!

jianzhnie commented 11 months ago

Please see the README.md for how to use the datasets.
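For anyone hitting the same wall: before launching a long run, it can help to print the names that `--dataset_name` will actually accept. A small hypothetical check, assuming PyYAML and the repo layout from the tracebacks above:

```python
import yaml

# Print every dataset key registered in data/dataset_info.yaml together
# with the source it resolves to (local_path if set, else hf_hub_url).
with open('data/dataset_info.yaml', encoding='utf-8') as f:
    info = yaml.safe_load(f)

for name, entry in info.items():
    print(name, '->', entry.get('local_path') or entry.get('hf_hub_url'))
```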