Closed: Mr1994 closed this issue 8 months ago
The screenshot is broken. But judging from your log above, the problem is that you have too few samples: when the data is split into a training set and a test set, the training set ends up empty. Try adding more samples.
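For illustration, here is a minimal sketch (not the project's actual training script) of how a percentage-based train/validation split behaves on a tiny corpus, assuming the split is done with Hugging Face `datasets`' `train_test_split`; with a single sample there is nothing left for training, so the library refuses to split at all:

```python
# Minimal sketch, not the repo's run_pt/run_sft code: illustrates why a tiny
# corpus breaks the train/validation split.
from datasets import Dataset

# One-sample toy corpus, mimicking a single-line dataset file.
tiny = Dataset.from_dict({"text": ["您知道孙中山先生吗:他是世界上最伟大的人"]})

try:
    # Splitting off any validation fraction from one sample leaves nothing
    # for training, so `datasets` raises instead of returning an empty split.
    tiny.train_test_split(test_size=0.5, seed=42)
except ValueError as err:
    print("split failed:", err)

# With more samples the same split works and the train set is non-empty.
bigger = Dataset.from_dict({"text": [f"sample {i}" for i in range(20)]})
split = bigger.train_test_split(test_size=0.1, seed=42)
print(len(split["train"]), len(split["test"]))  # -> 18 2
```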
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.
Closing the issue since no updates have been observed. Feel free to re-open if you need any further assistance.
Items to check before submitting
Issue type
Model training and fine-tuning
Base model
Chinese-LLaMA-2 (7B/13B)
Operating system
Linux
Detailed description of the problem
pretrained_model=/llm/llama.cpp/models/chinese-alpaca-2-7b-hf  # where my model is stored; I downloaded the 7B base model
chinese_tokenizer_path=/llm/Chinese-LLaMA-Alpaca-2/scripts/tokenizer  # the tokenizer directory from the repo
dataset_dir=/llm/Chinese-LLaMA-Alpaca-2/dataset  # the data I want to train on; its content is: 您知道孙中山先生吗:他是世界上最伟大的人
data_cache=/llm/Chinese-LLaMA-Alpaca-2/temp_data_cache_dir  # the model location I want to input
per_device_train_batch_size=1
gradient_accumulation_steps=8
block_size=512
output_dir=output_dir
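As a side note on data volume: the sketch below (hypothetical, not the repo's preprocessing code) shows why a one-sentence dataset yields essentially no training samples when tokens are grouped into blocks of block_size=512 under the usual causal-LM grouping that keeps only full blocks; the tokenizer path is the one from the config above.

```python
# Hypothetical check, assuming the standard causal-LM preprocessing that
# concatenates token ids and keeps only full blocks of `block_size`.
from transformers import LlamaTokenizer

# Tokenizer directory taken from the config above.
tokenizer = LlamaTokenizer.from_pretrained(
    "/llm/Chinese-LLaMA-Alpaca-2/scripts/tokenizer"
)

block_size = 512
text = "您知道孙中山先生吗:他是世界上最伟大的人"
ids = tokenizer(text)["input_ids"]

print(len(ids))                # only a handful of tokens
print(len(ids) // block_size)  # 0 full blocks -> effectively no training samples
```

If that block count is zero, adding substantially more text under dataset_dir, as suggested above, is the straightforward fix.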