Closed 631068264 closed 1 year ago
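The pretrain dataset is mostly plain text; only the content field is used. The SFT dataset uses the standard alpaca data format:
{
"instruction": "组合以下单词以形成合理的句子。" -- the instruction ("Combine the following words into a sensible sentence."),
"input": "世界/最/高/山峰/是/什么?" -- the input ("world / most / high / mountain / is / what?"),
"output": "世界最高的山峰是珠穆朗玛峰。" -- the output ("The highest mountain in the world is Mount Everest.")
}
After fine-tuning locally, how do I use the new model?
I got an error during fine-tuning: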
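As a rough illustration of the two record shapes described above, here is a minimal Python sketch. The helper names `is_valid_sft_record` and `is_valid_pretrain_record`, the sample pretrain text, and the choice to serialize one JSON object per record are assumptions for illustration; the thread does not say how the files are laid out on disk.

```python
import json

# Keys taken from the alpaca-style example above.
SFT_KEYS = ("instruction", "input", "output")


def is_valid_sft_record(record: dict) -> bool:
    """Return True if the record has the three alpaca keys, all holding strings."""
    return all(isinstance(record.get(key), str) for key in SFT_KEYS)


def is_valid_pretrain_record(record: dict) -> bool:
    """Return True if the record carries a non-empty string in "content"."""
    content = record.get("content")
    return isinstance(content, str) and content != ""


# The SFT sample from the thread.
sft_record = {
    "instruction": "组合以下单词以形成合理的句子。",
    "input": "世界/最/高/山峰/是/什么?",
    "output": "世界最高的山峰是珠穆朗玛峰。",
}

# Hypothetical pretrain sample: plain text stored under "content".
pretrain_record = {"content": "世界最高的山峰是珠穆朗玛峰。"}

# ensure_ascii=False keeps the Chinese text readable in the serialized output.
sft_line = json.dumps(sft_record, ensure_ascii=False)
pretrain_line = json.dumps(pretrain_record, ensure_ascii=False)
```

A check like this can be run over a dataset before training to catch records with missing or non-string fields.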
{'loss': 0.5684, 'learning_rate': 1e-05, 'epoch': 0.13}
3%|▎ | 10/395 [06:42<4:11:50, 39.25s/
Traceback (most recent call last):
File "/data/home/yaokj5/dl/apps/TigerBot/train/./train_sft.py", line 127, in <module>
main()
File "/data/home/yaokj5/dl/apps/TigerBot/train/./train_sft.py", line 121, in main
trainer.train()
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/transformers/trainer.py", line 1662, in train
return inner_training_loop(
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/transformers/trainer.py", line 2006, in _inner_training_loop
self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/transformers/trainer.py", line 2287, in _maybe_log_save_evaluate
metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/transformers/trainer.py", line 2993, in evaluate
output = eval_loop(
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/transformers/trainer.py", line 3281, in evaluation_loop
metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
File "/data/home/yaokj5/dl/apps/TigerBot/train/./train_sft.py", line 43, in compute_metrics
metric = evaluate.load("accuracy")
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/evaluate/loading.py", line 731, in load
evaluation_module = evaluation_module_factory(
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/evaluate/loading.py", line 680, in evaluation_module_factory
raise e1 from None
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/evaluate/loading.py", line 639, in evaluation_module_factory
).get_module()
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/evaluate/loading.py", line 479, in get_module
local_path = self.download_loading_script(revision)
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/evaluate/loading.py", line 469, in download_loading_script
return cached_path(file_path, download_config=download_config)
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/evaluate/utils/file_utils.py", line 224, in cached_path
output_path = get_from_cache(
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/evaluate/utils/file_utils.py", line 614, in get_from_cache
http_get(
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/evaluate/utils/file_utils.py", line 395, in http_get
response = _request_with_retry(
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/evaluate/utils/file_utils.py", line 360, in _request_with_retry
raise err
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/evaluate/utils/file_utils.py", line 356, in _request_with_retry
response = requests.request(method=method.upper(), url=url, timeout=timeout, **params)
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/requests/adapters.py", line 501, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
3%|▎ | 10/395 [08:06<5:12:02, 48.63s/it]
Traceback (most recent call last):
File "/data/home/yaokj5/dl/apps/TigerBot/train/./train_sft.py", line 127, in <module>
main()
File "/data/home/yaokj5/dl/apps/TigerBot/train/./train_sft.py", line 121, in main
trainer.train()
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/transformers/trainer.py", line 1662, in train
return inner_training_loop(
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/transformers/trainer.py", line 2006, in _inner_training_loop
self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/transformers/trainer.py", line 2287, in _maybe_log_save_evaluate
metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/transformers/trainer.py", line 2993, in evaluate
output = eval_loop(
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/transformers/trainer.py", line 3281, in evaluation_loop
metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
File "/data/home/yaokj5/dl/apps/TigerBot/train/./train_sft.py", line 43, in compute_metrics
metric = evaluate.load("accuracy")
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/evaluate/loading.py", line 731, in load
evaluation_module = evaluation_module_factory(
File "/data/home/yaokj5/anaconda3/envs/tigerbot/lib/python3.10/site-packages/evaluate/loading.py", line 681, in evaluation_module_factory
raise FileNotFoundError(
FileNotFoundError: Couldn't find a module script at /data/home/yaokj5/dl/apps/TigerBot/train/accuracy/accuracy.py. Module 'accuracy' doesn't exist on the Hugging Face Hub either.
[2023-06-09 17:23:38,808] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 2911
[2023-06-09 17:23:38,809] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 2912
Try reinstalling evaluate: pip install evaluate
Then run the following in a script and see whether it still raises an error: import evaluate; metric = evaluate.load("accuracy")
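The first traceback ends in a ConnectionError raised while evaluate.load("accuracy") was fetching the metric script over the network. If the training machine cannot reach the Hugging Face Hub, one possible workaround is to compute accuracy locally. The sketch below is not the code in train_sft.py; the function name `token_accuracy` is hypothetical, and it assumes predictions and labels are already flat, position-aligned token ids with -100 marking positions to skip.

```python
from typing import Iterable


def token_accuracy(
    predictions: Iterable[int],
    labels: Iterable[int],
    ignore_index: int = -100,
) -> float:
    """Fraction of positions where prediction == label, skipping ignored labels.

    Both inputs are flat sequences of token ids that are already aligned
    position by position (an assumption of this sketch).
    """
    correct = 0
    total = 0
    for pred, label in zip(predictions, labels):
        if label == ignore_index:
            continue  # masked position, e.g. padding or prompt tokens
        total += 1
        correct += int(pred == label)
    # Avoid division by zero when every label is masked.
    return correct / total if total else 0.0


# Small hand-made example: three scored positions, two of them correct.
preds = [5, 7, 9, 2]
labels = [5, 8, 9, -100]
accuracy = token_accuracy(preds, labels)
```

Inside a compute_metrics function, NumPy arrays could be flattened with `.reshape(-1).tolist()` before being passed in.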
After fine-tuning locally, how do I use the new model?
What about this question?
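Just load the model from the checkpoint files generated once training finishes.
What does a dataset look like, what format is it in, and what does each column mean?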
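A minimal sketch of what loading from a checkpoint might look like, assuming training went through the Hugging Face Trainer (as the traceback suggests), which saves intermediate checkpoints in `checkpoint-<step>` subdirectories of the output directory. The helper names `find_latest_checkpoint` and `load_finetuned` are hypothetical. It also assumes the checkpoint directory contains the config, tokenizer files, and weights in a layout that `from_pretrained` can read, which may not hold for every training setup.

```python
import os
import re
from typing import Optional

# Hugging Face Trainer saves intermediate checkpoints as "checkpoint-<step>".
_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def find_latest_checkpoint(output_dir: str) -> Optional[str]:
    """Return the checkpoint-<step> subdirectory with the highest step, or None."""
    best_step = -1
    best_path = None
    for name in os.listdir(output_dir):
        match = _CHECKPOINT_RE.match(name)
        path = os.path.join(output_dir, name)
        if match and os.path.isdir(path) and int(match.group(1)) > best_step:
            best_step = int(match.group(1))
            best_path = path
    return best_path


def load_finetuned(checkpoint_dir: str):
    """Load tokenizer and model from a checkpoint directory.

    Assumes the directory holds everything from_pretrained needs.
    """
    # Imported here so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint_dir)
    model = AutoModelForCausalLM.from_pretrained(checkpoint_dir)
    return tokenizer, model
```

Typical use would be `load_finetuned(find_latest_checkpoint(output_dir))`, where `output_dir` is whatever output directory was passed to the training script.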