SCIR-HI / Huatuo-Llama-Med-Chinese

Repo for BenTsao [original name: HuaTuo (华驼)], Instruction-tuning Large Language Models with Chinese Medical Knowledge.
Apache License 2.0
4.31k stars 422 forks

UnicodeDecodeError: 'gbk' codec can't decode byte 0xaf in position 93: illegal multibyte sequence #90

Open cookie925 opened 9 months ago

cookie925 commented 9 months ago

Traceback (most recent call last):
  File "E:\wbjdata\QA\Huatuo-Llama-Med-Chinese-main\Huatuo-Llama-Med-Chinese-main\1.py", line 124, in <module>
    fire.Fire(main)
  File "E:\anaconda3\envs\LLAMA10\lib\site-packages\fire\core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "E:\anaconda3\envs\LLAMA10\lib\site-packages\fire\core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "E:\anaconda3\envs\LLAMA10\lib\site-packages\fire\core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "E:\wbjdata\QA\Huatuo-Llama-Med-Chinese-main\Huatuo-Llama-Med-Chinese-main\1.py", line 36, in main
    prompter = Prompter(prompt_template)
  File "E:\wbjdata\QA\Huatuo-Llama-Med-Chinese-main\Huatuo-Llama-Med-Chinese-main\utils\prompter.py", line 22, in __init__
    self.template = json.load(fp)
  File "E:\anaconda3\envs\LLAMA10\lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
UnicodeDecodeError: 'gbk' codec can't decode byte 0xaf in position 93: illegal multibyte sequence

Does anyone know why this happens? Which file is causing the error?
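
For readers hitting the same traceback, here is a minimal sketch of how this kind of error can arise. It assumes the template file is saved as UTF-8 and that Python on this Windows machine defaults to gbk when open() is called without an encoding argument. The JSON text and the try_decode helper are hypothetical, made up for illustration; they are not taken from the repository.

```python
# Hypothetical template content; the real template file may differ.
text = '{"q": "问"}'
raw = text.encode("utf-8")  # bytes as they would sit on disk in a UTF-8 file


def try_decode(data: bytes, encoding: str) -> str:
    """Decode bytes with the given codec, reporting a failure instead of raising."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"failed: {exc}"


# Decoding UTF-8 bytes with the wrong codec is what json.load(fp) runs into
# when the file was opened with a gbk default.
print(try_decode(raw, "gbk"))
print(try_decode(raw, "utf-8"))
```

The file itself is not corrupt in this scenario; the bytes are simply being interpreted with a codec other than the one used to write them.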

s65b40 commented 9 months ago

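Hello, this is most likely an encoding problem when reading the template. Please pull the latest version of the repository and try again; utf-8 encoding was just added.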
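
A minimal sketch of what reading a JSON template with an explicit encoding can look like. The load_template function, the sample data, and the file name are hypothetical and only illustrate the idea; the actual code in utils/prompter.py may be structured differently.

```python
import json
import os
import tempfile


def load_template(file_name):
    """Read a JSON template as UTF-8, independent of the OS locale default."""
    with open(file_name, encoding="utf-8") as fp:
        return json.load(fp)


# Hypothetical template data, used only to exercise the function.
sample = {"description": "中文模板"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "template.json")
    # Write the file as UTF-8 so that reading and writing use the same codec.
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(sample, fp, ensure_ascii=False)
    loaded = load_template(path)

print(loaded == sample)
```

Passing encoding explicitly on both the write and the read side keeps the behavior the same on Windows, Linux, and macOS.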

cookie925 commented 9 months ago

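Thanks, I'll try it shortly. The day before yesterday I changed line 21 of prompter.py from with open(file_name) as fp: to with open(file_name, encoding='utf-8') as fp: and, unexpectedly, it ran. The only problem is that the answers contain some special symbols.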