Open UnstoppableCurry opened 2 years ago
Judging from the error, this looks like an encoding problem with the model config file.
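Thanks, that's so kind! Liked, favorited, and starred, no freeloading here. I trained on the 100 dataset on my own K80 for 12 hours, hahaha, it nearly broke me.
Nice, congrats.
One question: `model = GPT2LMHeadModel.from_pretrained(args.model_path)` works with both the config and the bin model, but the config.json file you just mentioned only seems to run when it is loaded the way the training script does it: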
model_config = GPT2Config.from_json_file(args.model_config)
model = GPT2LMHeadModel.from_pretrained(args.model_path,config=model_config)
model = model.to(device)
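Maybe add a comment showing this way of loading on master? There are a lot of pitfalls across transformers and torchvision versions, and as a beginner I haven't fully figured them out either, hehe. Hope this helps others. Respect for your open-source spirit, great work.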
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
2022-04-10 18:42:36,539 - INFO - using device:cuda
Traceback (most recent call last):
  File "G:/pycharmWorkspace/CV/打工人兼职项目/贾维斯/interact.py", line 186, in <module>
    main()
  File "G:/pycharmWorkspace/CV/打工人兼职项目/贾维斯/interact.py", line 125, in main
    model = GPT2LMHeadModel.from_pretrained('model/pytorch_model.bin')
  File "C:\Users\86183\Envs\torch\lib\site-packages\pytorch_transformers\modeling_utils.py", line 362, in from_pretrained
    config = cls.config_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "C:\Users\86183\Envs\torch\lib\site-packages\pytorch_transformers\modeling_utils.py", line 139, in from_pretrained
    config = cls.from_json_file(resolved_config_file)
  File "C:\Users\86183\Envs\torch\lib\site-packages\pytorch_transformers\modeling_utils.py", line 165, in from_json_file
    text = reader.read()
  File "c:\users\86183\appdata\local\programs\python\python36\lib\codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
The versions match the requirements, and deleting the tf cache did not help either. Both the provided model and my own trained model raise this error, and I don't know why.
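A quick way to narrow this down is to look at the first bytes of whatever file ends up being read as the config. The traceback shows `from_pretrained` being called with `'model/pytorch_model.bin'` and the JSON reader failing on byte 0x80 at position 0. One possibility (an assumption, not confirmed in this thread) is that the binary weights file itself is being parsed as the config, since pickle streams written with protocol 2 or later start with 0x80. The helper name `sniff_file` below is hypothetical, and the check is only a heuristic sketch:

```python
import json


def sniff_file(path):
    """Guess the kind of file at `path` from its leading bytes (heuristic only)."""
    with open(path, "rb") as f:
        head = f.read(4)

    if head[:1] == b"\x80":
        # Pickle protocol 2+ streams begin with the PROTO opcode, byte 0x80.
        return "pickle-like binary"
    if head[:2] == b"PK":
        # Zip archives begin with the two bytes "PK".
        return "zip archive"

    try:
        with open(path, "r", encoding="utf-8") as f:
            json.load(f)
        return "utf-8 json"
    except (UnicodeDecodeError, json.JSONDecodeError):
        return "unknown"
```

If `sniff_file("model/pytorch_model.bin")` reports a binary file while the config file reports `"utf-8 json"`, the config file's encoding is probably fine and the path passed to the loader is the thing to check.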
I ran into the same problem. Has it been solved?
@yiting888 First define an args parameter, model_config, then use:
model_config = GPT2Config.from_json_file(args.model_config)
model = GPT2LMHeadModel.from_pretrained(args.model_path,config=model_config)
model = model.to(device)
Set model_config when you launch the script. Finally, remember to save the config file with UTF-8 encoding.
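For the last step, here is a minimal sketch of checking that a config file is valid UTF-8 JSON and re-saving it as UTF-8 if it is not. The function name `ensure_utf8_json` and the fallback encodings (`utf-8-sig` for a BOM, `gbk` as a guess for files saved on a Chinese-locale Windows machine) are assumptions for illustration, not something the repository provides:

```python
import json


def ensure_utf8_json(path, fallback_encodings=("utf-8-sig", "gbk")):
    """Parse the JSON config at `path`; rewrite it as UTF-8 if a fallback encoding was needed."""
    with open(path, "rb") as f:
        raw = f.read()

    # Happy path: the file is already plain UTF-8 JSON, so leave it untouched.
    try:
        return json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        pass

    # Otherwise try the assumed fallback encodings one by one.
    for enc in fallback_encodings:
        try:
            data = json.loads(raw.decode(enc))
        except (UnicodeDecodeError, json.JSONDecodeError):
            continue
        # Re-save as UTF-8 (no BOM) so a UTF-8 reader can load it later.
        with open(path, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False, indent=2)
        return data

    raise ValueError(f"{path} is not JSON in any of the tried encodings")
```

Calling `ensure_utf8_json(args.model_config)` before `GPT2Config.from_json_file(args.model_config)` would be one way to use it. If it raises `ValueError`, the file is likely not a JSON config at all.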