THUDM / GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs | 开源多语言多模态对话模型
Apache License 2.0

pickle tokenizer error #287

Closed shanren521 closed 6 hours ago

shanren521 commented 5 days ago

System Info / 系統信息

docker

Who can help? / 谁可以帮助到您?

No response

Information / 问题信息

Reproduction / 复现过程

model = AutoModel.from_pretrained(
    path,
    low_cpu_mem_usage=True,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
print("load model success")
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
tokenizer = pickle.dumps(tokenizer, protocol=pickle.HIGHEST_PROTOCOL)
load_tokenizer = pickle.loads(tokenizer)

Traceback (most recent call last):
  File "/home/hadoop-aipnlp/project/multimodal_files/mid_journeytest.py", line 82, in <module>
    tokenizer = pickle.dumps(tokenizer, protocol=pickle.HIGHEST_PROTOCOL)
TypeError: cannot pickle 'builtins.CoreBPE' object

How can I fix it?
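For reference, the error occurs because the tokenizer holds a `CoreBPE` object (Rust-backed state from tiktoken) that the pickle protocol cannot serialize. One common workaround is to implement `__getstate__`/`__setstate__` so that only the information needed to rebuild the tokenizer (for example, the pretrained path) is pickled, and the native object is reconstructed on load. The sketch below is illustrative only: `PicklableTokenizer` is a hypothetical wrapper, and a `threading.Lock` stands in for the unpicklable `CoreBPE` member so the example runs without the model.

```python
import pickle
import threading


class PicklableTokenizer:
    """Hypothetical wrapper: drops unpicklable state on dump, rebuilds on load."""

    def __init__(self, path):
        self.path = path
        # Stand-in for builtins.CoreBPE, which cannot be pickled either.
        self._core = threading.Lock()

    def __getstate__(self):
        # Keep only the rebuild information; drop the unpicklable member.
        state = self.__dict__.copy()
        del state["_core"]
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # Rebuild the native state here, e.g. by calling
        # AutoTokenizer.from_pretrained(self.path) in the real case.
        self._core = threading.Lock()


tok = PicklableTokenizer("glm-4-9b")
restored = pickle.loads(pickle.dumps(tok, protocol=pickle.HIGHEST_PROTOCOL))
print(restored.path)  # → glm-4-9b
```

Reloading from the path on `__setstate__` trades pickle size for a reload cost, but it avoids serializing native state entirely, which is usually the safer choice when passing tokenizers between processes.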

Expected behavior / 期待表现

I expect this to be fixed.

shanren521 commented 5 days ago

I am using glm-4-9b.

zRzRzRzRzRzRzR commented 5 days ago

I don't think it can be used this way. What would be the point of doing this?