THUDM / GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs | 开源多语言多模态对话模型
Apache License 2.0

pickle tokenizer error #287

Closed shanren521 closed 6 hours ago

shanren521 commented 5 days ago

System Info / 系統信息

docker

Who can help? / 谁可以帮助到您?

No response

Information / 问题信息

Reproduction / 复现过程

model = AutoModel.from_pretrained(
    path,
    low_cpu_mem_usage=True,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
print("load model success")
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
tokenizer = pickle.dumps(tokenizer, protocol=pickle.HIGHEST_PROTOCOL)
load_tokenizer = pickle.loads(tokenizer)

Traceback (most recent call last):
  File "/home/hadoop-aipnlp/project/multimodal_files/mid_journeytest.py", line 82, in <module>
    tokenizer = pickle.dumps(tokenizer, protocol=pickle.HIGHEST_PROTOCOL)
TypeError: cannot pickle 'builtins.CoreBPE' object

How can I fix it?
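For reference, the error occurs because the tokenizer holds a `CoreBPE` object (Rust-backed state from tiktoken) that the pickle protocol cannot serialize. One common workaround is to implement `__getstate__`/`__setstate__` so that only the information needed to rebuild the tokenizer (for example, the pretrained path) is pickled, and the native object is reconstructed on load. The sketch below is illustrative only: `PicklableTokenizer` is a hypothetical wrapper, and a `threading.Lock` stands in for the unpicklable `CoreBPE` member so the example runs without the model.

```python
import pickle
import threading


class PicklableTokenizer:
    """Hypothetical wrapper: drops unpicklable state on dump, rebuilds on load."""

    def __init__(self, path):
        self.path = path
        # Stand-in for builtins.CoreBPE, which cannot be pickled either.
        self._core = threading.Lock()

    def __getstate__(self):
        # Keep only the rebuild information; drop the unpicklable member.
        state = self.__dict__.copy()
        del state["_core"]
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # Rebuild the native state here, e.g. by calling
        # AutoTokenizer.from_pretrained(self.path) in the real case.
        self._core = threading.Lock()


tok = PicklableTokenizer("glm-4-9b")
restored = pickle.loads(pickle.dumps(tok, protocol=pickle.HIGHEST_PROTOCOL))
print(restored.path)  # → glm-4-9b
```

Reloading from the path on `__setstate__` trades pickle size for a reload cost, but it avoids serializing native state entirely, which is usually the safer choice when passing tokenizers between processes.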

Expected behavior / 期待表现

I expect this to be fixed.

shanren521 commented 5 days ago

I am using glm-4-9b.

zRzRzRzRzRzRzR commented 5 days ago

I don't think it can be used this way. What would be the point of doing this?