Open · licx102359 opened 3 months ago
In `\chatglm_tokenizer\tokenization_chatglm.py`, move the line `self.tokenizer = SPTokenizer(vocab_file)` in the `__init__` method of the `ChatGLMTokenizer` class so that it sits above the `super().__init__(...)` call.
This is a transformers version issue: in newer transformers versions, `super().__init__(...)` calls `self._add_tokens(...)`, which in turn calls `get_vocab()` and `vocab_size`, both of which read `self.tokenizer`. If the `SPTokenizer` is only created after the `super().__init__(...)` call, the attribute does not exist yet at the moment it is first accessed.
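A minimal sketch of the reordered `__init__`, assuming only the signature visible in the traceback below (the real file takes more arguments, e.g. special-token flags, and the `padding_side` default is assumed here; `SPTokenizer` is defined in the same module):

```python
from transformers import PreTrainedTokenizer

class ChatGLMTokenizer(PreTrainedTokenizer):
    def __init__(self, vocab_file, padding_side="left",
                 clean_up_tokenization_spaces=False, **kwargs):
        self.vocab_file = vocab_file
        # Moved up: build the SentencePiece wrapper *before* super().__init__,
        # because newer transformers versions call self._add_tokens() inside
        # super().__init__(), which reads self.get_vocab() / self.vocab_size,
        # and those need self.tokenizer to already exist.
        self.tokenizer = SPTokenizer(vocab_file)
        super().__init__(
            padding_side=padding_side,
            clean_up_tokenization_spaces=clean_up_tokenization_spaces,
            **kwargs,
        )
```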
File "/qiuwkai27/cx/baby-llama2-chinese/sft.py", line 274, in
tokenizer=ChatGLMTokenizer(vocab_file='./chatglm_tokenizer/tokenizer.model')
File "/qiuwkai27/cx/baby-llama2-chinese/chatglm_tokenizer/tokenization_chatglm.py", line 68, in init
super().init(padding_side=padding_side, clean_up_tokenization_spaces=clean_up_tokenization_spaces, **kwargs)
File "/root/miniconda3/envs/cxx/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 436, in init
self._add_tokens(
File "/root/miniconda3/envs/cxx/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 544, in _add_tokens
current_vocab = self.get_vocab().copy()
File "/qiuwkai27/cx/baby-llama2-chinese/chatglm_tokenizer/tokenization_chatglm.py", line 110, in get_vocab
vocab = {self._convert_id_to_token(i): i for i in range(self.vocab_size)}
File "/qiuwkai27/cx/baby-llama2-chinese/chatglm_tokenizer/tokenization_chatglm.py", line 106, in vocab_size
return self.tokenizer.n_words
AttributeError: 'ChatGLMTokenizer' object has no attribute 'tokenizer'. Did you mean: 'tokenize'?
As far as I can see, the attribute is defined in the file, so why is this error still raised?