taishan1994 / sentencepiece_chinese_bpe

Train a Chinese vocabulary with BPE in sentencepiece and use it in transformers.

python chinese_bpe.py reports the following error: #3

Open hanyong-max opened 3 months ago

hanyong-max commented 3 months ago

Traceback (most recent call last):
  File "D:\sentencepiece_chinese_bpe-main\chinese_bpe.py", line 23, in <module>
    tokenizer = ChineseTokenizer(vocab_file=output_dir + 'chinese.model')
  File "D:\sentencepiece_chinese_bpe-main\tokenization.py", line 81, in __init__
    super().__init__(
  File "D:\anaconda3\lib\site-packages\transformers\tokenization_utils.py", line 367, in __init__
    self._add_tokens(
  File "D:\anaconda3\lib\site-packages\transformers\tokenization_utils.py", line 467, in _add_tokens
    current_vocab = self.get_vocab().copy()
  File "D:\sentencepiece_chinese_bpe-main\tokenization.py", line 115, in get_vocab
    vocab = {self.convert_ids_to_tokens(i): i for i in range(self.vocab_size)}
  File "D:\sentencepiece_chinese_bpe-main\tokenization.py", line 111, in vocab_size
    return self.sp_model.get_piece_size()
AttributeError: 'ChineseTokenizer' object has no attribute 'sp_model'

How can this be solved?
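Judging from the traceback, the error comes from an initialization-order problem that shows up with newer transformers versions (roughly 4.34+): `PreTrainedTokenizer.__init__` now calls `self._add_tokens()`, which calls `self.get_vocab()`, and in this repo's `tokenization.py` that reads `self.sp_model` before the subclass has assigned it. A common fix is to load the SentencePiece model *before* calling `super().__init__()`. Below is a minimal sketch under that assumption; the class name and the `vocab_size` / `get_vocab` bodies mirror the traceback, while the other method bodies are illustrative rather than the repo's exact code.

```python
# Minimal sketch of the fix, assuming the usual SentencePiece tokenizer
# pattern: assign self.sp_model BEFORE super().__init__(), because newer
# transformers versions call get_vocab() from inside the parent constructor.
import sentencepiece as spm
from transformers import PreTrainedTokenizer


class ChineseTokenizer(PreTrainedTokenizer):
    def __init__(self, vocab_file, **kwargs):
        # Load the SentencePiece model first, so vocab_size and
        # get_vocab() already work when the parent constructor runs.
        self.vocab_file = vocab_file
        self.sp_model = spm.SentencePieceProcessor()
        self.sp_model.Load(vocab_file)
        # Only now hand control to the parent constructor, which may
        # call self._add_tokens() -> self.get_vocab() internally.
        super().__init__(**kwargs)

    @property
    def vocab_size(self):
        return self.sp_model.get_piece_size()

    def get_vocab(self):
        return {self.convert_ids_to_tokens(i): i for i in range(self.vocab_size)}

    def _tokenize(self, text):
        # out_type=str returns subword pieces instead of ids.
        return self.sp_model.encode(text, out_type=str)

    def _convert_token_to_id(self, token):
        return self.sp_model.piece_to_id(token)

    def _convert_id_to_token(self, index):
        return self.sp_model.id_to_piece(index)
```

With this ordering, the call from the traceback, `tokenizer = ChineseTokenizer(vocab_file=output_dir + 'chinese.model')`, should construct without the `AttributeError`. Alternatively, pinning transformers to an older version (one that does not call `_add_tokens` inside `__init__`) avoids the problem without touching the code.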