FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs

PyPreTokenizerTypeWrapper error after merging models with LM_Cocktail #964

Open Zhouziyi828 opened 3 months ago

Zhouziyi828 commented 3 months ago

Hello. I had already finished fine-tuning my model, and merging it with LM_Cocktail worked without any problem. But this week I suddenly found that loading the merged model fails, whether I call it through FlagEmbedding or through Huggingface:

```
  File "/opt/conda/lib/python3.8/site-packages/FlagEmbedding/flag_reranker.py", line 158, in __init__
    self.tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, cache_dir=cache_dir)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 814, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2029, in from_pretrained
    return cls._from_pretrained(
  File "/opt/conda/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2261, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/xlm_roberta/tokenization_xlm_roberta_fast.py", line 155, in __init__
    super().__init__(
  File "/opt/conda/lib/python3.8/site-packages/transformers/tokenization_utils_fast.py", line 111, in __init__
    fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
Exception: data did not match any variant of untagged enum PyPreTokenizerTypeWrapper at line 78 column 3
```

My environment and model files have not changed at all. Loading the model from before the merge works fine; only the merged model raises this error.
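For reference, a minimal sketch of the two call paths described above, assuming the merged model was saved to a local directory; `./merged_reranker` is a hypothetical placeholder, not the reporter's actual path:

```python
# Minimal reproduction sketch. "./merged_reranker" is a placeholder for the
# LM_Cocktail output directory; the pre-merge checkpoint loads without error.
from transformers import AutoTokenizer
from FlagEmbedding import FlagReranker

merged_dir = "./merged_reranker"

# Both entry points fail at the same place: FlagReranker.__init__ calls
# AutoTokenizer.from_pretrained, which parses the merged tokenizer.json and raises
#   Exception: data did not match any variant of untagged enum
#   PyPreTokenizerTypeWrapper at line 78 column 3
reranker = FlagReranker(merged_dir)                     # via FlagEmbedding
tokenizer = AutoTokenizer.from_pretrained(merged_dir)   # via Huggingface directly
```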

Zhouziyi828 commented 3 months ago

Resolved after pulling the latest code again.

Zhouziyi828 commented 2 months ago

It happened again. Do I really have to re-pull the code every time?