shibing624 / text2vec

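text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.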
https://pypi.org/project/text2vec/
Apache License 2.0
4.39k stars 392 forks

Hello, I ran the file training_sup_text_matching_model_jsonl_data. Why does it throw an error? #123

Closed aslick closed 1 year ago

aslick commented 1 year ago

Command: python training_sup_text_matching_model_jsonl_data.py --model_arch cosent --do_train --do_predict --num_epochs 4 --model_name nghuyong/ernie-3.0-base-zh --output_dir ./outputs/STS-cosent

Error message: pyo3_runtime.PanicException: no entry found for key

Details:

thread '' panicked at 'no entry found for key', C:\Users\builder\AppData\Local\Temp\pip-req-build-hhcj09m8\tokenizers-lib\src\models\mod.rs:36:66
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
Epoch: 0%| | 0/4 [03:41<?, ?it/s]
Traceback (most recent call last):
  File "C:\Users\hayden\downloads\text2vec-master_1\text2vec-master\examples\training_sup_text_matching_model_jsonl_data.py", line 132, in
    main()
  File "C:\Users\hayden\downloads\text2vec-master_1\text2vec-master\examples\training_sup_text_matching_model_jsonl_data.py", line 82, in main
    model.train_model(
  File "C:\Users\hayden\downloads\text2vec-master_1\text2vec-master\examples..\text2vec\cosent_model.py", line 111, in train_model
    global_step, training_details = self.train(
  File "C:\Users\hayden\downloads\text2vec-master_1\text2vec-master\examples..\text2vec\cosent_model.py", line 303, in train
    self.save_model(output_dir, model=self.bert, results=results)
  File "C:\Users\hayden\downloads\text2vec-master_1\text2vec-master\examples..\text2vec\sentence_model.py", line 277, in save_model
    self.tokenizer.save_pretrained(output_dir)
  File "C:\Users\hayden\anaconda3\lib\site-packages\transformers\tokenization_utils_base.py", line 2246, in save_pretrained
    save_files = self._save_pretrained(
  File "C:\Users\hayden\anaconda3\lib\site-packages\transformers\tokenization_utils_fast.py", line 622, in _save_pretrained
    self.backend_tokenizer.save(tokenizer_file)
pyo3_runtime.PanicException: no entry found for key

shibing624 commented 1 year ago

Don't use the fast tokenizer for training.

aslick commented 1 year ago

Hello, how do I avoid using the fast tokenizer? I'm a beginner and don't really understand this.

shibing624 commented 1 year ago

Search for it on Baidu.
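
For readers who land here with the same question: the traceback fails inside `tokenization_utils_fast.py`, so the tokenizer being saved is a fast (Rust-backed) one. A minimal sketch of one way to request the slow (pure Python) tokenizer instead, assuming the Hugging Face `transformers` library is installed. `AutoTokenizer.from_pretrained` and its `use_fast` argument are part of `transformers`; the helper name `load_slow_tokenizer` is hypothetical, and this thread does not show where text2vec builds its own tokenizer, so wiring this into the training script is left as an assumption.

```python
def load_slow_tokenizer(model_name_or_path: str):
    """Load a tokenizer with the slow (pure Python) implementation.

    Hypothetical helper for illustration; it is not part of text2vec.
    """
    # Imported lazily so the helper can be defined without transformers
    # being imported at module load time.
    from transformers import AutoTokenizer

    # use_fast=False asks transformers for the Python tokenizer class
    # rather than the Rust-backed "fast" one.
    return AutoTokenizer.from_pretrained(model_name_or_path, use_fast=False)


# Example usage (downloads the model files on first call):
# tokenizer = load_slow_tokenizer("nghuyong/ernie-3.0-base-zh")
# print(tokenizer.is_fast)
# tokenizer.save_pretrained("./outputs/STS-cosent")
```

Checking `tokenizer.is_fast` before training is a quick way to confirm which implementation was loaded, since the failing call in the traceback only runs for fast tokenizers.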