timoschick / pet

This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference"
https://arxiv.org/abs/2001.07676
Apache License 2.0

OSError: Model name 'clue/albert_chinese_tiny' was not found in tokenizers model name list #100

Closed: hjing100 closed this issue 1 year ago

hjing100 commented 1 year ago

Running:

```
python3 cli.py \
  --method pet \
  --pattern_ids 0 \
  --data_dir /home/123456/projects/prompt/pet-master/generate_data/ \
  --model_type albert \
  --model_name_or_path clue/albert_chinese_tiny \
  --task_name porn-task \
  --output_dir /home/123456/projects/prompt/pet-master/porn-output/ \
  --do_train \
  --do_eval \
  --pet_per_gpu_train_batch_size 2 \
  --pet_gradient_accumulation_steps 8 \
  --pet_max_steps 250 \
  --sc_per_gpu_unlabeled_batch_size 2 \
  --sc_gradient_accumulation_steps 8 \
  --sc_max_steps 100
```

fails with:

```
OSError: Model name 'clue/albert_chinese_tiny' was not found in tokenizers model name list (albert-base-v1, albert-large-v1, albert-xlarge-v1, albert-xxlarge-v1, albert-base-v2, albert-large-v2, albert-xlarge-v2, albert-xxlarge-v2). We assumed 'clue/albert_chinese_tiny' was a path, a model identifier, or url to a directory containing vocabulary files named ['spiece.model'] but couldn't find such vocabulary files at this path or url.
```

Where can I see the full list of models that can be used? Can PET be used to train and run models on Chinese corpora?
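Regarding the two questions: the checkpoints that `--model_name_or_path` accepts are the ones hosted on the Hugging Face model hub (https://huggingface.co/models, filterable by language, including Chinese), plus any local directory containing the model files. The error above comes from the ALBERT tokenizer looking for a sentencepiece file (`spiece.model`), while `clue/albert_chinese_tiny` ships a BERT-style `vocab.txt`. A minimal standalone check outside of PET, assuming a transformers version that can resolve hub checkpoints, might look like this:

```python
# Minimal sketch (not from the PET codebase): verify that the checkpoint's
# tokenizer and weights load at all before pointing cli.py at it.
# clue/albert_chinese_tiny publishes a BERT-style vocab.txt rather than the
# sentencepiece file (spiece.model) that AlbertTokenizer expects, so the
# tokenizer is loaded as a BERT tokenizer here.
from transformers import BertTokenizer, AlbertForMaskedLM

model_name = "clue/albert_chinese_tiny"

tokenizer = BertTokenizer.from_pretrained(model_name)
model = AlbertForMaskedLM.from_pretrained(model_name)

print(tokenizer.tokenize("今天天气不错"))  # rough check that Chinese text tokenizes
```

If this loads but PET still fails, the likely reason is that the `albert` model type in PET's wrapper pairs the checkpoint with `AlbertTokenizer`, which cannot read a `vocab.txt`-based vocabulary.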

hjing100 commented 1 year ago

Switching to the model xlm-roberta-base gives this error:

```
Traceback (most recent call last):
  File "cli.py", line 282, in <module>
    main()
  File "cli.py", line 263, in main
    no_distillation=args.no_distillation, seed=args.seed)
  File "/home/123456/projects/prompt/pet-master/pet/modeling.py", line 249, in train_pet
    save_unlabeled_logits=not no_distillation, seed=seed)
  File "/home/123456/projects/prompt/pet-master/pet/modeling.py", line 341, in train_pet_ensemble
    wrapper = init_model(model_config)
  File "/home/123456/projects/prompt/pet-master/pet/modeling.py", line 146, in init_model
    model = TransformerModelWrapper(config)
  File "/home/123456/projects/prompt/pet-master/pet/wrapper.py", line 151, in __init__
    cache_dir=config.cache_dir if config.cache_dir else None)  # type: PreTrainedTokenizer
  File "/home/123456/.conda/envs/python36/lib/python3.6/site-packages/transformers/tokenization_utils_base.py", line 1140, in from_pretrained
    return cls._from_pretrained(*inputs, **kwargs)
  File "/home/123456/.conda/envs/python36/lib/python3.6/site-packages/transformers/tokenization_utils_base.py", line 1287, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/123456/.conda/envs/python36/lib/python3.6/site-packages/transformers/tokenization_roberta.py", line 171, in __init__
    **kwargs,
  File "/home/123456/.conda/envs/python36/lib/python3.6/site-packages/transformers/tokenization_gpt2.py", line 167, in __init__
    with open(vocab_file, encoding="utf-8") as vocab_handle:
TypeError: expected str, bytes or os.PathLike object, not NoneType
```
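The traceback goes through tokenization_roberta.py and tokenization_gpt2.py, i.e. the plain RoBERTa tokenizer, which expects vocab.json/merges.txt files; xlm-roberta-base instead ships a sentencepiece model and is normally loaded with XLMRobertaTokenizer (in PET this would correspond to a different `--model_type` than `roberta`, if the installed version supports XLM-R). A minimal standalone check, again outside of PET:

```python
# Minimal sketch: load xlm-roberta-base with its own tokenizer class.
# The TypeError above (vocab_file is None) occurs when the RoBERTa/GPT-2
# tokenizer cannot resolve vocab.json/merges.txt for this checkpoint.
from transformers import XLMRobertaTokenizer, XLMRobertaForMaskedLM

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
model = XLMRobertaForMaskedLM.from_pretrained("xlm-roberta-base")

print(tokenizer.tokenize("这是一个测试句子"))  # XLM-R also covers Chinese
```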

hjing100 commented 1 year ago

Upgrading the Python package to transformers==4.3.0 solved the problem.
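For anyone hitting the same errors, a quick way to confirm that the upgraded package is the one PET actually imports (4.3.0 is simply the version reported as working here):

```python
# Check which transformers version is active in the current environment;
# the fix reported above was pinning it with: pip install transformers==4.3.0
import transformers
print(transformers.__version__)
```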