Run:
python3 cli.py \
--method pet \
--pattern_ids 0 \
--data_dir /home/123456/projects/prompt/pet-master/generate_data/ \
--model_type albert \
--model_name_or_path clue/albert_chinese_tiny \
--task_name porn-task \
--output_dir /home/123456/projects/prompt/pet-master/porn-output/ \
--do_train \
--do_eval \
--pet_per_gpu_train_batch_size 2 \
--pet_gradient_accumulation_steps 8 \
--pet_max_steps 250 \
--sc_per_gpu_unlabeled_batch_size 2 \
--sc_gradient_accumulation_steps 8 \
--sc_max_steps 100
Error:
OSError: Model name 'clue/albert_chinese_tiny' was not found in tokenizers model name list (albert-base-v1, albert-large-v1, albert-xlarge-v1, albert-xxlarge-v1, albert-base-v2, albert-large-v2, albert-xlarge-v2, albert-xxlarge-v2). We assumed 'clue/albert_chinese_tiny' was a path, a model identifier, or url to a directory containing vocabulary files named ['spiece.model'] but couldn't find such vocabulary files at this path or url.
Where can I find a list of all the models that can be used?
Can this be used to train and run models on Chinese corpora?
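The OSError itself hints at the likely cause: `AlbertTokenizer` only looks for a SentencePiece file named `spiece.model`, while Chinese ALBERT checkpoints such as `clue/albert_chinese_tiny` typically ship a BERT-style `vocab.txt` and are meant to be loaded with `BertTokenizer`. As a rough illustration (a stdlib-only sketch, not PET's actual loading code; the filename-to-tokenizer mapping below reflects common Hugging Face conventions), you can tell which tokenizer class a downloaded checkpoint directory needs by the vocabulary file it contains:

```python
# Sketch: infer the tokenizer class a local checkpoint directory needs from
# the vocabulary file it ships. Assumed conventions: AlbertTokenizer wants
# "spiece.model" (SentencePiece); BERT-style Chinese checkpoints ship a
# WordPiece "vocab.txt"; RoBERTa-style checkpoints ship a BPE "vocab.json".
from pathlib import Path

VOCAB_FILE_TO_TOKENIZER = {
    "spiece.model": "AlbertTokenizer",  # SentencePiece model
    "vocab.txt": "BertTokenizer",       # WordPiece vocabulary
    "vocab.json": "RobertaTokenizer",   # BPE vocab (also needs merges.txt)
}

def guess_tokenizer(model_dir: str) -> str:
    """Return the tokenizer class name matching the vocab file found."""
    for fname, tokenizer in VOCAB_FILE_TO_TOKENIZER.items():
        if (Path(model_dir) / fname).is_file():
            return tokenizer
    raise FileNotFoundError(
        f"No known vocabulary file found in {model_dir}; "
        "download the checkpoint files into this directory first."
    )
```

If the directory for `clue/albert_chinese_tiny` only contains `vocab.txt`, that explains why loading it through the ALBERT tokenizer path fails with the `spiece.model` message.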
Switched to the model xlm-roberta-base and got this error:
Traceback (most recent call last):
  File "cli.py", line 282, in <module>
    main()
  File "cli.py", line 263, in main
    no_distillation=args.no_distillation, seed=args.seed)
  File "/home/123456/projects/prompt/pet-master/pet/modeling.py", line 249, in train_pet
    save_unlabeled_logits=not no_distillation, seed=seed)
  File "/home/123456/projects/prompt/pet-master/pet/modeling.py", line 341, in train_pet_ensemble
    wrapper = init_model(model_config)
  File "/home/123456/projects/prompt/pet-master/pet/modeling.py", line 146, in init_model
    model = TransformerModelWrapper(config)
  File "/home/123456/projects/prompt/pet-master/pet/wrapper.py", line 151, in __init__
    cache_dir=config.cache_dir if config.cache_dir else None)  # type: PreTrainedTokenizer
  File "/home/123456/.conda/envs/python36/lib/python3.6/site-packages/transformers/tokenization_utils_base.py", line 1140, in from_pretrained
    return cls._from_pretrained(*inputs, **kwargs)
  File "/home/123456/.conda/envs/python36/lib/python3.6/site-packages/transformers/tokenization_utils_base.py", line 1287, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/123456/.conda/envs/python36/lib/python3.6/site-packages/transformers/tokenization_roberta.py", line 171, in __init__
    **kwargs,
  File "/home/123456/.conda/envs/python36/lib/python3.6/site-packages/transformers/tokenization_gpt2.py", line 167, in __init__
    with open(vocab_file, encoding="utf-8") as vocab_handle:
TypeError: expected str, bytes or os.PathLike object, not NoneType
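The traceback passes through `tokenization_roberta.py` and `tokenization_gpt2.py`, which suggests `RobertaTokenizer` was instantiated. That class expects BPE files (`vocab.json` + `merges.txt`), but `xlm-roberta-base` ships a SentencePiece model and needs the XLM-R tokenizer instead, so `vocab_file` ends up as `None` and `open(None)` raises the TypeError. A small guard (a hypothetical sketch, not part of PET; the prefix table below is an assumption based on common checkpoint naming) could fail fast with a readable message when `--model_type` disagrees with `--model_name_or_path`:

```python
# Sketch: fail fast with a clear message instead of the opaque NoneType
# TypeError. Hypothetical prefix table; longest prefixes come first so that
# "xlm-roberta-base" matches "xlm-roberta" before "roberta".
KNOWN_PREFIXES = {
    "xlm-roberta": "xlm-roberta",
    "roberta": "roberta",
    "albert": "albert",
    "bert": "bert",
}

def check_model_type(model_type: str, model_name_or_path: str) -> None:
    """Raise ValueError if model_type and the checkpoint name disagree."""
    for prefix, expected in KNOWN_PREFIXES.items():
        if model_name_or_path.startswith(prefix):
            if model_type != expected:
                raise ValueError(
                    f"--model_type {model_type!r} does not match checkpoint "
                    f"{model_name_or_path!r}; expected {expected!r}."
                )
            return  # matched and consistent
    # Unknown checkpoint name: nothing to check.
```

Under that assumption, running with `--model_type roberta --model_name_or_path xlm-roberta-base` would be rejected up front, while the matching `xlm-roberta` model type would pass.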