liuyukid / transformers-ner

Pytorch-Named-Entity-Recognition-with-transformers

sh run_crf_ner.sh run error #9

Open · hhdo opened this issue 3 years ago

hhdo commented 3 years ago

I don't know why this problem occurs; thank you for reading.

OSError: Model name '../pretrained_models/RoBERTa-zh-Large/' was not found in tokenizers model name list (roberta-base, roberta-large, roberta-large-mnli, distilroberta-base, roberta-base-openai-detector, roberta-large-openai-detector). We assumed '../pretrained_models/RoBERTa-zh-Large/' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.json', 'merges.txt'] but couldn't find such vocabulary files at this path or url.

I have downloaded all 3 files: config.json, pytorch_model.bin and vocab.txt.

Thank you in advance~

liuyukid commented 3 years ago

Maybe you're actually using a BERT model.
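
For context (not from the original thread): in Hugging Face transformers, RobertaTokenizer expects a byte-level BPE vocabulary (vocab.json plus merges.txt, exactly the files the OSError above says are missing), while Chinese "RoBERTa" checkpoints such as RoBERTa-zh-Large typically ship a BERT-style WordPiece vocab.txt and the BERT architecture. A minimal sketch of loading such a checkpoint with the Bert* classes, assuming the directory really contains config.json, pytorch_model.bin and vocab.txt:

# Minimal sketch, not from this repo: RoBERTa-zh checkpoints are usually
# BERT-format, so they load with the Bert* classes, not the Roberta* ones.
from transformers import BertTokenizer, BertModel

model_dir = '../pretrained_models/RoBERTa-zh-Large/'  # path from the script below

# BertTokenizer reads vocab.txt; RobertaTokenizer would look for vocab.json
# and merges.txt and raise the OSError reported in this issue.
tokenizer = BertTokenizer.from_pretrained(model_dir)
model = BertModel.from_pretrained(model_dir)

# a short Chinese test sentence ("the weather is nice today")
print(tokenizer.tokenize('今天天气不错'))

If this loads cleanly, the checkpoint is BERT-format and the tokenizer mismatch, not a missing download, is the cause of the error.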

hhdo commented 3 years ago

> Maybe you're actually using a BERT model.

My .sh file is as below.

#!/bin/bash

DATA_DIR='../datasets/mydataset/'
MODEL_TYPE='roberta'
MODEL_NAME_OR_PATH='../pretrained_models/RoBERTa-zh-Large/'
OUTPUT_DIR='../output/roberta/'
LABEL='../datasets/cluener/labels.txt'

CUDA_VISIBLE_DEVICES='1' python ../examples/run_crf_ner.py \
--data_dir $DATA_DIR \
--model_type $MODEL_TYPE \
--model_name_or_path $MODEL_NAME_OR_PATH \
--output_dir $OUTPUT_DIR \
--labels $LABEL \
--overwrite_output_dir \
--do_train \
--do_eval \
--evaluate_during_training \
--adv_training fgm \
--num_train_epochs 3 \
--max_seq_length 128 \
--logging_steps 0.2 \
--per_gpu_train_batch_size 16 \
--per_gpu_eval_batch_size 16 \
--learning_rate 5e-5 \
--bert_lr 5e-5 \
--classifier_lr  5e-5 \
--crf_lr  1e-3 \
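
A hedged note, not from the original thread: if this repo follows the usual transformers example-script convention of mapping --model_type to a (config, model, tokenizer) class tuple, then MODEL_TYPE='roberta' selects RobertaTokenizer, which cannot read the vocab.txt shipped with RoBERTa-zh-Large; switching the script to a BERT-style model type should let the checkpoint load. A small diagnostic sketch (file names taken from the error message and the comment above) to confirm which tokenizer format the checkpoint directory actually contains:

# Hypothetical diagnostic, not part of this repo: check which vocabulary
# files the checkpoint directory ships, to pick the matching model type.
import os

model_dir = '../pretrained_models/RoBERTa-zh-Large/'  # MODEL_NAME_OR_PATH above
files = set(os.listdir(model_dir))

if {'vocab.json', 'merges.txt'} <= files:
    print("Byte-level BPE vocab found -> MODEL_TYPE='roberta' should work")
elif 'vocab.txt' in files:
    print("WordPiece vocab.txt found -> use a BERT-style model type/tokenizer")
else:
    print('No recognizable vocabulary files:', sorted(files))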