Closed: nghuyong closed this issue 3 years ago
Could you share the command you are using to launch the script? I'm trying to reproduce, but it works fine for me. Also, your error looks like a CUDA setup error, so is the script running properly without the change?
@sgugger
export BASE_PATH=/data/huyong/code/socialbert
export CUDA_VISIBLE_DEVICES=1
python run_mlm.py \
--config_name $BASE_PATH/pretrained_models/bert \
--model_type bert \
--max_seq_length 128 \
--preprocessing_num_workers 20 \
--model_name_or_path $BASE_PATH/pretrained_models/bert \
--train_file $BASE_PATH/data/mini.txt \
--line_by_line \
--do_train \
--save_total_limit 3 \
--per_device_train_batch_size 8 \
--max_train_samples 100000 \
--output_dir $BASE_PATH/checkpoint/bert
Thanks, but no one will be able to help you if you're using a personal model you don't share, as we can't debug something we can't reproduce. Also, you did not tell us if the script was running fine before the change.
@sgugger
Thanks.
Actually, I'm not using a personal model: the model I continue pre-training from is hfl/chinese-roberta-wwm-ext. I manually downloaded three files (vocab.txt, config.json, and pytorch_model.bin), ran the script with that local model directory, and got the error. But when I use the model name directly, as in the command below, it works!
export BASE_PATH=/data/huyong/code/socialbert
export CUDA_VISIBLE_DEVICES=1
python run_mlm.py \
--config_name hfl/chinese-roberta-wwm-ext \
--model_name_or_path hfl/chinese-roberta-wwm-ext \
--model_type bert \
--max_seq_length 128 \
--preprocessing_num_workers 20 \
--train_file $BASE_PATH/data/mini.txt \
--line_by_line \
--do_train \
--save_total_limit 3 \
--per_device_train_batch_size 8 \
--max_train_samples 100000 \
--output_dir $BASE_PATH/checkpoint/bert
Thanks a lot!
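For context, a minimal sketch (not taken from the thread) of the two loading paths involved here: a hub id, for which from_pretrained downloads everything it needs, versus a local directory, where from_pretrained only sees the files that are actually present. The local path is the one used in the commands above; everything else is illustrative.

from transformers import AutoTokenizer, AutoModelForMaskedLM

# Loading by hub id: config, vocabulary and weights are fetched automatically.
tokenizer = AutoTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")
model = AutoModelForMaskedLM.from_pretrained("hfl/chinese-roberta-wwm-ext")

# Loading from a local directory: from_pretrained only uses the files that are
# actually there (here config.json, vocab.txt, pytorch_model.bin). Missing
# tokenizer files such as tokenizer_config.json or special_tokens_map.json can
# make the tokenizer differ from the hub version.
local_dir = "/data/huyong/code/socialbert/pretrained_models/bert"  # path from the commands above
tokenizer = AutoTokenizer.from_pretrained(local_dir)
model = AutoModelForMaskedLM.from_pretrained(local_dir)

If the local directory holds only those three files, the tokenizer may be built slightly differently than when loading by hub id, which is one plausible reason the two commands behave differently.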
Environment info
transformers version: 4.3.3
Who can help
@LysandreJik
Information
I fine-tune BERT on my own social media data, following the instructions in examples/language-modeling/README.md. I use the official run_mlm.py script; the only change is that I add some new tokens after the tokenizer is initialized, and then I get a CUDA runtime error. If I don't add the new tokens, everything works fine.
To reproduce
I only add one line in examples/language-modeling/run_mlm.py, starting from L291: https://github.com/huggingface/transformers/blob/838f83d84ccf57f968e0ace7f400e43b92643552/examples/language-modeling/run_mlm.py#L291
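The exact added line isn't quoted in the issue, so the snippet below is only a hedged illustration of the described change: registering new tokens right after the tokenizer is created in run_mlm.py (the token strings are placeholders), with the resize call that the script already contains shown for context.

# Hedged illustration only; the real added line and token strings are not
# shown in the issue. model_args and tokenizer_kwargs come from run_mlm.py.
tokenizer = AutoTokenizer.from_pretrained(model_args.model_name_or_path, **tokenizer_kwargs)
tokenizer.add_tokens(["[USER]", "[URL]"])  # the single added line (placeholder tokens)

# Further down, run_mlm.py resizes the embedding matrix to the new vocab size:
model.resize_token_embeddings(len(tokenizer))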
Running log:
Expected behavior
I found that run_mlm.py already has model.resize_token_embeddings(len(tokenizer)), so why do I still get the error? Thanks
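Not a confirmed diagnosis, but resize_token_embeddings only helps if it runs after the new tokens are added, so that every id the tokenizer can emit stays inside the embedding matrix; if an id falls outside it, the embedding lookup on GPU typically surfaces as a CUDA device-side assert. A standalone sketch of the required ordering (the added token is a placeholder):

from transformers import AutoTokenizer, AutoModelForMaskedLM

name = "hfl/chinese-roberta-wwm-ext"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

tokenizer.add_tokens(["[NEW_TOKEN]"])          # placeholder new token
model.resize_token_embeddings(len(tokenizer))  # must run after add_tokens

# Every id the tokenizer produces must be smaller than the embedding size,
# otherwise the lookup fails (on GPU, as a device-side assert).
ids = tokenizer("[NEW_TOKEN]", return_tensors="pt")["input_ids"]
assert int(ids.max()) < model.get_input_embeddings().num_embeddings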