liuwei1206 / LEBERT

Code for the ACL 2021 paper "Lexicon Enhanced Chinese Sequence Labeling Using BERT Adapter"

Note4 reproduction #26

Closed 447428054 closed 3 years ago

447428054 commented 3 years ago

@liuwei1206 Hello, following the note4 shell script you provided, my command is set up as follows:

CUDA_VISIBLE_DEVICES=0 python3 -m torch.distributed.launch --master_port 13017 --nproc_per_node=1 \
  Trainer.py --do_train --do_eval --do_predict --evaluate_during_training \
  --data_dir="data/dataset/NER/note4" \
  --output_dir="data/result/NER/note4/wcbertcrf" \
  --config_name="data/berts/bert/config.json" \
  --model_name_or_path="/home/root1/lizheng/pretrainModels/torch/chinese/bert-base-chinese/pytorch_model.bin" \
  --vocab_file="/home/root1/lizheng/pretrainModels/torch/chinese/bert-base-chinese/vocab.txt" \
  --word_vocab_file="data/vocab/tencent_vocab.txt" \
  --max_scan_num=1500000 \
  --max_word_num=5 \
  --label_file="data/dataset/NER/note4/labels.txt" \
  --word_embedding="data/embedding/word_embedding.txt" \
  --saved_embedding_dir="data/dataset/NER/note4" \
  --model_type="WCBertCRF_Token" \
  --seed=106524 \
  --per_gpu_train_batch_size=4 \
  --per_gpu_eval_batch_size=32 \
  --learning_rate=1e-5 \
  --max_steps=-1 \
  --max_seq_length=256 \
  --num_train_epochs=20 \
  --warmup_steps=190 \
  --save_steps=600 \
  --logging_steps=300

However, the test F1 I get is only around 80. Could the gap be because you trained with multiple GPUs while I trained on a single GPU? Could you please check whether my script is correct?
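For context on the single- vs multi-GPU question: under torch.distributed.launch, the effective training batch size is the per-process batch size multiplied by the number of processes (and by gradient accumulation, if any). A minimal sketch of that relationship, using the flag values from the command above and assuming gradient accumulation stays at 1 (an assumption, since the command does not set it):

per_gpu_train_batch_size = 4      # --per_gpu_train_batch_size from the command above
nproc_per_node = 1                # --nproc_per_node passed to torch.distributed.launch
gradient_accumulation_steps = 1   # assumed default; not specified in the command above

# Effective batch size seen by the optimizer per update step.
effective_batch_size = per_gpu_train_batch_size * nproc_per_node * gradient_accumulation_steps
print(effective_batch_size)       # 4 here; 4 * N if launched with N processes/GPUs

If both runs use a single process with the same per-GPU batch size, the effective batch size is identical, so the number of GPUs alone would not change it.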

liuwei1206 commented 3 years ago

Hi,

Your settings look OK. I am not sure whether multi-GPU training would make such a difference.

447428054 commented 3 years ago

@liuwei1206 Hello, then may I ask how many GPUs you trained with?

liuwei1206 commented 2 years ago

@liuwei1206 Hello, then may I ask how many GPUs you trained with?

One.