Closed Deerzh closed 2 years ago
{YOUR_OTHER_ARGUMENTS} can be left empty. Or you can refer to all the arguments here: https://github.com/allanj/pytorch_neural_crf/blob/master/transformers_trainer.py#L29-L61

I updated the code, but the errors still exist.

Error 1: when I run this command: python trainer.py --embedder_type=bert-large-cased, I get an error like this:

usage: trainer.py [-h] [--device {cpu,cuda:0,cuda:1,cuda:2}] [--seed SEED]
                  [--dataset DATASET] [--embedding_file EMBEDDING_FILE]
                  [--embedding_dim EMBEDDING_DIM] [--optimizer OPTIMIZER]
                  [--learning_rate LEARNING_RATE] [--l2 L2]
                  [--lr_decay LR_DECAY] [--batch_size BATCH_SIZE]
                  [--num_epochs NUM_EPOCHS] [--train_num TRAIN_NUM]
                  [--dev_num DEV_NUM] [--test_num TEST_NUM]
                  [--max_no_incre MAX_NO_INCRE] [--model_folder MODEL_FOLDER]
                  [--hidden_dim HIDDEN_DIM] [--dropout DROPOUT]
                  [--use_char_rnn {0,1}] [--static_context_emb {none,elmo}]
                  [--add_iobes_constraint {0,1}]
trainer.py: error: unrecognized arguments: --embedder_type=bert-large-cased
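For context: "unrecognized arguments" is argparse's standard failure when a flag was never registered on the parser, which is why trainer.py rejects --embedder_type. A minimal sketch with a hypothetical parser (not the repo's actual code), using parse_known_args to show how argparse separates known flags from unknown ones:

```python
import argparse

# A toy parser that, like trainer.py, does not register --embedder_type.
parser = argparse.ArgumentParser()
parser.add_argument("--batch_size", type=int, default=30)

# parse_known_args() collects unrecognized flags instead of exiting with
# "error: unrecognized arguments: ...", which parse_args() would do here.
args, unknown = parser.parse_known_args(
    ["--batch_size=16", "--embedder_type=bert-large-cased"]
)
print(unknown)  # ['--embedder_type=bert-large-cased']
```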
Error 2: if I leave {YOUR_OTHER_ARGUMENTS} empty, an error still occurs:

Traceback (most recent call last):
File "transformers_trainer_ddp.py", line 22, in <module>
Following the README, you should run transformers_trainer.py rather than trainer.py.
For the second one, you need to
pip install datasets
I just updated the README to include that. Thanks
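If you want to confirm that datasets is actually importable in the environment you launch from (a generic check, not part of the repo), the standard library can tell you without importing the package:

```python
import importlib.util

def is_installed(pkg: str) -> bool:
    """Return True if `pkg` can be imported in the current environment."""
    return importlib.util.find_spec(pkg) is not None

print(is_installed("datasets"))  # True once `pip install datasets` has run in this environment
```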
Thank you for your reply, but I still have some questions.

Q1: Do you mean I should first run transformers_trainer.py and then run trainer.py, or just run transformers_trainer.py? I don't understand, because if I run trainer.py with the '--embedder_type=bert-large-cased' argument it raises an error, but if I run trainer.py without arguments it runs successfully.
Q2: I have run pip install datasets, but when I run accelerate launch transformers_trainer_ddp.py --batch_size=30, an error still occurred, like this:
The following values were not passed to accelerate launch and had defaults used instead:
--num_processes was set to a value of 1
--num_machines was set to a value of 1
--mixed_precision was set to a value of 'no'
--num_cpu_threads_per_process was set to 52 to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
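That message is only a warning about defaults, not an error. One way to silence it is to pass the values explicitly; the flag names below are taken directly from the warning itself, and the values shown are the defaults it reported (adjust them to your machine):

```shell
accelerate launch \
  --num_processes 1 \
  --num_machines 1 \
  --mixed_precision no \
  --num_cpu_threads_per_process 52 \
  transformers_trainer_ddp.py --batch_size=30
```

Alternatively, running accelerate config once records these choices so they no longer need to be passed on every launch.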
09/02/2022 16:16:35 - INFO - __main__ - seed: 42
09/02/2022 16:16:35 - INFO - __main__ - dataset: conll2003
09/02/2022 16:16:35 - INFO - __main__ - optimizer: adamw
09/02/2022 16:16:35 - INFO - __main__ - learning_rate: 2e-05
09/02/2022 16:16:35 - INFO - __main__ - momentum: 0.0
09/02/2022 16:16:35 - INFO - __main__ - l2: 1e-08
09/02/2022 16:16:35 - INFO - __main__ - lr_decay: 0
09/02/2022 16:16:35 - INFO - __main__ - batch_size: 30
09/02/2022 16:16:35 - INFO - __main__ - num_epochs: 1
09/02/2022 16:16:35 - INFO - __main__ - train_num: -1
09/02/2022 16:16:35 - INFO - __main__ - dev_num: -1
09/02/2022 16:16:35 - INFO - __main__ - test_num: -1
09/02/2022 16:16:35 - INFO - __main__ - max_no_incre: 80
09/02/2022 16:16:35 - INFO - __main__ - max_grad_norm: 1.0
09/02/2022 16:16:35 - INFO - __main__ - fp16: 1
09/02/2022 16:16:35 - INFO - __main__ - model_folder: english_model
09/02/2022 16:16:35 - INFO - __main__ - hidden_dim: 0
09/02/2022 16:16:35 - INFO - __main__ - dropout: 0.5
09/02/2022 16:16:35 - INFO - __main__ - embedder_type: roberta-base
09/02/2022 16:16:35 - INFO - __main__ - add_iobes_constraint: 0
09/02/2022 16:16:35 - INFO - __main__ - print_detail_f1: 0
09/02/2022 16:16:35 - INFO - __main__ - earlystop_atr: micro
09/02/2022 16:16:35 - INFO - __main__ - mode: train
09/02/2022 16:16:35 - INFO - __main__ - test_file: data/conll2003/test.txt
Downloading builder script: 6.33kB [00:00, 2.49MB/s]
09/02/2022 16:16:45 - INFO - __main__ - [Data Info] Tokenizing the instances using 'roberta-base' tokenizer
09/02/2022 16:16:55 - INFO - __main__ - [Data Info] Reading dataset from:
data/conll2003/train.txt
data/conll2003/dev.txt
data/conll2003/test.txt
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Reading file: data/conll2003/train.txt, labels will be converted to IOBES encoding
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Modify src/data/transformers_dataset.read_txt function if you have other requirements
100%|███████████████████████████████████████████████████████| 300/300 [00:00<00:00, 855980.41it/s]
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - number of sentences: 14
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Using the training set to build label index
09/02/2022 16:16:55 - INFO - src.data.data_utils - #labels: 16
09/02/2022 16:16:55 - INFO - src.data.data_utils - label 2idx: {'
You have a label 'B-LOC' that does not exist in your training set.
Feel free to reopen the issue.
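The failure mode behind that message is generic: the trainer builds its label index from the training split only, so a label that first appears in the dev/test split has no id. A minimal sketch with made-up labels (not the repo's actual code):

```python
# Build the label index from training-set labels only, as the trainer does.
train_labels = ["O", "B-PER", "I-PER"]  # toy training set with no LOC labels
label2idx = {lab: i for i, lab in enumerate(train_labels)}

# A dev/test label unseen during training has no index, so lookups fail.
try:
    label2idx["B-LOC"]
except KeyError:
    print("label 'B-LOC' does not exist in the training set")
```

With only 14 sentences read from train.txt (per the log above), it is easy for a tag like B-LOC to be absent from the training split while still appearing in dev or test.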
Q1: Can you tell me how to set the appropriate {YOUR_OTHER_ARGUMENTS} in this command: accelerate launch transformers_trainer_ddp.py --batch_size=30 {YOUR_OTHER_ARGUMENTS}?
Q2: when I run this command: python trainer.py --embedder_type=bert-large-cased, an error occurred:

Traceback (most recent call last):
File "trainer.py", line 12, in <module>
from src.config import context_models, get_metric
ImportError: cannot import name 'context_models' from 'src.config' (/home/zhang/compatibility_analysis/pytorch_neural_crf/src/config/__init__.py)

Can you help me fix this issue?
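That ImportError means the src/config package on disk no longer exports a name called context_models, which typically points to a version mismatch between the checked-out code and the script being run. The general shape of the failure, sketched with a stdlib module (json stands in for src.config, since the latter is repo-specific):

```python
# Importing a name a module does not export raises ImportError with the
# same "cannot import name ... from ..." message seen in the traceback.
try:
    from json import no_such_name  # stands in for context_models
except ImportError as e:
    print(e)  # e.g. cannot import name 'no_such_name' from 'json'
    # Inspecting dir(module) shows what the module actually exports,
    # which helps locate a renamed or relocated successor.
```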