KaijuML / dtt-multi-branch

Code for Controlling Hallucinations at Word Level in Data-to-Text Generation (C. Rebuffel, M. Roberti, L. Soulier, G. Scoutheeten, R. Cancelliere, P. Gallinari)
https://arxiv.org/abs/2102.02810

About loading models in pos_tagging.py #6

Open tomohirok27 opened 3 years ago

tomohirok27 commented 3 years ago

I have a question about the implementation of Part-of-Speech (POS) tagging. The following command trains the tagger and then tags the train split:

python3 pos_tagging.py --do_train --do_tagging train --gpus 0 1 --dataset_folder wikibio

--do_train loads the pre-trained model bert-base-uncased, fine-tunes it, and saves the resulting model in ./pos/trained. But why does --do_tagging pass --model_name_or_path bert-base-uncased in def run_script instead of loading the stored model from ./pos/trained?

cmd = " ".join([
    f'CUDA_VISIBLE_DEVICES={gpus}',
    'python run_ner.py',
    f'--data_dir {pos_folder}/',
    '--model_type bert',
    f'--labels {os.path.join(pos_folder, "labels.txt")}',
    '--model_name_or_path bert-base-uncased',
    f'--output_dir {os.path.join(pos_folder, "trained")}',
    f'--max_seq_length {max_seq_length}',
    '--do_predict',
    '--per_gpu_eval_batch_size 64'
])
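
For illustration, here is one way the tagging command could be made to reference the fine-tuned weights explicitly. This is a minimal sketch, not the repository's code: build_tagging_cmd and use_trained are hypothetical names, and it assumes run_ner.py accepts a local directory for --model_name_or_path (as Hugging Face from_pretrained does) and that the trained folder contains a saved model. Some versions of the Hugging Face example script may already reload the model from --output_dir when predicting, in which case the flag value would not matter; that has not been verified for this repository's copy.

```python
import os


def build_tagging_cmd(pos_folder, gpus, max_seq_length, use_trained=True):
    """Build the run_ner.py prediction command as a single shell string.

    Hypothetical helper: mirrors the command quoted above, but lets the
    caller choose which model path is passed to --model_name_or_path.
    """
    trained_dir = os.path.join(pos_folder, "trained")
    # Assumption: trained_dir holds the model saved by the --do_train step.
    model_path = trained_dir if use_trained else "bert-base-uncased"
    return " ".join([
        f"CUDA_VISIBLE_DEVICES={gpus}",
        "python run_ner.py",
        f"--data_dir {pos_folder}/",
        "--model_type bert",
        f"--labels {os.path.join(pos_folder, 'labels.txt')}",
        f"--model_name_or_path {model_path}",
        f"--output_dir {trained_dir}",
        f"--max_seq_length {max_seq_length}",
        "--do_predict",
        "--per_gpu_eval_batch_size 64",
    ])
```

Calling build_tagging_cmd("pos", "0,1", 256) returns a command string whose --model_name_or_path points at the trained folder; passing use_trained=False reproduces the original behaviour.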
zhangzhang827 commented 1 year ago

I have also run into some problems with this code. Have you managed to get it running?