boostcampaitech2 / mrc-level2-nlp-04

mrc-level2-nlp-04 created by GitHub Classroom

[TODO] Implement the Roberta + CNN + Attention model, and build datasets with random_concat and with wiki passages cut to a suitable length #42

Open raki-1203 opened 3 years ago

raki-1203 commented 3 years ago

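Results

I used almost exactly the same model I used in the 1st cohort, but the scores are not improving. I will need to test whether performance only improves once the dataset itself is filtered to some degree.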

| model_name | additional_layer | batch_size | eval/exact_match | LB/exact_match |
| --- | --- | --- | --- | --- |
| klue/roberta-small | Convolution | 128 | 51.25 | not submitted |
| klue/roberta-large | basic | 128 | 65.00 | 61.670 |
| klue/roberta-large | Convolution | 128 | 66.667 | 57.500 |
| klue/roberta-large | Convolution | 160 | 69.167 | 57.080 |
| klue/roberta-large | QAConvolution | 128 | 58.75 | 56.250 |
| klue/roberta-large | QAConvolution_ver2 | 128 | exact_match score did not rise, possibly because I used the code incorrectly | |
| klue/roberta-large | basic-random_concat | 128 | 65.00 | 59.580 |
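
For readers unfamiliar with the exact_match columns, here is a minimal sketch of how such a score is typically computed. It assumes a simplified whitespace-only normalization; the function names `normalize` and `exact_match` are hypothetical, and the metric actually used by this repository's `train.py` and by the leaderboard may normalize answers differently.

```python
from typing import List


def normalize(text: str) -> str:
    """Collapse runs of whitespace and trim (an assumed, simplified normalization)."""
    return " ".join(text.split())


def exact_match(predictions: List[str], references: List[List[str]]) -> float:
    """Return the percentage of predictions equal to any of their reference answers."""
    if not predictions:
        return 0.0
    hits = sum(
        any(normalize(pred) == normalize(ref) for ref in refs)
        for pred, refs in zip(predictions, references)
    )
    return 100.0 * hits / len(predictions)


# Hypothetical toy data: the first prediction matches, the second does not.
print(exact_match(["서울", "부산"], [["서울"], ["대구"]]))
```

Because every prediction counts as either fully right or fully wrong, a small evaluation set moves in coarse steps, which is one reason eval and leaderboard scores can diverge.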

> roberta-small with convolution batch 128


python train.py --do_train --project_name mrc_concat_data_train --model_name_or_path klue/roberta-small --run_name roberta-small_cnn_batch_128_concat_5 --with_inference False --dataset_name concat --per_device_train_batch_size 16 --gradient_accumulation_steps 8 --num_train_epochs 20 --additional_model convolution

python inference.py --do_predict --project_name mrc_concat_data_train --finetuned_mrc_model_path ../output/mrc_concat_data_train/roberta-small_cnn_batch_128_concat_5 --run_name roberta-small_cnn_batch_128_concat_5 --elastic_index_name preprocess-wiki-index --additional_model convolution
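
The "batch 128" and "batch 160" labels in this issue are consistent with the product of `--per_device_train_batch_size` and `--gradient_accumulation_steps` in the commands. A minimal sketch of that arithmetic, assuming training on a single device (the issue does not state the device count, and `effective_batch_size` is a hypothetical helper, not part of the repository):

```python
def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer step."""
    return per_device * accumulation_steps * num_devices


# Values taken from the commands in this issue; num_devices=1 is an assumption.
print(effective_batch_size(16, 8))   # 128
print(effective_batch_size(16, 10))  # 160
```

With more than one device the same flags would yield a larger effective batch, so the labels only hold under the single-device assumption.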


> roberta-large with convolution batch 128

python train.py --do_train --project_name mrc_concat_data_train --model_name_or_path klue/roberta-large --run_name roberta-large_cnn_batch_128_concat_5 --with_inference False --dataset_name concat --per_device_train_batch_size 16 --gradient_accumulation_steps 8 --num_train_epochs 20 --additional_model convolution

python inference.py --do_predict --project_name mrc_concat_data_train --finetuned_mrc_model_path ../output/mrc_concat_data_train/roberta-large_cnn_batch_128_concat_5 --run_name roberta-large_cnn_batch_128_concat_5 --elastic_index_name preprocess-wiki-index --additional_model convolution


> roberta-large with convolution batch 160

python train.py --do_train --project_name mrc_concat_data_train --model_name_or_path klue/roberta-large --run_name roberta-large_cnn_batch_160_concat_5 --with_inference False --dataset_name concat --per_device_train_batch_size 16 --gradient_accumulation_steps 10 --num_train_epochs 20 --additional_model convolution

python inference.py --do_predict --project_name mrc_concat_data_train --finetuned_mrc_model_path ../output/mrc_concat_data_train/roberta-large_cnn_batch_160_concat_5 --run_name roberta-large_cnn_batch_160_concat_5 --elastic_index_name preprocess-wiki-index --additional_model convolution


> roberta-large with QAconvolution batch 128

python train.py --do_train --project_name mrc_concat_data_train --model_name_or_path klue/roberta-large --run_name roberta-large_qacnn_batch_128_concat_5 --with_inference False --dataset_name concat --per_device_train_batch_size 16 --gradient_accumulation_steps 8 --num_train_epochs 20 --additional_model qa_conv

python inference.py --do_predict --project_name mrc_concat_data_train --finetuned_mrc_model_path ../output/mrc_concat_data_train/roberta-large_qacnn_batch_128_concat_5 --run_name roberta-large_qacnn_batch_128_concat_5 --elastic_index_name preprocess-wiki-index --additional_model qa_conv


> roberta-large with QAconvolution_ver2 batch 128

python train.py --do_train --project_name mrc_concat_data_train --model_name_or_path klue/roberta-large --run_name roberta-large_qacnn2_batch_128_concat_5 --with_inference False --dataset_name concat --per_device_train_batch_size 16 --gradient_accumulation_steps 8 --num_train_epochs 20 --additional_model qa_conv_ver2

python inference.py --do_predict --project_name mrc_concat_data_train --finetuned_mrc_model_path ../output/mrc_concat_data_train/roberta-large_qacnn2_batch_128_concat_5 --run_name roberta-large_qacnn2_batch_128_concat_5 --elastic_index_name preprocess-wiki-index --additional_model qa_conv_ver2


> roberta-large random_concat batch 128

python train.py --do_train --project_name mrc_concat_data_train --model_name_or_path klue/roberta-large --run_name roberta-large_batch_128_random_concat --with_inference False --dataset_name random_concat --per_device_train_batch_size 16 --gradient_accumulation_steps 8 --num_train_epochs 20

python inference.py --do_predict --project_name mrc_concat_data_train --finetuned_mrc_model_path ../output/mrc_concat_data_train/roberta-large_batch_128_random_concat --run_name roberta-large_batch_128_random_concat --elastic_index_name preprocess-wiki-index

sangmandu commented 3 years ago

Thank you for the hard work.

  1. It would make the comparison easier if you also added the klue/roberta-large | 5 | 128 | 65.00 | 61.670 row.