boostcampaitech2 / klue-level2-nlp-11

A solution by team AI-ESG for the KLUE Relation Extraction competition in the 2nd BoostCamp AI Tech.

KLUE - Relation Extraction

Content

Background - RE(Relation Extraction) Tasks

The relation extraction task predicts the attributes of, and the relations between, entities in a sentence.

This is an example:

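sentence: 오라클(구 썬 마이크로시스템즈)에서 제공하는 자바 가상 머신 말고도 각 운영 체제 개발사가 제공하는 자바 가상 머신 및 오픈소스로 개발된 구형 버전의 온전한 자바 VM도 있으며, GNU의 GCJ나 아파치 소프트웨어 재단(ASF: Apache Software Foundation)의 하모니(Harmony)와 같은 아직은 완전하지 않지만 지속적인 오픈 소스 자바 가상 머신도 존재한다.
(English: In addition to the Java virtual machine provided by Oracle (formerly Sun Microsystems), there are Java virtual machines provided by each operating system vendor and complete Java VMs of older versions developed as open source, and there are also open-source Java virtual machines that are not yet complete but are still ongoing, such as GNU's GCJ and Harmony from the Apache Software Foundation (ASF).)
subject-entity: 썬 마이크로시스템즈 (Sun Microsystems)
object-entity: 오라클 (Oracle)
relation: 단체:별칭 (org:alternate_names)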
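To make the example concrete, the sketch below stores one sample as a small Python record and locates both entities in the sentence by character offset. The class `RelationExample` and the helper `find_span` are hypothetical names chosen for illustration; they are not taken from the repository, and the actual column layout of train.csv may differ.

```python
from dataclasses import dataclass


@dataclass
class RelationExample:
    """One relation-extraction sample (hypothetical container, not from the repository)."""
    sentence: str
    subject_entity: str
    object_entity: str
    relation: str


def find_span(sentence: str, entity: str) -> tuple[int, int]:
    """Return (start, end) character offsets of the first occurrence; end is exclusive."""
    start = sentence.find(entity)
    if start == -1:
        raise ValueError(f"entity {entity!r} not found in sentence")
    return start, start + len(entity)


example = RelationExample(
    # Shortened version of the sentence from the example above.
    sentence="오라클(구 썬 마이크로시스템즈)에서 제공하는 자바 가상 머신",
    subject_entity="썬 마이크로시스템즈",
    object_entity="오라클",
    relation="org:alternate_names",  # label spelled as in the KLUE label set
)

subj_span = find_span(example.sentence, example.subject_entity)
obj_span = find_span(example.sentence, example.object_entity)
print(subj_span, obj_span)
```

Character offsets like these are one common way to tell a model where the entities sit, for instance before wrapping them in marker tokens. Whether this solution does so is not stated here.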

Project Outline

Team

Members of Team AI-ESG

Name    GitHub  Contact
문석암  Link    mon823@naver.com
박마루찬  Link  shaild098@naver.com
박아멘  Link    puzzlistpam@gmail.com
우원진  Link    dndnjswls613@naver.com
윤영훈  Link    wodlxosxos73@gmail.com
장동건  Link    jdg4661@gmail.com
홍현승  Link    honghyunseung100@gmail.com

Structure

├── code
│   ├── best_model
│   ├── dict_label_to_num.pkl
│   ├── dict_num_to_label.pkl
│   ├── ensemble.py
│   ├── inference.py
│   ├── load_data.py
│   ├── loss.py
│   ├── models.py
│   ├── prediction
│   │   └── sample_submission.csv
│   ├── requirements.txt
│   ├── results
│   ├── trainer.py
│   └── train.py
└── dataset
    ├── test
    │   └── test_data.csv
    └── train
        └── train.csv
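The two `.pkl` files in the tree suggest that the mapping between relation labels and integer class ids is stored as pickled dictionaries. The sketch below shows how such a pair of files could be written and read back. It assumes each file holds a plain `dict`; the three-label list is a small illustrative subset, and `save_pickle` / `load_pickle` are hypothetical helpers, not functions from the repository.

```python
import pickle
import tempfile
from pathlib import Path

# Illustrative subset of labels; the real files presumably cover the full label set.
labels = ["no_relation", "org:alternate_names", "per:employee_of"]

label_to_num = {label: idx for idx, label in enumerate(labels)}
num_to_label = {idx: label for label, idx in label_to_num.items()}


def save_pickle(obj, path: Path) -> None:
    """Serialize obj to path with pickle."""
    with path.open("wb") as f:
        pickle.dump(obj, f)


def load_pickle(path: Path):
    """Load a pickled object from path."""
    with path.open("rb") as f:
        return pickle.load(f)


# Use a temporary directory so the sketch never touches the repository's own files.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    save_pickle(label_to_num, tmp_dir / "dict_label_to_num.pkl")
    save_pickle(num_to_label, tmp_dir / "dict_num_to_label.pkl")
    loaded_l2n = load_pickle(tmp_dir / "dict_label_to_num.pkl")
    loaded_n2l = load_pickle(tmp_dir / "dict_num_to_label.pkl")

# Round trip: label -> class id -> label.
round_trip = loaded_n2l[loaded_l2n["org:alternate_names"]]
print(round_trip)
```

Keeping both directions on disk means training can encode labels and inference can decode predictions without rebuilding the mapping.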

Getting Started

Hardware

Dependencies

Install Requirements

pip install -r requirements.txt

Train

You can train our model with train.py. It accepts various command-line arguments.

This is an example:

python train.py --run_name NAME --train_batch_size 32 \
        --num_train_epochs 10 --learning_rate 5e-5 --warmup_steps 0 --weight_decay 0.01 \
        --output_dir ./results/NAME --random_seed 452

Alternatively, run the provided shell script:

./run_train.sh
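The flags in the train.py example could be declared with `argparse` roughly as follows. This is only a sketch: the flag names come from the command shown in this section, while the types and default values are assumptions (the defaults simply reuse the example's values), and `build_parser` is a hypothetical function name. The real train.py may define more arguments or different defaults.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in the train.py example."""
    parser = argparse.ArgumentParser(description="Sketch of train.py arguments")
    parser.add_argument("--run_name", type=str, default="NAME")
    parser.add_argument("--train_batch_size", type=int, default=32)
    parser.add_argument("--num_train_epochs", type=int, default=10)
    parser.add_argument("--learning_rate", type=float, default=5e-5)
    parser.add_argument("--warmup_steps", type=int, default=0)
    parser.add_argument("--weight_decay", type=float, default=0.01)
    parser.add_argument("--output_dir", type=str, default="./results/NAME")
    parser.add_argument("--random_seed", type=int, default=452)
    return parser


# Parse an explicit list so the sketch does not depend on the real command line.
args = build_parser().parse_args(
    ["--run_name", "demo", "--learning_rate", "3e-5", "--output_dir", "./results/demo"]
)
print(args.run_name, args.learning_rate, args.train_batch_size)
```

Flags that are not passed fall back to the assumed defaults, so a run only needs to name the values it changes.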

Inference

python inference.py --model_name klue/roberta-large --model_dir ./best_model

Ensemble

python ensemble.py --csv_name output1.csv,output2.csv --csv_dir ./prediction \
                --save_path ./prediction/ensemble.csv
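The README does not say how ensemble.py combines the listed CSV files. One common option is soft voting, sketched below: per-class probabilities are averaged across models and the highest average wins. The column names `id` and `probs`, the JSON encoding of the probability list, and the function `soft_vote` are all assumptions made for this illustration and may not match the repository's output format.

```python
import csv
import json
import tempfile
from pathlib import Path


def soft_vote(csv_paths):
    """Average per-row probability vectors across files.

    Returns a list of (id, best_class_index, averaged_probs) tuples.
    """
    per_file = []
    for path in csv_paths:
        with open(path, newline="", encoding="utf-8") as f:
            rows = [(row["id"], json.loads(row["probs"])) for row in csv.DictReader(f)]
        per_file.append(rows)

    if len({len(rows) for rows in per_file}) != 1:
        raise ValueError("all files must contain the same number of rows")

    results = []
    for rows in zip(*per_file):  # rows at the same position in every file
        ids = {row_id for row_id, _ in rows}
        if len(ids) != 1:
            raise ValueError(f"row ids do not match across files: {ids}")
        avg = [sum(col) / len(col) for col in zip(*(probs for _, probs in rows))]
        best = max(range(len(avg)), key=avg.__getitem__)
        results.append((rows[0][0], best, avg))
    return results


def write_demo_csv(path, rows):
    """Write (id, probs) rows in the assumed format."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "probs"])
        writer.writeheader()
        for row_id, probs in rows:
            writer.writerow({"id": row_id, "probs": json.dumps(probs)})


# Two toy "model outputs" with three classes each.
with tempfile.TemporaryDirectory() as tmp:
    p1, p2 = Path(tmp) / "output1.csv", Path(tmp) / "output2.csv"
    write_demo_csv(p1, [("0", [0.6, 0.3, 0.1]), ("1", [0.2, 0.5, 0.3])])
    write_demo_csv(p2, [("0", [0.2, 0.7, 0.1]), ("1", [0.1, 0.3, 0.6])])
    predictions = [(row_id, best) for row_id, best, _ in soft_vote([p1, p2])]

print(predictions)
```

In the toy data the first model alone would pick class 0 for row "0", but averaging with the second model's stronger vote moves the decision to class 1.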

TATP

Generate Text

This is an example:

python ./model/mk_text.py

Train TATP

This is an example:

python ./model/maskedml_for_tatp.py \
        --model_name_or_path [klue/roberta-large] --run_name [NAME] \
        --do_train --output_dir [model path]
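The script name maskedml_for_tatp.py suggests a masked language modeling objective. As a rough illustration of that idea only, the sketch below randomly replaces tokens with a mask token and records the originals as prediction targets. It is a simplification: it splits on whitespace instead of using the model's tokenizer, always substitutes the mask token, and the 15% default rate, the `[MASK]` string, and the function `mask_tokens` are assumptions, not details taken from the repository.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed mask string; a real tokenizer defines its own


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Randomly mask tokens.

    Returns (masked_tokens, labels), where labels[i] is the original token at
    masked positions and None elsewhere.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            labels.append(tok)  # the model would be trained to recover this token
        else:
            masked.append(tok)
            labels.append(None)  # position is ignored by the loss
    return masked, labels


# Whitespace tokens stand in for real subword tokens.
tokens = "자바 가상 머신 도 존재 한다".split()
masked, labels = mask_tokens(tokens, mask_prob=0.5, seed=0)
print(masked)
print(labels)
```

A higher `mask_prob` is used in the demo only so that a short sentence is likely to show some masked positions.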